• OCR Timeout

  • Help us to improve OpenKM! Be part of the Open Source Community.
 #15453  by Alexires
 
Hi all,

Would it be possible to build in a timeout for OCR processes? I find that with some PDFs the OCR just hangs on a page, while other PDFs are fine. A built-in timeout would stop the server from sitting under load while it waits for an OCR run that never finishes.

Ubuntu 10.10, OpenKM 5.1.9
 #15472  by jllort
 
I think we have added it in trunk; pavila can confirm it.
 #15483  by pavila
 
This is implemented in the 5.1 branch; try the 5.1.10 nightly build to test it. The timeout is hardcoded to 5 minutes.
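
For readers who want to see what a hard timeout around an external OCR program looks like, here is a minimal sketch in Python. It is not OpenKM's implementation (OpenKM is written in Java), and `run_with_timeout` is a hypothetical helper name; only the 300-second default mirrors the 5 minutes mentioned above.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds=300):
    """Run an external command and give up after timeout_seconds.

    Returns the exit code, or None if the process had to be killed.
    The 300 s default mirrors the 5-minute limit mentioned in the thread.
    """
    try:
        completed = subprocess.run(
            cmd,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return completed.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before
        # re-raising, so no hung process is left behind here.
        return None


# Stand-in for the OCR binary: a child Python process that sleeps too long.
slow_cmd = [sys.executable, "-c", "import time; time.sleep(10)"]
result = run_with_timeout(slow_cmd, timeout_seconds=1)
print("killed after timeout" if result is None else f"exit code {result}")
```

In real use the command list would be the OCR invocation itself; the sleeping child process is only a placeholder so the sketch runs anywhere.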
 #15494  by Alexires
 
I think I fixed the problem by upgrading Cuneiform from 0.7.0 to 1.1.0 in Ubuntu via Aptitude. Good to know anyway. Thank you.
 #15519  by pavila
 
Depending on the Cuneiform version and the operating system, the program may fail. Cuneiform 1.0.0 and more recent versions work fine, at least on Ubuntu and Debian.
 #17205  by Alexires
 
Did this end up being included in 5.1.10? Can I change that timeout time?
 #17216  by jllort
 
As pavila said, "The timeout is hardcoded to 5 minutes." It cannot currently be changed from the administration configuration. I will add it to our feature ticket system: http://issues.openkm.com/view.php?id=2215
 #17272  by Alexires
 
Having this in the admin configuration would be massively useful. As an example, on my system a cuneiform process rarely runs for more than 10 seconds, so a timeout of 5 minutes is far too long for me. If I'm uploading many files, I need to sit there, watch htop, and kill the cuneiform processes that hang.
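
Until the timeout is configurable, the manual "watch htop and kill" routine could be scripted. The following is only a rough sketch under stated assumptions: it assumes a Linux `ps` that supports the `etimes` (elapsed seconds) output column, which older procps releases may lack, and the function names and the 30-second threshold are hypothetical choices, not part of OpenKM or Cuneiform.

```python
import os
import signal
import subprocess


def find_stale_pids(ps_output, name, max_seconds):
    """Return PIDs of processes called `name` that ran longer than max_seconds.

    Expects lines of the form "<pid> <elapsed seconds> <command name>".
    """
    stale = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        pid, elapsed, comm = parts
        if comm.strip() == name and int(elapsed) > max_seconds:
            stale.append(int(pid))
    return stale


def kill_stale(name="cuneiform", max_seconds=30):
    """Kill every process called `name` that has run longer than max_seconds.

    Assumption: the installed `ps` supports the `etimes` column.
    """
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etimes=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    pids = find_stale_pids(out, name, max_seconds)
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the process finished on its own in the meantime
    return pids
```

Calling `kill_stale()` from a cron job or a simple loop during bulk uploads would replace the manual step; the threshold should sit comfortably above the normal run time seen on the machine in question.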
 #17283  by jllort
 
A temporary solution for you could be to disable OCR during the upload and enable it again afterwards. Image files will not be indexed in the meantime, but the repository can be reindexed afterwards.
