• Zone OCR in multipage PDF

  • OpenKM has many interesting features, but requires some configuration process to show its full potential.
 #52204  by pavel.petruska
 
In professional version 6.4.49 zonal OCR for PDF works correctly if the document has one page. I need to solve zonal OCR for multi-page PDF as well.

My idea was to define a separate prototype for each page and link each one to different metadata items; I assume that a single document can match multiple prototypes. However, during the detection test, the processing always reads only the first page.

What PDF settings or conversions can I apply so that zonal OCR processes all pages of a document?

Regards
PP
 #52208  by jllort
 
Splitting the pages, which seems to be your issue, is something we usually do with a small customization; it depends on the type of document. Sometimes we also integrate Chronoscan for this purpose, but without a more detailed description it is not easy to suggest what is best for you.

You can take a look here https://www.youtube.com/watch?v=jYkRItZsBSo

I suggest using our contact form https://www.openkm.com/en/contact.html and asking there about your problem.
 #52261  by pavel.petruska
 
Thanks for your reply. You're right, this is not a scanning issue. We need to perform OCR on multipage PDFs created by any text editor, PDF generator, or scanner (using separate prototypes per page).
Please confirm that this cannot be solved with standard configuration steps and that some "small customization" through support is needed.
 #52276  by jllort
 
This behaviour requires a small customization (we have done it for other customers) or the use of Chronoscan; which one depends on the type of documents. If you are a customer, you should ask this question through the official support website, where we will ask you to share the documents so that we can analyze them and suggest several options for processing them. Without taking a look at the documents I cannot suggest which is better.
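
A minimal sketch of the page-splitting idea discussed above, assuming each page is exported as its own single-page PDF and paired with its own prototype. The thread does not describe OpenKM's actual customization, so `PageJob`, `plan_page_jobs`, the file naming, and the prototype names are all hypothetical; the PDF splitting itself and the OCR call are left out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PageJob:
    page: int          # 1-based page number in the source PDF
    output_name: str   # hypothetical name for the single-page PDF
    prototype: str     # hypothetical zonal OCR prototype for that page


def plan_page_jobs(source_name: str, page_count: int,
                   prototypes: dict[int, str]) -> list[PageJob]:
    """Plan one single-page OCR job per page that has a prototype assigned."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    # Drop a trailing ".pdf" (any letter case) to build per-page file names.
    stem = source_name[:-4] if source_name.lower().endswith(".pdf") else source_name
    jobs = []
    for page in range(1, page_count + 1):
        prototype = prototypes.get(page)
        if prototype is None:
            continue  # no zones defined for this page, so nothing to OCR
        jobs.append(PageJob(page, f"{stem}_p{page}.pdf", prototype))
    return jobs


# Hypothetical three-page document where only pages 1 and 3 carry zones.
for job in plan_page_jobs("invoice.pdf", 3, {1: "invoice_header", 3: "invoice_totals"}):
    print(job.page, job.output_name, job.prototype)
```

Since the detection test reads only the first page, each planned single-page file would have its zones on page one, which is the case reported as working.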
