• JPEGs and OCR

  • OpenKM has many interesting features, but requires some configuration process to show its full potential.
 #17026  by domi
 
Hi @all,

I finally have a working OCR integration. I am using Tesseract at the moment because, for me, it has better support for the German language.
I installed OpenKM 5.1.10 on Ubuntu x64.

But with both extractors I can't get JPEGs OCR'd. All other supported formats are OCR'd, but I have tried everything and can't get it working with the JPG format.

Run directly in a shell, OCR works with both Tesseract and Cuneiform.

Is there any way to debug the OCR mechanism?

Thx!

Domi
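
One way to approach the debugging question is to expand the configured OCR command by hand and run exactly that string against a JPEG. The sketch below assumes OpenKM replaces ${fileIn} and ${fileOut} in system.ocr with plain file paths; the helper names and the /tmp paths are hypothetical, and the template is the Cuneiform one quoted later in this thread.

```shell
#!/bin/sh
# Sketch only: expands an OpenKM-style system.ocr template by hand.
# Assumption: OpenKM replaces ${fileIn} and ${fileOut} with plain paths.

# replace_token STRING TOKEN VALUE
# Prints STRING with every literal TOKEN replaced by VALUE (no regex involved).
replace_token() {
    rt_rest=$1
    rt_out=
    while :; do
        case $rt_rest in
            *"$2"*)
                # Keep the text before the first TOKEN, then append VALUE.
                rt_out=$rt_out${rt_rest%%"$2"*}$3
                # Continue with whatever follows that TOKEN.
                rt_rest=${rt_rest#*"$2"}
                ;;
            *)
                break
                ;;
        esac
    done
    printf '%s\n' "$rt_out$rt_rest"
}

# expand_ocr_template TEMPLATE INPUT OUTPUT (hypothetical helper name)
expand_ocr_template() {
    eot_cmd=$(replace_token "$1" '${fileIn}' "$2")
    replace_token "$eot_cmd" '${fileOut}' "$3"
}

# Print the command OpenKM would presumably run; paste it into a shell to test.
expand_ocr_template '/usr/bin/cuneiform ${fileIn} -o ${fileOut} -l ger' \
    /tmp/scan.jpg /tmp/scan.txt
```

If the printed command works on a PNG or TIFF but fails on a JPEG, the problem is likely in the OCR tool or its image libraries rather than in OpenKM itself.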
 #17055  by domi
 
Hi, and thanks for the response, but sorry, it is still not working :( Neither with jpg nor jpeg ...
 #17080  by pavila
 
Have you installed the latest nightly build?
 #17131  by shaardu
 
Can you please post the exact steps you took to get your OCR working? I have been trying for the past month and it doesn't work. Please let us know the exact steps!

thanks
 #17132  by domi
 
Hi, I finally got it working with JPEGs too.

There are just a few steps:
  • apt-get install imagemagick
  • apt-get install cuneiform
  • apt-get install libreoffice
In the Admin Panel, set:
  • system.imagemagick.convert = /usr/bin/convert
  • system.ocr = /usr/bin/cuneiform ${fileIn} -o ${fileOut} -l ger
  • system.openoffice.path = /usr/lib/libreoffice
That's all I did to finally get it working.

I don't know which OS you have installed, but this is how it works (for me) on Ubuntu 12.04.
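
Before changing the Admin Panel values, it may help to confirm that the configured paths actually exist on the server. This is only a rough sketch: the paths are the Ubuntu 12.04 values from this post, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: sanity-check the paths entered in the Admin Panel.
# The paths are the Ubuntu 12.04 values from this post; adjust for other systems.

# check_exec PATH: report whether PATH is an executable regular file.
check_exec() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        printf 'ok: %s\n' "$1"
    else
        printf 'missing or not executable: %s\n' "$1"
    fi
}

# check_dir PATH: report whether PATH is an existing directory.
check_dir() {
    if [ -d "$1" ]; then
        printf 'ok: %s\n' "$1"
    else
        printf 'missing directory: %s\n' "$1"
    fi
}

check_exec /usr/bin/convert      # system.imagemagick.convert
check_exec /usr/bin/cuneiform    # binary used in system.ocr
check_dir  /usr/lib/libreoffice  # system.openoffice.path
```

Any line that does not start with "ok:" points at a package that is not installed or a path that differs on your distribution.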

Good luck!
