how to use the tesseract

Posted: Sun Jan 11, 2009 3:00 am
by stanley
I wonder how to use tesseract.
Where can I find step-by-step instructions for using it?

Re:how to use the tesseract

Posted: Mon Jan 12, 2009 9:48 am
by pavila
You have to install the tesseract and imagemagick packages. If you use a Debian-based distribution, it is as simple as:

$ aptitude install tesseract-ocr imagemagick

After that, edit OpenKM.cfg and set the "Config.SYSTEM_OCR" parameter to the path of the tesseract binary. Restart JBoss and OpenKM will run OCR on uploaded TIFF files.

See OpenKM sources (/src/es/git/openkm/extractor/TiffExtractor.java) for more info.
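
For readers who want to see what "run the tesseract binary on a TIFF" can look like from Java, here is a minimal sketch. It is not the code from OpenKM's extractor; the class name TesseractSketch and its methods are made up for illustration. It only assumes the standard command-line form "tesseract <image> <outputbase>", which writes the recognized text to <outputbase>.txt. Any ImageMagick preprocessing is left out.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of OpenKM.
class TesseractSketch {

    // Builds the argument list for: tesseract <image> <outputbase>
    static List<String> buildCommand(String ocrBinary, Path image, Path outputBase) {
        return Arrays.asList(ocrBinary, image.toString(), outputBase.toString());
    }

    // Runs the OCR binary on one image and returns the recognized text.
    static String ocr(String ocrBinary, Path image) throws IOException, InterruptedException {
        Path workDir = Files.createTempDirectory("ocr-sketch");
        Path outputBase = workDir.resolve("result");
        Path textFile = workDir.resolve("result.txt"); // tesseract appends ".txt"
        try {
            Process process = new ProcessBuilder(buildCommand(ocrBinary, image, outputBase))
                    .inheritIO() // show tesseract's own messages on the console
                    .start();
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IOException("OCR binary exited with code " + exitCode);
            }
            return new String(Files.readAllBytes(textFile), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(textFile);
            Files.deleteIfExists(workDir);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java TesseractSketch <path-to-tesseract> <image.tif>");
            return;
        }
        System.out.println(ocr(args[0], Paths.get(args[1])));
    }
}
```

On a Debian-based system this could be tried with something like "java TesseractSketch /usr/bin/tesseract scan.tif", where scan.tif is any TIFF you have at hand.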

Re:how to use the tesseract

Posted: Thu Jan 15, 2009 12:44 pm
by stanley
I am a Java developer, and I don't know what to do with the tesseract sources and imagemagick. Can you give me the steps?
Thank you for your patience.

Re:how to use the tesseract

Posted: Thu Jan 15, 2009 7:34 pm
by pavila
If you use Debian / Ubuntu, you can install these programs easily:

$ aptitude install tesseract-ocr imagemagick

Re:how to use the tesseract

Posted: Fri Jan 16, 2009 2:41 am
by stanley
But I am a Windows user. Can you show me how to do this on Windows XP?
Thanks.

Re:how to use the tesseract

Posted: Fri Jan 16, 2009 3:27 am
by stanley
I use Windows. I have the tesseract binary and I have downloaded imagemagick.
How do I make them work together?

Re:how to use the tesseract

Posted: Sun Jan 18, 2009 10:22 pm
by pavila
I have never tested the tesseract OCR integration on Windows, and I'm not sure whether it works.

Re:how to use the tesseract

Posted: Thu Sep 03, 2009 10:22 pm
by djdifulvio
Question: I am new to this project and so far it's been great. However, I followed your directions and I am still getting the "WARN [TiffTextExtractor] Undefined OCR application" message when I try to upload TIFFs...

I am using Ubuntu with a 2.6.28-15 kernel. I have installed everything and corrected the path for the OCR software, which is "system.ocr=/usr/bin/tesseract".

Do I have to do anything with TiffTextExtractor.java? I did not build this install myself, I just downloaded the pre-made version.

Thanks,
djd

Re:how to use the tesseract

Posted: Thu Sep 03, 2009 10:26 pm
by djdifulvio
Never mind, noob problem: I did not remove the "# " before the field. Works fine now. :-P
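
For anyone who hits the same warning, the effect of a leading "#" can be reproduced in a few lines of Java. This is only a sketch and it assumes that OpenKM.cfg is read like a standard java.util.Properties file, which the thread does not confirm. The class name OcrConfigSketch is made up for illustration; the key "system.ocr" is the one quoted above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class, not part of OpenKM.
class OcrConfigSketch {

    // Returns the value of system.ocr, or null when the key is missing or commented out.
    static String ocrPath(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("system.ocr");
    }

    public static void main(String[] args) {
        // A line starting with '#' is a comment, so the key is never defined.
        System.out.println(ocrPath("# system.ocr=/usr/bin/tesseract\n")); // prints "null"
        // Without the '#', the value is picked up.
        System.out.println(ocrPath("system.ocr=/usr/bin/tesseract\n")); // prints "/usr/bin/tesseract"
    }
}
```

If the setting really is parsed this way, a commented-out line looks to the application exactly like a missing one, which would match the "Undefined OCR application" warning.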