OCR Not happening (Resolved)
Posted: Sat Sep 29, 2012 4:28 am
OpenKM v5.1.10 build 7564
No error is listed, but I do get one if I change the Tesseract command line. Tesseract works when I run it manually against TIFFs.
Running on Windows Server 2008 R2, 8 GB RAM, 2 processors.
Increased JVM memory: -Xms512m -Xmx1024m
I'm mostly trying to get image-based PDF scans to OCR the way they do on the command line after being converted to an image. All my previews are working, and so are my PDF conversions of documents. Convert seems fine, and image uploads work. I just can't get a PDF upload indexed for search when it's a scan rather than a PDF with embedded text. Thanks for any help.
The related options are set as follows:
system.dwg2dxf string
system.ghostscript.ps2pdf string
system.imagemagick.convert string C:\ImageMagick-6.7.9-Q16\convert.exe
system.keyword.lowercase boolean false
system.login.lowercase boolean false
system.maintenance boolean false
system.ocr string C:\Tesseract-OCR\tesseract.exe ${fileIn} ${fileOut}
system.openoffice.dictionary string C:\OpenOfficeDictionary\dict-en.oxt
system.openoffice.path string C:\Program Files (x86)\OpenOffice.org 3
system.openoffice.port integer 2002
system.openoffice.server string
system.openoffice.tasks integer 200
system.pdf.force.ocr boolean true
system.previewer string zviewer
system.readonly boolean false
system.swftools.pdf2swf string C:\SWFTools\pdf2swf.exe -T 9 -f ${fileIn} -o ${fileOut}
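
For anyone wanting to check a system.ocr value outside OpenKM, here is a rough Python sketch that expands the ${fileIn} and ${fileOut} placeholders into an argument list you can compare against a command that works by hand. The helper name build_command and the C:\temp paths are made up for illustration, and the substitution only approximates whatever OpenKM does internally. One thing worth keeping in mind is that tesseract treats its second argument as an output base name and appends .txt to it.

```python
from string import Template
from typing import List

# Value copied from the system.ocr option in the listing above.
OCR_TEMPLATE = r"C:\Tesseract-OCR\tesseract.exe ${fileIn} ${fileOut}"


def build_command(template: str, file_in: str, file_out: str) -> List[str]:
    """Expand ${fileIn} and ${fileOut} into an argument list.

    Hypothetical helper: this mimics the placeholder substitution for
    testing purposes only and is not OpenKM's own code.
    """
    # Split on whitespace first, then substitute per token, so that a
    # file path containing spaces still ends up as a single argument.
    # (An executable path with spaces would need different handling.)
    return [
        Template(token).substitute(fileIn=file_in, fileOut=file_out)
        for token in template.split()
    ]


if __name__ == "__main__":
    # Example paths are assumptions; point them at a real TIFF to test.
    cmd = build_command(OCR_TEMPLATE, r"C:\temp\scan.tif", r"C:\temp\scan")
    print(cmd)
    # tesseract would write its text to C:\temp\scan.txt, because it
    # appends ".txt" to the output base name.
```

Passing the resulting list to subprocess.run should reproduce the manual test, which helps tell a bad command line apart from a problem elsewhere in the indexing chain.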