• Tesseract output file not found / not picked up by OpenKM

 #22858  by anyonebutnoone
 
Hello,

I have the following in my logfile
2013-05-04 13:40:00,492 [Thread-601] WARN  com.openkm.extractor.Tesseract3TextExtractor - IO exception executing command: /usr/local/bin/tesseract /home/kmadmin/openkm/tomcat/temp/okm8891882979629288432.gif /home/kmadmin/openkm/tomcat/temp/okm7728462093100462371 -l deu
java.io.FileNotFoundException: /home/kmadmin/openkm/tomcat/temp/okm7728462093100462371.txt (No such file or directory)
I am running Tesseract 3, which works fine on the command line, also if I set the output file to OPENKMPATH/tomcat/temp/xxxxx.
OpenKM version: 6.2.3 (build: 7945) Community

Can someone give me a hint on how to fix this?

Edit: Or a hint on where else to look for the problem!
 #22900  by anyonebutnoone
 
Hmm, I am getting the error again; Tesseract works fine on the command line.

My settings are:
system.ocr	String 	/usr/local/bin/tesseract ${fileIn} ${fileOut} -l deu
registered.text.extractors	List 	
org.apache.jackrabbit.extractor.PlainTextExtractor
org.apache.jackrabbit.extractor.MsWordTextExtractor
org.apache.jackrabbit.extractor.MsExcelTextExtractor
org.apache.jackrabbit.extractor.MsPowerPointTextExtractor
org.apache.jackrabbit.extractor.OpenOfficeTextExtractor
org.apache.jackrabbit.extractor.RTFTextExtractor
org.apache.jackrabbit.extractor.HTMLTextExtractor
org.apache.jackrabbit.extractor.XMLTextExtractor
org.apache.jackrabbit.extractor.MsOutlookTextExtractor
com.openkm.extractor.PdfTextExtractor
com.openkm.extractor.AudioTextExtractor
com.openkm.extractor.ExifTextExtractor
com.openkm.extractor.MsOffice2007TextExtractor
com.openkm.extractor.Tesseract3TextExtractor
Am I missing something?
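
One way to narrow this down might be to run the same command from Java rather than from a shell, since the log shows the extractor looking for the output base name plus ".txt" (Tesseract 3 appends that suffix itself). The sketch below is not OpenKM code: the class name OcrCommandCheck and the helper expand are hypothetical, and the placeholder handling is only an assumption about how the system.ocr template is filled in. It runs the command, prints the exit code, and reports whether the ".txt" file was actually written.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class OcrCommandCheck {
    // Hypothetical helper: fills ${fileIn} and ${fileOut} in a system.ocr-style
    // template and splits it on whitespace (paths with spaces are not handled).
    static List<String> expand(String template, String fileIn, String fileOut) {
        List<String> cmd = new ArrayList<>();
        for (String token : template.trim().split("\\s+")) {
            cmd.add(token.replace("${fileIn}", fileIn).replace("${fileOut}", fileOut));
        }
        return cmd;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length < 1) {
            System.err.println("usage: java OcrCommandCheck <image file>");
            return;
        }
        // Same template as the system.ocr setting quoted above.
        String template = "/usr/local/bin/tesseract ${fileIn} ${fileOut} -l deu";

        // Output base name WITHOUT extension; Tesseract 3 adds ".txt" on its own.
        File outBase = new File(System.getProperty("java.io.tmpdir"),
                "okmcheck" + System.nanoTime());

        List<String> cmd = expand(template, args[0], outBase.getPath());
        System.out.println("Running: " + cmd);

        // inheritIO() sends Tesseract's own messages to this console.
        // start() throws IOException if the binary cannot be executed at all.
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        int exitCode = process.waitFor();

        File expected = new File(outBase.getPath() + ".txt");
        System.out.println("Exit code: " + exitCode);
        System.out.println("Expected output: " + expected
                + (expected.exists() ? " (found)" : " (MISSING)"));
    }
}
```

Running this as the same user and with the same environment as Tomcat, and with a sample .gif as the argument, would show whether Tesseract itself fails in that context or whether the file is written but not where the extractor expects it.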
