• Error extracting text from a file

  • OpenKM has many interesting features, but requires some configuration process to show its full potential.
 #54806  by MarcoOliveira
 
As recently as last Friday I was able to upload documents and their content was extracted.

When I ran the test under Administration -> Utilities -> Check text extraction, it also succeeded.

Today, Monday, I can no longer do either, and nothing has changed. I'm using the same documents.

The log is attached.
Thank you!


Database.
https://prnt.sc/3slUAW8hd1O9

VERSION: 6.3.12 CE
 #54811  by jllort
 
You should read the logs carefully, because they usually show most of what you need to solve the problem.

The error seems clear, at least as far as what you are missing:
Failed loading language 'fra'
Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/spa.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata"
directory.
Failed loading language 'spa'
Tesseract Open Source OCR Engine v4.1.1 with Leptonica
Error in pixReadStream: Pdf reading is not supported
Error in pixRead: pix not read
Error during processing.
At the very least, the Spanish and French language data for Tesseract are not installed; that is the first thing you must fix. If the "Error in pixRead: pix not read" problem continues after that, you should investigate it separately. Search Google for "Tesseract Error in pixReadStream: Pdf reading is not supported".
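If it helps, here is a minimal Python sketch for pulling the missing languages out of a log like the one above and checking whether the matching .traineddata files exist. The function names (failed_languages, missing_traineddata) are hypothetical helpers, not part of OpenKM or Tesseract, and the tessdata directory is only assumed from the path shown in the log; adjust it to your installation.

```python
import os
import re

# Matches the Tesseract message seen in the log, e.g. "Failed loading language 'fra'".
FAILED_LANG = re.compile(r"Failed loading language '([A-Za-z_]+)'")


def failed_languages(log_text):
    """Return the language codes Tesseract failed to load, in order, without duplicates."""
    seen = []
    for code in FAILED_LANG.findall(log_text):
        if code not in seen:
            seen.append(code)
    return seen


def missing_traineddata(codes, tessdata_dir):
    """Return the codes whose <code>.traineddata file is absent from tessdata_dir."""
    return [
        code for code in codes
        if not os.path.isfile(os.path.join(tessdata_dir, code + ".traineddata"))
    ]


# Excerpt of the log posted in this thread.
sample_log = """Failed loading language 'fra'
Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/spa.traineddata
Failed loading language 'spa'
"""

codes = failed_languages(sample_log)
print(codes)  # ['fra', 'spa']

# Assumption: tessdata directory taken from the path in the log above.
print(missing_traineddata(codes, "/usr/share/tesseract-ocr/4.00/tessdata"))
```

Any code printed by the second call still needs its language data installed, or TESSDATA_PREFIX pointed at the directory that actually contains it, as the log message itself suggests.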
