Hi,
I have spent several days configuring OpenKM. I would like to use it to manage my documents at home. The OCR feature is critical, because I want the contents of all uploaded documents to be searchable. That is all I need.
I've installed OpenKM Community 6.3.2 under Debian Stretch 4.7.8-1 (2016-10-19) x86_64 GNU/Linux.
I've installed Tesseract 3.04.01.
I've installed all the required Java components.
Below is the configuration that I entered in the administration tab in OpenKM.
Code:
registered.text.extractors= com.openkm.extractor.Tesseract3TextExtractor -l eng
system.ocr=/usr/bin/tesseract
system.ocr.rotate= 90;180;270;
system.pdf.force.ocr=TRUE
The OCR feature does not seem to be working. When I run Tesseract from the command line, I am able to get results.
In the log file I see the following message:
Code:
WARN com.openkm.extractor.RegisteredExtractors- Text extraction failure: Full text indexing of 'image/png' is not supported
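For anyone who wants to reproduce the command-line check, here is a minimal sketch in Python. It only wraps the standard `tesseract <image> <outputbase> -l <lang>` invocation. The function names and the file names in the usage note are placeholders for illustration and are not part of OpenKM or of my actual setup.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang="eng", binary="/usr/bin/tesseract"):
    """Return the argument list for: tesseract <image> <output_base> -l <lang>."""
    return [binary, str(image), str(output_base), "-l", lang]


def run_ocr(image, output_base, lang="eng", binary="/usr/bin/tesseract"):
    """Run Tesseract on one image and return the recognised text.

    Returns None when the binary cannot be found, so a wrong path
    shows up as a distinct case rather than as an exception.
    """
    if shutil.which(binary) is None:
        return None
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_tesseract_command(image, output_base, lang, binary), check=True)
    # Tesseract writes its plain-text result to "<output_base>.txt".
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `run_ocr("scan.png", "scan_out")` on a hypothetical image `scan.png` should return the recognised text if Tesseract itself works. This only confirms that the binary and language data are usable outside OpenKM; it says nothing about how OpenKM invokes the extractor.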