How Can I Set An Alternative Text Extractor?

Posted: Tue Feb 28, 2023 7:23 am
by jrdavid
Hello, I am wondering why my searches yield zero results.
When I check the text extraction in utilities, I see that "com.openkm.extractor.AbbyTextExtractor" is in use and no keywords are extracted. I guess this is somehow related to ABBYY FineReader.
Nothing related to ABBYY is installed on that machine, and I do not know why the Abby extractor is configured. I read that there are alternative extractors, but I don't see how to configure them correctly.

In general, I do not need OCR, as all PDFs are already OCRed by ABBYY FineReader on another machine.
Of course, text extraction from other file types (Word, Excel) would be useful.

Thanks in advance.

Re: How Can I Set An Alternative Text Extractor?

Posted: Mon Mar 06, 2023 8:40 am
by jllort
Take a look at https://docs.openkm.com/kcenter/view/ok ... actor.html, which shows how to create one.

Also, the classes in this package may help you: https://github.com/openkm/document-mana ... /extractor