When users add PDF documents to OpenKM, is there a simple way to run an external application against each document first?
I already have an effective program that converts an image-based PDF to a searchable PDF:
http://www.scantopdf.com/en/product/fil ... _line.aspx
I noticed the great feature already in OpenKM that detects when a PDF has no text layer and then performs conversion and OCR on it. Could it run this program instead?
It works a bit better (although it is not free), and because the PDF itself becomes searchable, the benefit goes beyond indexing in OpenKM searches: in the embedded OpenKM file preview you can also search for text within the document and it highlights the matches.
I'm guessing the best way would be to build another extractor class, something like "com.openkm.extractor.PDFToSearchablePDF", and then change the class configured in the system to use that one. But I was curious whether it is already possible to simply attach a conversion process to the upload of files of a given type?
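To make the idea concrete, here is a rough sketch of the part I would expect to write either way: a small Java helper that runs an external converter on a PDF and returns the converted file. It does not use any OpenKM API. The class name SearchablePdfConverter, the method buildCommand, the executable path and the "input output" argument order are all my own placeholders, so the real command-line syntax of the tool would need to be checked.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

/**
 * Sketch only: wraps an external "image PDF to searchable PDF" command-line tool.
 * Nothing here is OpenKM API; all names are hypothetical.
 */
public class SearchablePdfConverter {
    // Path to the converter executable (assumption: supplied by configuration).
    private final String executable;

    public SearchablePdfConverter(String executable) {
        this.executable = executable;
    }

    // Assumed argument order: <executable> <input.pdf> <output.pdf>.
    // The real tool may need different flags.
    static List<String> buildCommand(String executable, Path input, Path output) {
        return List.of(executable, input.toString(), output.toString());
    }

    /** Runs the converter and returns the path of the searchable PDF. */
    public Path convert(Path input) throws IOException, InterruptedException {
        // Creates an empty temp file; assumes the tool overwrites existing output.
        Path output = Files.createTempFile("searchable-", ".pdf");

        Process process = new ProcessBuilder(buildCommand(executable, input, output))
                .redirectErrorStream(true)
                // Discard tool output so a full pipe buffer cannot block the process.
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();

        // Guard against a hung conversion; the 5 minute limit is arbitrary.
        if (!process.waitFor(5, TimeUnit.MINUTES)) {
            process.destroyForcibly();
            Files.deleteIfExists(output);
            throw new IOException("Converter timed out for " + input);
        }
        if (process.exitValue() != 0 || Files.size(output) == 0) {
            Files.deleteIfExists(output);
            throw new IOException("Converter failed for " + input
                    + " (exit code " + process.exitValue() + ")");
        }
        return output;
    }
}
```

A custom extractor, or whatever upload hook turns out to exist, could call convert() on a temporary copy of the uploaded file and fall back to the built-in OCR path if it throws.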
In the meantime I suppose I'll start investigating building the new extractor class...
Thanks for any input.
