Any thoughts on supporting multiple OCR engines on import (or maybe OpenKM already does)? Each engine would export its text for indexing.
Upload Document / Image
Phase 1: Tesseract 3 OCR process
Phase 2: Tesseract 2 OCR process
Phase 3: Cuneiform OCR process
Phase 4: ABBYY OCR process
Phase 5: Generic OCR handler process
This is just an extreme example, but many OCR engines are strong on some formats and weak on others. It would be nice if OpenKM supported multiple stages of OCR processing, perhaps definable by file type, so that a JPG would only go through Cuneiform while a TIFF would go through three stages, and so on.
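To make the idea concrete, here is a rough sketch of what a per-file-type stage list could look like. This is not OpenKM code or configuration. The names `PIPELINES`, `ENGINES`, `stages_for` and `extract_text` are made up for illustration, and the engines are stubs standing in for real OCR wrappers.

```python
from typing import Callable, Dict, List

# An OCR engine is modelled as a plain function: file bytes -> extracted text.
OcrEngine = Callable[[bytes], str]


def make_stub_engine(name: str) -> OcrEngine:
    """Hypothetical stand-in for a wrapper around a real OCR engine."""
    def run(data: bytes) -> str:
        return f"[{name}: {len(data)} bytes]"
    return run


# Registry of available engines (all stubs in this sketch).
ENGINES: Dict[str, OcrEngine] = {
    name: make_stub_engine(name)
    for name in ("tesseract3", "tesseract2", "cuneiform", "abbyy", "generic")
}

# Assumed configuration: file extension -> ordered list of OCR stages.
PIPELINES: Dict[str, List[str]] = {
    "jpg": ["cuneiform"],
    "tiff": ["tesseract3", "tesseract2", "cuneiform"],
}
DEFAULT_PIPELINE: List[str] = ["generic"]


def stages_for(filename: str) -> List[str]:
    """Pick the stage list from the file extension (case-insensitive)."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return PIPELINES.get(ext, DEFAULT_PIPELINE)


def extract_text(filename: str, data: bytes) -> str:
    """Run every configured stage and join the results for indexing."""
    return "\n".join(ENGINES[stage](data) for stage in stages_for(filename))
```

With this mapping, `stages_for("scan.jpg")` returns only the Cuneiform stage, while a `.tiff` upload runs through three stages and the combined text is what gets indexed. Anything without a configured extension falls back to the generic handler.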
