I can't search by content
Posted: Wed Dec 17, 2008 3:23 am
by Iwen
I installed OKM 3.0.
I imported a lot of Chinese PDFs, which are scanned documents that have been processed with OCR.
I can't search them by content.
But it works fine in OKM 2.0.
Why? :blink:
Re: I can't search by content
Posted: Wed Dec 17, 2008 12:08 pm
by pavila
Are the PDFs composed of images or text? Can you post a sample PDF here?
Re: I can't search by content
Posted: Thu Dec 18, 2008 2:09 am
by Iwen
The PDFs are composed only of text.
When I open a PDF file in Adobe Reader, I can select text in it and search for words in it.
Re: I can't search by content
Posted: Thu Dec 18, 2008 1:05 pm
by jllort
Put one of them on the OpenKM demo, and tell us its location and the search you're doing so we can test it.
Re: I can't search by content
Posted: Fri Dec 19, 2008 1:11 am
by Iwen
I logged in to the OpenKM demo as user9 and created a folder named Iwen in My documents, where I put one file. The file name is "2.09 明达投资顾问有限公司简介.pdf". I searched by content for "明达", which comes right after "2.09", but got no result.
When I open this file in Adobe Reader, I can search for "明达" in the PDF.
Re: I can't search by content
Posted: Fri Dec 19, 2008 10:26 am
by pavila
OpenKM uses a generic indexing algorithm, which would need to be tuned for Chinese to give better search results. By the way, the document you mention is a PDF composed of images. It won't be indexed; only PDFs with text can be indexed.
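
To illustrate only the first point (tuning the index for Chinese), here is a minimal Python sketch. It is not OpenKM's actual indexing code. It assumes a generic analyzer that splits text on non-word characters, so an unspaced run of Chinese becomes one long token and a query for "明达" never matches it. A bigram tokenizer, a common approach for CJK text, indexes every pair of adjacent characters instead. The function names and the toy in-memory index are hypothetical.

```python
import re

def generic_tokens(text):
    # Assumed "generic" behaviour: split on anything that is not a word
    # character. An unspaced Chinese run stays a single long token.
    return [t for t in re.split(r"\W+", text) if t]

def is_cjk(ch):
    # Basic CJK Unified Ideographs block only (enough for this demo).
    return "\u4e00" <= ch <= "\u9fff"

def bigram_tokens(text):
    # Break each all-CJK token into overlapping two-character tokens.
    # Other tokens (including mixed ones) are kept whole.
    tokens = []
    for token in generic_tokens(text):
        if len(token) > 1 and all(is_cjk(ch) for ch in token):
            tokens.extend(token[i:i + 2] for i in range(len(token) - 1))
        else:
            tokens.append(token)
    return tokens

def build_index(docs, tokenize):
    # Toy inverted index: token -> set of document names.
    index = {}
    for name, text in docs.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(name)
    return index

docs = {"doc1": "2.09 明达投资顾问有限公司简介"}
generic_index = build_index(docs, generic_tokens)
bigram_index = build_index(docs, bigram_tokens)

print(generic_index.get("明达", set()))  # set()
print(bigram_index.get("明达", set()))   # {'doc1'}
```

None of this helps with a PDF that contains only images, since there is no text to tokenize in the first place.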
Re: I can't search by content
Posted: Thu Feb 24, 2011 6:17 am
by joako
So, I am wondering: can a "searchable PDF" of a scanned image not be searched by content? If an OCR process is run on the PDF file, I can search it in Acrobat and other applications, and even select the text and copy and paste it around. I can provide a sample, but right now the ones I am testing have sensitive data.
Re: I can't search by content
Posted: Mon Feb 28, 2011 10:36 am
by pavila
Sorry, but I need a sample PDF to check why the text extraction fails. If your documents have sensitive data, maybe you want to contact us and become a customer.
Re: I can't search by content
Posted: Tue Mar 01, 2011 12:10 am
by joako
I've scanned a document that couldn't have any private data in it and I've posted it over here:
http://forum.openkm.com/viewtopic.php?f=4&t=4581
Re: I can't search by content
Posted: Wed Mar 02, 2011 8:15 am
by pavila
Answered in the other thread.