content searching not working properly

Posted: Thu Aug 05, 2010 12:55 pm
by ibrahim
Hi

So far I have configured OpenKM 4.0 with MySQL 5.0, and it is working as required.

However, whenever I search for documents 'by content', the results show only .txt files. Files with the extensions .doc, .docx, .xls, .xlsx, .pdf, etc. do not appear in the results.

Have you got any ideas to solve this problem?

best regards, Ibrahim

Re: content searching not working properly

Posted: Thu Aug 05, 2010 3:36 pm
by jllort
Very strange. Normally the problem is that the queries are not written correctly.

Please do the following:
1- Test the document in our online demo at demo.openkm.com and tell me whether it works there. If it does not, tell me where you uploaded the document and what query you ran.

Re: content searching not working properly

Posted: Fri Aug 06, 2010 12:19 pm
by ibrahim
Hi,

I have tested the same thing in your demo, and it works fine. But in my installation, the search finds only .txt files.

I uploaded the documents to the Taxonomy (okm:root). The query I am executing is as follows:
Code:
/jcr:root/okm:root//*[@jcr:primaryType eq 'okm:void' or (@jcr:primaryType eq 'okm:document' and jcr:contains(okm:content,'ibrahim'))] order by @jcr:score descending 
Have you got any ideas?

best regards, Ibrahim
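
For anyone reproducing this, the query above can be assembled for an arbitrary search term. The following is only a sketch in Python: the function name `build_content_query` and its parameters are hypothetical and are not part of OpenKM or Jackrabbit. It rejects terms containing quotes or backslashes instead of guessing at the escaping rules.

```python
def build_content_query(term: str, root: str = "okm:root") -> str:
    """Build the content-search XPath query quoted in the post above.

    Hypothetical helper for illustration only; it just formats a string.
    """
    # Refuse terms that would need escaping inside the quoted literal.
    if not term or any(ch in term for ch in "'\"\\"):
        raise ValueError("empty term or term containing quotes/backslashes")
    return (
        f"/jcr:root/{root}//*[@jcr:primaryType eq 'okm:void' or "
        f"(@jcr:primaryType eq 'okm:document' and "
        f"jcr:contains(okm:content,'{term}'))] "
        "order by @jcr:score descending"
    )


if __name__ == "__main__":
    print(build_content_query("ibrahim"))
```

Running the same term against a folder that holds one .txt and one .doc file with identical text makes it easy to see whether the query or the indexing is at fault.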

Re: content searching not working properly

Posted: Mon Aug 09, 2010 2:33 pm
by jllort
The only problem I can imagine is that, for some reason, the content of the .doc files has not been indexed. Which version of the .doc format are you using? The demo runs OpenKM 5.0, which has better support for Office formats than 4.X (we upgraded some libraries that were not yet available when we released version 4.X).

Re: content searching not working properly

Posted: Fri Aug 20, 2010 10:57 am
by ibrahim
Hi,

I have solved the content search problem with Microsoft Office files. But I am still not able to search the contents of .pdf files. Whenever I upload a PDF file, I get a warning on the server side:
Code:
org.apache.jackrabbit.extractor.PdfTextExtractor - Failed to extract PDF text content 
Do you have any ideas to solve this problem?

best regards, Ibrahim

Re: content searching not working properly

Posted: Sun Aug 22, 2010 8:22 am
by jllort
Does that PDF have some restriction, such as a password, copying disabled, or similar? Could you try with another PDF that you created yourself, without any restrictions? The problem is probably there.
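
A quick way to check a file for such restrictions before uploading it is to look for an `/Encrypt` entry, which password-protected and permission-restricted PDFs normally carry in their trailer. The script below is a rough heuristic sketch in Python, not how OpenKM or Jackrabbit detects this; the function name `pdf_restriction_hint` is made up for illustration, and a plain byte search can give false positives.

```python
import sys
from pathlib import Path


def pdf_restriction_hint(data: bytes) -> str:
    """Return a rough hint about whether a PDF is encrypted.

    Heuristic only: scans the raw bytes for an /Encrypt key and does not
    parse the PDF structure.
    """
    # The %PDF- header should appear near the start of the file.
    if b"%PDF-" not in data[:1024]:
        return "not a PDF"
    # Encrypted or permission-restricted files reference an /Encrypt dictionary.
    if b"/Encrypt" in data:
        return "encrypted"
    return "no encryption marker"


if __name__ == "__main__":
    # Usage: python check_pdf.py file1.pdf file2.pdf
    for name in sys.argv[1:]:
        print(name, "->", pdf_restriction_hint(Path(name).read_bytes()))
```

If a file reports "no encryption marker" and text extraction still fails, the restriction theory is unlikely and the cause is probably elsewhere, for example in the extractor library version.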

Re: content searching not working properly

Posted: Thu Aug 26, 2010 10:13 am
by ibrahim
I created my own .pdf file and tried again, but the same problem occurs. Moreover, I also tried it in your demo, and it works fine there.

Now I am not getting the problem with .pdf files.

Do you have any suggestions related to this problem?

Re: content searching not working properly

Posted: Fri Aug 27, 2010 8:28 am
by jllort
I need to see that PDF to understand what happens in version 4.X.