Open Source Document Management System | OpenKM - Content Search not working for 6.3.0 for special characters

#31094 by vaibhavk
Fri Jan 30, 2015 5:10 am

Hi OpenKM Support Team,

We have OpenKM Community Edition 6.3.0 installed on our machine. Browsers used: Mozilla Firefox, IE

In the content search, we used the search criteria below.

Searching for the text 422.50 returned a set of documents containing the text 422.50(a)(2); however, when I search for 422.50(a)(2) or 422.50(a), no results are returned.

Is there any configuration we need to change so that content with special characters like (,;"''{[ is returned in the results, or are special characters not supported in search?

Note: The text extraction job has completed for all the documents that have been uploaded. We are mostly searching PDF and Word documents.

Please let us know whether search supports only 'Search for any word' or 'Search for exact word'.

Also, according to the community site, the latest Community build available is 6.3.1, but we could only find a download link for OpenKM version 6.3.0. Please let us know if 6.3.1 is available for download too.

Regards,
Vaibhav

Re: Content Search not working for 6.3.0 for special characters

#31129 by jllort
Sun Feb 01, 2015 1:17 pm

Basically, the Lucene search engine, probably with the default tokenizer, stores the text in the index as separate words. The default tokenizer works well for most people, but sometimes it is worth building your own or using a different one. These classes take the text and split it into words according to the tokenizer's rules; for example, "some-text" is split into the two words "some" and "text" because the "-" character is treated as a separator.
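To make the splitting described above concrete, here is a minimal Python sketch. It is not Lucene's actual tokenizer, only a rough approximation that assumes runs of letters and digits form words, a dot between digits is kept, and every other character is a separator. The name `simple_tokenize` is made up for this example.

```python
import re

# Rough approximation of a word-splitting tokenizer (NOT Lucene's real algorithm):
# digit runs joined by dots stay together, other runs of letters/digits form words,
# and every other character acts as a separator.
_TOKEN = re.compile(r"[0-9]+(?:\.[0-9]+)*|[A-Za-z0-9]+")

def simple_tokenize(text):
    """Return the list of tokens this simplified tokenizer would index."""
    return _TOKEN.findall(text)

print(simple_tokenize("some-text"))
print(simple_tokenize("422.50(a)(2)"))
```

Under these assumptions "some-text" yields `some` and `text`, and "422.50(a)(2)" yields `422.50`, `a` and `2`, so the parentheses never reach the index. That would be consistent with a search for 422.50 finding the documents while a search that relies on the parentheses does not.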

I think you should use org.apache.lucene.analysis.WhitespaceAnalyzer, which treats only white space as a separator. First change the default analyzer and restart OpenKM, then reindex the whole repository so the change takes effect (go to Administration -> Utilities -> Rebuild indexes -> choose Lucene indexes).
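As a rough illustration of what whitespace-only splitting changes, the sketch below uses Python's `str.split()` as a stand-in for a whitespace analyzer. It is a simplified model, not the Lucene class itself; the helper name `whitespace_tokenize` and the sample sentence are hypothetical.

```python
def whitespace_tokenize(text):
    """Split on runs of whitespace only; punctuation stays inside the tokens."""
    return text.split()

tokens = whitespace_tokenize("See section 422.50(a)(2) for details")
print(tokens)

# Exact-token lookup, used here as a stand-in for a term query against the index.
print("422.50(a)(2)" in tokens)
print("422.50" in tokens)
```

In this model the whole string 422.50(a)(2) is a single token, so it can be matched exactly, but a search for plain 422.50 no longer matches that token. Whitespace-only splitting also leaves case untouched, so it is worth checking on a test repository whether the trade-off suits your searches before rebuilding the indexes.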

Consider taking a look here: http://wiki.openkm.com/index.php/Indexing_configuration

Re: Content Search not working for special characters

#40116 by vaibhavk
Fri Jul 17, 2015 12:06 pm

Hi,

We are using OpenKM 6.2.5.

I am uploading a text document (.txt) with !#$%&'()+,-.0123456789 as the title.

But when I search for that title, I get an error. Please find the attached screenshot.

Can you please tell us what the problem could be?
Is there any limitation in OpenKM on special characters such as ' , ( )?

Please reply

Thanks

Attachments

OpenKM search.png (25.97 KiB)

Re: Content Search not working for 6.3.0 for special characters

#40123 by jllort
Sun Jul 19, 2015 10:17 am

Keep in mind that some characters are reserved and are passed on to the Lucene search engine. Regarding special characters, also keep in mind that when text goes into Lucene it passes through an analyzer, which keeps only some of the tokens for indexing and discards the rest. The default analyzer can be replaced with another one, or you can write your own. In most cases the default analyzer is the right choice, but not always. It all depends on what you expect to get from the search engine.
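For the reserved characters, one common approach is to backslash-escape them before the text reaches the query parser (Lucene's Java API provides `QueryParser.escape` for this). The Python sketch below only illustrates the idea. The character set follows the Lucene classic query syntax, the name `escape_query` is made up for this example, and whether the OpenKM search form accepts escaped input is an assumption you would need to verify.

```python
# Characters with special meaning in the Lucene classic query syntax.
LUCENE_RESERVED = set('+-&|!(){}[]^"~*?:\\')

def escape_query(text):
    """Prefix every reserved character with a backslash; leave the rest as-is."""
    return "".join("\\" + ch if ch in LUCENE_RESERVED else ch for ch in text)

print(escape_query("422.50(a)(2)"))
```

Escaping only stops the parser from treating the characters as operators. The analyzer still decides which tokens are actually indexed and searched, so escaping alone may not be enough with the default analyzer.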

Re: Content Search not working for 6.3.0 for special characters

#40210 by pavila
Fri Jul 31, 2015 12:32 pm

Please try to reproduce the issue with a recent nightly build from http://integration.openkm.com/6.3/
