
Difference performing a full text search in Lucene 2.4 and Lucene 3.1

Posted: Thu May 23, 2024 7:18 am
by alan_vallejo
I know this is really a question for the Lucene developers, but I'm posting it here in case you can guide or help me. I also posted it on Stack Overflow (https://stackoverflow.com/questions/785 ... lucene-3-1).

Our current Java application works with Lucene 2.4, and we are trying to migrate the indexing and file storage part to the OpenKM Community version, which works with Lucene 3.1 by default.

I'm facing some issues when performing the same search in two different versions of Lucene (2.4 and 3.1) using the same index. I think the problem lies in how the StandardAnalyzer class, which I'm using in both versions, has evolved.

Text to search: "Company, S.A."

LUCENE 2.4:

ParseQuery result: text:"company sa" N results.

LUCENE 3.1:

ParseQuery result: text:"company s.a" 0 results (expected the same results as in version 2.4)

The funny thing is that searching with Lucene 2.4 returns the results I expect, while searching with Lucene 3.1 doesn't.

I've read up on how phrase search works in Lucene and learned that when Lucene indexes a document, it stores the terms that belong to the document along with their positions in it. So I can understand that the analyzer changed in version 3.1 and that terms are now extracted differently, but when it extracts the terms it should still work the same way!
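To make the positional matching described above concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class `PhraseSketch` and its methods are hypothetical names, and it assumes a single document whose terms were already produced by the analyzer. It shows why an index that holds the term "sa" cannot satisfy a phrase query whose second term is "s.a", even though the positions line up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy positional index for one field of one document (hypothetical, not Lucene). */
final class PhraseSketch {
    // term -> positions at which the term occurs in the document
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Indexes already-analyzed terms, recording the position of each one. */
    PhraseSketch(List<String> indexedTerms) {
        for (int pos = 0; pos < indexedTerms.size(); pos++) {
            postings.computeIfAbsent(indexedTerms.get(pos), t -> new ArrayList<>()).add(pos);
        }
    }

    /** Returns true if the query terms occur at consecutive positions (exact phrase). */
    boolean matchesPhrase(List<String> queryTerms) {
        if (queryTerms.isEmpty()) {
            return false;
        }
        List<Integer> starts = postings.get(queryTerms.get(0));
        if (starts == null) {
            return false; // first term was never indexed
        }
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < queryTerms.size() && ok; i++) {
                List<Integer> positions = postings.get(queryTerms.get(i));
                // The i-th query term must sit exactly i positions after the first one.
                ok = positions != null && positions.contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumption: the index was built with terms like those in the 2.4 query output.
        PhraseSketch index = new PhraseSketch(Arrays.asList("company", "sa"));
        System.out.println(index.matchesPhrase(Arrays.asList("company", "sa")));  // true
        System.out.println(index.matchesPhrase(Arrays.asList("company", "s.a"))); // false
    }
}
```

In this model the term text is compared exactly, so the position of "company" is irrelevant once the second query term is not in the index at all.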

Another thing I don't understand is that when I perform a similar search (without the dots), both versions return the same results.

Text to search: "Company, SA"

LUCENE 2.4:

ParseQuery result: text:"company sa" N results.

LUCENE 3.1:

ParseQuery result: text:"company sa" same N results as version 2.4

So when Lucene (in version 3.1) indexes the term "s.a", what exactly is it doing with it, and why is it not positioning it after the term "company"?

So, here are my questions for the OpenKM developers: if I set an earlier Lucene version in the configuration file, what am I really changing? The way the analyzer works? The way the index is built? The speed of the search?

Re: Difference performing a full text search in Lucene 2.4 and Lucene 3.1

Posted: Fri May 24, 2024 8:31 am
by pavila
Sorry, only questions about OpenKM.

Re: Difference performing a full text search in Lucene 2.4 and Lucene 3.1

Posted: Fri May 24, 2024 10:48 am
by alan_vallejo
OK, thanks! No worries.

I've searched the code and realised that when you change the Lucene version, everything works with that version (QueryParser, IndexWriter...).

Lucene 3.0 works the same way as Lucene 2.4 does, so I'll downgrade to that version until I can build a new version of OpenKM with an earlier Hibernate, Hibernate Search and Lucene version.
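For anyone hitting the same thing, here is a rough sketch of why the version setting matters. It is a toy model in plain Java, not Lucene's tokenizer: `AnalyzerSketch`, `Mode.CLASSIC` and `Mode.UAX29_LIKE` are hypothetical names. It only reproduces the two parsed queries quoted earlier in the thread, on the assumption that the older analysis strips the dots from an acronym such as "S.A." while the 3.1 analysis keeps the interior dot. It handles nothing else (hosts, e-mail addresses, numbers).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy stand-in for version-dependent analysis (hypothetical, not Lucene). */
final class AnalyzerSketch {
    /** CLASSIC models the 2.4/3.0 output seen above, UAX29_LIKE the 3.1 output. */
    enum Mode { CLASSIC, UAX29_LIKE }

    static List<String> analyze(String text, Mode mode) {
        List<String> terms = new ArrayList<>();
        // Lowercase, then split on whitespace, commas and double quotes.
        for (String raw : text.toLowerCase(Locale.ROOT).split("[\\s,\"]+")) {
            // Both modes drop leading and trailing dots.
            String term = raw.replaceAll("^\\.+|\\.+$", "");
            if (mode == Mode.CLASSIC) {
                // Assumption: acronym dots are removed entirely ("s.a" becomes "sa").
                term = term.replace(".", "");
            }
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Company, S.A.", Mode.CLASSIC));    // [company, sa]
        System.out.println(analyze("Company, S.A.", Mode.UAX29_LIKE)); // [company, s.a]
        System.out.println(analyze("Company, SA", Mode.UAX29_LIKE));   // [company, sa]
    }
}
```

If the index was written with the older behaviour, it contains "sa", so only a query analyzed the same way can find it. That is consistent with the search working again once the whole stack runs with the 3.0 setting.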

Thanks for your time and for all the work you've done!