During Text Extractor - How to skip new line characters

JavaDev
Fresh Boarder
Posts: 12
Joined: Thu Apr 09, 2015 12:44 pm


Post by JavaDev » Mon Nov 14, 2016 9:17 am

Hi,

We are using OpenKM v6.2.5.
In content search, we find that if a word spans two lines, with a hyphen at the end of the first line, that word is not searchable.

Ex:
Line 1: My SSN is 1234-5678-
Line 2: 1234.

When we search for "1234-5678-1234", the text does not match, because during extraction the extractor adds a newline character after the hyphen.

Is there a way to resolve this issue and get content search working in scenarios like the one above?

jllort
Moderator
Posts: 9380
Joined: Fri Dec 21, 2007 11:23 am
Location: Sineu - ( Illes Balears ) - Spain

Re: During Text Extractor - How to skip new line characters

Post by jllort » Wed Nov 16, 2016 10:35 am

Which MIME type are you talking about?
Does the document contain a lot of text, or only a few lines?

pavila
Moderator
Posts: 3016
Joined: Tue Dec 11, 2007 6:02 pm
Location: Alicante, Spain

Re: During Text Extractor - How to skip new line characters

Post by pavila » Wed Jan 18, 2017 6:43 am

You can implement your own custom text extractor. Take a look at the implementation of any of them.
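
As a rough illustration of what such a custom extractor might do with the extracted text, here is a minimal sketch of the post-processing step only. The class and method names (HyphenLineJoiner, joinHyphenatedBreaks) are hypothetical and not part of OpenKM; how the helper gets wired into an extractor depends on the extractor classes in your OpenKM version, so that part is left out.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not an OpenKM class: it only post-processes text
// that some extractor has already produced.
final class HyphenLineJoiner {

    // A hyphen, optional trailing spaces/tabs, a line break (\n or \r\n),
    // then optional leading spaces/tabs on the next line.
    private static final Pattern HYPHEN_BREAK =
            Pattern.compile("-[ \\t]*\\r?\\n[ \\t]*");

    private HyphenLineJoiner() {
    }

    // Joins lines that end with a hyphen to the following line,
    // keeping the hyphen and dropping the line break.
    static String joinHyphenatedBreaks(String text) {
        if (text == null) {
            return null;
        }
        return HYPHEN_BREAK.matcher(text).replaceAll("-");
    }
}
```

With the example from the first post, joinHyphenatedBreaks("My SSN is 1234-5678-\n1234.") gives "My SSN is 1234-5678-1234.", so the full number ends up on one line before indexing. Note that this sketch keeps the hyphen, which fits the number above but would turn an ordinary word split as "exam-" / "ple" into "exam-ple" rather than "example"; whether that is acceptable depends on your documents.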

Regards.
