future of thesaurus, KEA, and synonyms ?

Posted: Wed May 22, 2013 12:20 am
by Bruno
Automatic key extraction was removed from OpenKM Community 6.2.4 and OpenKM Professional 6.2.15 due to obsolete and removed libraries
(http://wiki.openkm.com/index.php/Automa ... ll_example)

Will something else replace KEA to automatically add keywords from a controlled vocabulary (thesaurus) ?
What libraries were obsolete and removed ?

Can you tell us more about how you envision the future use of thesauri in OpenKM ?
I read (http://forum.openkm.com/viewtopic.php?f ... cet#p20621) that you are thinking about faceted searches, which sounds like a great, industry-standard improvement (and seems possible with Lucene, e.g. http://lucene.apache.org/core/4_3_0/fac ... guide.html)

It seemed to me, from previous forum answers (http://forum.openkm.com/viewtopic.php?f ... nym#p16867), that the OKM thesaurus, based on SKOS, cannot make use of synonyms, though there seem (from online searches) to be various ways to implement this. Also, the product features page http://www.openkm.com/en/overview/features.html mentions :
Search by synonyms
Stemming, stopwords support and synonyms support
How can this apply to searches if synonyms are not managed by the thesaurus ? Is there any OpenKM documentation available about these features ? I couldn't find any in the wiki or the forum.

I know building a thesaurus is hard work. Most of the available ones (like Agrovoc) include various sorts of synonyms (or "use for", "preferred terms", etc.), and, as suggested in the quote below (originally in Spanish), I will go step by step while taking inspiration and structure from existing, complex ones. So I need to know what to drop and what to keep, while keeping future OKM evolutions in mind if possible (hence the earlier questions on the future of KEA) :
I would start with an XML file and a plain-text editor, creating a very basic structure, and then edit it with Protégé. This way of working has served me better than starting the file directly in Protégé.
http://forum.openkm.com/viewtopic.php?f ... urus#p6944

If all the traditional relations used in thesauri (including synonyms of some sort) could be used for automatic keyword extraction (or to help with manual selection of keywords), that would seem like one of the best possible advances for knowledge management systems like OpenKM. And from my reading (though I am no technical expert), I suppose this can be done.
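To make the idea concrete, here is a minimal sketch of what I have in mind, written in Python only for illustration. It is not how KEA or OpenKM work: the mini-thesaurus, the field names (`alt_labels`, `broader`) and the function names are all hypothetical. Synonyms ("use for" labels) are mapped back to their preferred term, and broader terms can optionally be added as keywords too.

```python
import re

# Hypothetical mini-thesaurus (assumption, not real Agrovoc data):
# preferred term -> synonyms ("use for" labels) and one broader term (BT).
THESAURUS = {
    "maize": {"alt_labels": ["corn", "zea mays"], "broader": "cereals"},
    "cereals": {"alt_labels": ["grain crops"], "broader": None},
    "irrigation": {"alt_labels": ["watering"], "broader": None},
}


def build_label_index(thesaurus):
    """Map every label (preferred or synonym), lower-cased, to its preferred term."""
    index = {}
    for pref, entry in thesaurus.items():
        index[pref.lower()] = pref
        for alt in entry["alt_labels"]:
            index[alt.lower()] = pref
    return index


def extract_keywords(text, thesaurus, include_broader=False):
    """Return the sorted preferred terms whose labels occur in the text."""
    index = build_label_index(thesaurus)
    lowered = text.lower()
    found = set()
    for label, pref in index.items():
        # Whole-word match, so that "corn" does not match "cornice".
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            found.add(pref)
    if include_broader:
        # Walk up the BT chain and add each ancestor as a keyword.
        for pref in list(found):
            parent = thesaurus[pref]["broader"]
            while parent is not None:
                found.add(parent)
                parent = thesaurus[parent]["broader"]
    return sorted(found)


if __name__ == "__main__":
    sample = "Corn yields improved after watering schedules changed."
    print(extract_keywords(sample, THESAURUS))
    print(extract_keywords(sample, THESAURUS, include_broader=True))
```

In this sketch a document that only mentions "corn" still gets the controlled keyword "maize", which is the behaviour I would hope for from a thesaurus that keeps its synonyms.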

Anyway, at this point, if I plan to build a thesaurus (with Protégé, or maybe a dedicated installation of VocBench http://aims.fao.org/tools/vocbench-2/background, which manages the Agrovoc thesaurus), can I skip synonyms altogether (and other relations like "use for"...) if they are not planned to be integrated in OpenKM ? That is to say, should I just work with hierarchical structures and basic relations like Broader Term (BT) / Narrower Term (NT), or can I still use a complete thesaurus and let OpenKM skip the parts it won't use ?

Re: future of thesaurus, KEA, and synonyms ?

Posted: Wed May 22, 2013 3:19 pm
by pavila
The thesaurus feature is still present. The only feature removed was Automatic Keyword Extraction, because it was based on deprecated and missing libraries. The domain that hosted the Maven repository for these libraries no longer exists.

It would be nice if someone ported the old KEA feature to more recent (and maintained) libraries.