Keyword slow down import?
Posted: Tue May 09, 2017 1:19 pm
by openkm_user
Hi,
We currently have about 6 million documents, and some queries take a long time to execute, which slows down the import process. One of them involves keywords. Do keywords slow down importing? For each document, a keyword is set depending on the document name, and permissions are granted depending on a certain condition (again based on the document name).
I am not sure where I read it, but one of the threads mentioned that metadata is a better option than keywords (or is using keywords a very bad option?).
Thanks!
Re: Keyword slow down import?
Posted: Tue May 09, 2017 4:05 pm
by openkm_user
Code:
select COUNT(DISTINCT(NKW_KEYWORD)) from OKM_NODE_KEYWORD
This returns 24. Each document gets only one of these 24 keywords, so we aren't assigning a unique keyword to each document.
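To see how the documents are spread over those keywords, a per-keyword count may be more telling than the distinct count. The following is only a sketch: it runs against an in-memory SQLite table that imitates OKM_NODE_KEYWORD, the NKW_NODE column name is an assumption, and the sample rows are made up for illustration. Against the real database only the select statement would be needed.

```python
import sqlite3


def keyword_usage(conn):
    """Return (keyword, document_count) rows, most used keyword first."""
    return conn.execute(
        "select NKW_KEYWORD, COUNT(*) as DOCS "
        "from OKM_NODE_KEYWORD "
        "group by NKW_KEYWORD "
        "order by DOCS desc, NKW_KEYWORD"
    ).fetchall()


# Toy stand-in for the real table; NKW_NODE is an assumed column name
# and the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("create table OKM_NODE_KEYWORD (NKW_NODE text, NKW_KEYWORD text)")
conn.executemany(
    "insert into OKM_NODE_KEYWORD values (?, ?)",
    [("doc-1", "invoice"), ("doc-2", "invoice"), ("doc-3", "contract")],
)

usage = keyword_usage(conn)
print(usage)
```

A heavily skewed result (a few keywords attached to millions of rows) would suggest that any query filtering on the keyword has to touch a large share of the table.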
Re: Keyword slow down import?
Posted: Fri May 12, 2017 6:52 pm
by jllort
Using keywords in this scenario is a very bad idea. Keywords might work for small companies with 5-10 users who will not create many of them, or with a controlled dictionary (like a thesaurus), but even then I do not recommend them. For several reasons, which I will not explain now to keep the answer short, it is always better to use metadata. With metadata you can do the same as with keywords or categories, and more. For performance reasons we discourage keywords: keep in mind that the dashboard draws a tag cloud, and with a tag cloud of 25K keywords the browser can stall for several minutes.
Do you have 6 million docs in the OpenKM Community version? Is that your scenario? What OpenKM version are you using?
Re: Keyword slow down import?
Posted: Fri May 12, 2017 8:02 pm
by openkm_user
Yes, we have 6 million documents in OpenKM Community 6.3.3. We periodically purge the dashboard table since it is not required for our purpose; we use OpenKM as a container and access everything through the REST API.
If keywords are definitely a bad idea: I read somewhere that you mentioned keywords can be converted to metadata. Can you please let us know how to do that?
Re: Keyword slow down import?
Posted: Sat May 13, 2017 11:42 am
by jllort
The best option would be a crontab task or scripting task that iterates across the whole repository, gets the keywords, converts them to metadata, and removes the keywords. Consider these URLs as a starting point:
https://docs.openkm.com/kcenter/view/ok ... rsal-.html
https://docs.openkm.com/kcenter/view/ok ... etChildren (list of documents in a folder) -> each Document returns its associated keywords
https://docs.openkm.com/kcenter/view/ok ... oveKeyword (remove the keyword)
https://docs.openkm.com/kcenter/view/ok ... tiesSimple (set metadata)
https://docs.openkm.com/kcenter/view/ok ... field.html (a single input field is enough for your case).
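The conversion steps above can be sketched roughly as follows. This is not the OpenKM API: the Repo interface, its method names, the group name "okg:tags" and the field name "okp:tags.keyword" are all hypothetical placeholders that would have to be mapped to the real calls behind the links above. FakeRepo is an in-memory stand-in so the loop can be tried without a server.

```python
from typing import Dict, Iterable, List, Protocol


class Repo(Protocol):
    """Hypothetical client interface; map each method to the real API call."""

    def child_folders(self, folder: str) -> Iterable[str]: ...
    def child_documents(self, folder: str) -> Iterable[str]: ...
    def keywords(self, doc: str) -> List[str]: ...
    def set_metadata(self, doc: str, group: str, props: Dict[str, str]) -> None: ...
    def remove_keyword(self, doc: str, keyword: str) -> None: ...


def migrate(repo: Repo, root: str,
            group: str = "okg:tags",             # hypothetical group name
            field: str = "okp:tags.keyword"):    # hypothetical field name
    """Walk the tree from root, copy keywords into one metadata field, then remove them."""
    converted = 0
    pending = [root]  # explicit stack instead of recursion, for very deep trees
    while pending:
        folder = pending.pop()
        pending.extend(repo.child_folders(folder))
        for doc in repo.child_documents(folder):
            kws = list(repo.keywords(doc))  # copy, since we remove while looping
            if not kws:
                continue
            # Write the metadata first so a failure never loses the keyword.
            repo.set_metadata(doc, group, {field: ",".join(sorted(kws))})
            for kw in kws:
                repo.remove_keyword(doc, kw)
            converted += 1
    return converted


class FakeRepo:
    """In-memory stand-in used only to exercise the loop."""

    def __init__(self):
        self.folders = {"/okm:root": ["/okm:root/a"], "/okm:root/a": []}
        self.docs = {"/okm:root": ["/okm:root/x.pdf"],
                     "/okm:root/a": ["/okm:root/a/y.pdf"]}
        self.kw = {"/okm:root/x.pdf": ["invoice"], "/okm:root/a/y.pdf": []}
        self.meta = {}

    def child_folders(self, folder):
        return self.folders[folder]

    def child_documents(self, folder):
        return self.docs[folder]

    def keywords(self, doc):
        return self.kw[doc]

    def set_metadata(self, doc, group, props):
        self.meta[doc] = (group, props)

    def remove_keyword(self, doc, keyword):
        self.kw[doc].remove(keyword)


repo = FakeRepo()
count = migrate(repo, "/okm:root")
print(count, repo.meta)
```

Setting the metadata before removing the keyword means an interrupted run can simply be restarted: documents that were already converted no longer have keywords and are skipped.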
6 million docs is a very large repository, and in this kind of scenario you must consider a lot of things. Above 1 million docs you certainly need to tune the database. How do you catalog the files, that is, how do you choose the folder?
It is a good idea to separate the application server from the database server. Investigate the performance configuration for writes and for caching.
Are you using security in your repository, or only a single administrator user? (The best scenario is to remove any kind of security in the repository and use only the administrator connection, but obviously that works better if you do it from the beginning.)
Re: Keyword slow down import?
Posted: Tue May 16, 2017 10:15 am
by openkm_user
Thanks, I will take a look at all these.