OpenKM taking up constant CPU

Posted: Thu Feb 09, 2017 4:54 pm
by openkm_user
Hi,

Even though no activity is taking place in OpenKM, the OpenKM Java process is constantly using about 15% CPU. Can you please explain why?

The server is a Xeon with 16 GB of RAM.

Two days ago we imported about 100,000 documents into the system. Is some part of the import process still running and using the CPU?

Thanks!

Re: OpenKM taking up constant CPU

Posted: Sat Feb 11, 2017 9:42 am
by jllort
After a document is created, it goes into the "text extraction pending queue". A crontab task named "text extractor worker" runs periodically and starts a thread that extracts text from the queued documents so they can be indexed by content. You can configure how aggressive this process is; by default, if you have not changed the associated parameters, the application uses a single thread (a single core).

Re: OpenKM taking up constant CPU

Posted: Sun Mar 26, 2017 2:31 pm
by openkm_user
Thank you for your prompt response. I am sorry for the late reply. We have imported about 500,000 documents into our repository. What you say makes sense.

However, we really don't need to extract text from the imported content, so we have stopped the 'text extractor' task in the crontab. Could you please let us know how to keep imported content out of the text extraction queue? We just want to import it into the database.

We also notice that the OpenKM Tomcat process is constantly using about 7% CPU. After some investigation we found that com.openkm.core.updateinfo is what is using this CPU. What exactly does it do? It appears we can set updateinfo to false in the config file so that it will not run, but what is the purpose of updateinfo? Is it the same text extractor that you mentioned, or is it something different?

Thanks again :).

Re: OpenKM taking up constant CPU

Posted: Sun Mar 26, 2017 6:45 pm
by jllort
The UpdateInfo class connects to the OpenKM server and delivers notifications from our side to the application, for example "a new version has been released; you can upgrade your installation".

Take a look at this section of the OpenKM documentation: https://docs.openkm.com/kcenter/view/ok ... 0Nodetypes
You should be interested in OKM_NODE_DOCUMENT, and especially in two fields:
NDC_TEXT_EXTRACTED ( a value of T means the document has been processed by the queue; a value of F means it is still in the queue. By changing the value you can add files to the queue or remove them from it ).
NDC_TEXT, which contains the text extracted by the text extractor process. You can update this field manually or through the automatic process, but keep in mind that a change made directly in the database is not visible to the search engine ( the change was not made through the API, so from the Lucene search engine's point of view nothing has changed ). If you modify any field directly in the database and want to propagate the change to the search engine, you must reindex the whole repository: https://docs.openkm.com/kcenter/view/ok ... dexes.html ( the Lucene indexes option; in your case, with 500K files, it can take some hours to complete ).
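
As a rough illustration of the NDC_TEXT_EXTRACTED flag, here is a minimal sketch that flips pending rows from F to T so the extractor would skip them. It runs against an in-memory SQLite stand-in, not a real OpenKM database: only the table name and the two NDC_ columns come from the description above, while the NBS_UUID key column, the sample rows, and the helper function names are assumptions made for the example. On a real installation you would run the equivalent SQL against your own database engine, after taking a backup.

```python
import sqlite3

# Stand-in for the OpenKM database: an in-memory SQLite table holding only
# the columns discussed above. NBS_UUID is an assumed key column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OKM_NODE_DOCUMENT ("
    " NBS_UUID TEXT PRIMARY KEY,"
    " NDC_TEXT_EXTRACTED TEXT,"
    " NDC_TEXT TEXT)"
)
# Hypothetical sample data: one processed document, two still queued.
conn.executemany(
    "INSERT INTO OKM_NODE_DOCUMENT VALUES (?, ?, ?)",
    [
        ("doc-1", "T", "already extracted text"),
        ("doc-2", "F", None),
        ("doc-3", "F", None),
    ],
)
conn.commit()


def count_pending(connection):
    """Return how many documents are still flagged as waiting for extraction."""
    row = connection.execute(
        "SELECT COUNT(*) FROM OKM_NODE_DOCUMENT WHERE NDC_TEXT_EXTRACTED = 'F'"
    ).fetchone()
    return row[0]


def clear_queue(connection):
    """Mark every pending document as processed; return the number of rows changed."""
    cursor = connection.execute(
        "UPDATE OKM_NODE_DOCUMENT SET NDC_TEXT_EXTRACTED = 'T' "
        "WHERE NDC_TEXT_EXTRACTED = 'F'"
    )
    connection.commit()
    return cursor.rowcount


pending_before = count_pending(conn)
updated = clear_queue(conn)
pending_after = count_pending(conn)
print(pending_before, updated, pending_after)
```

Only the flag is touched here; NDC_TEXT stays empty for those documents, so they would remain unsearchable by content.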

Re: OpenKM taking up constant CPU

Posted: Mon Mar 27, 2017 4:04 pm
by openkm_user
Thanks for the detailed explanation. Can you please let me know how to turn off com.openkm.core.updateinfo? Is there a configuration parameter for it in Administration?

Re: OpenKM taking up constant CPU

Posted: Tue Mar 28, 2017 6:41 pm
by jllort
In OpenKM.cfg, set the following parameter ( and restart the service ):
update.info=off

Re: OpenKM taking up constant CPU

Posted: Fri Mar 31, 2017 3:14 pm
by openkm_user
Thanks. First of all, we tried setting update.info= false in the openkm.cfg file, but it did not work: updateinfo continued to run and consume CPU. Then we modified the config file to manually set it to false, and this worked. However, updateinfo() calls com.openkm.core.RepositoryInfo().run(). Internally, this run() method calls runAs(), which in turn calls methods such as okmStats.getFoldersByContext(). What is the purpose of these okmStats methods? Are they used anywhere?

Re: OpenKM taking up constant CPU

Posted: Sat Apr 01, 2017 4:42 pm
by jllort
I think it might be a crontab task. There is a crontab task named "Repository info" that runs daily and is used to calculate statistics. Try stopping it.