• CPU 100% load

  • Problems with installing OpenKM? No problemo, the solution is closer than you think.
 #41017  by stefbort
 
Hi all.
I installed OpenKM 6.3.1 and, after uploading around 200 GB (about 18,000 files), one CPU core stays at 100% load all the time. Here is a screenshot of the CPU/process status:
[Attachment: img1.png (101.27 KiB)]
I waited, and after 5 days nothing has changed. Why? What is wrong?

Some more details about my installation:
- I tried the installation 3 times and the problem is always the same
- when I install and upload for the first time, everything is OK. Only after a server reboot do I see a Java thread loading 1 core to 100%
- the files I will upload are a big mix: text, Word, OpenOffice, PDF, JPG, PHP, C, Java, etc. The total is around 3 TB and a few million files
- to upload the files I prepared some (big) ZIP files and uploaded them with "Import Documents from ZIP"
- the system is CentOS 7.1, Java "1.8.0_65" OpenJDK 64-Bit, OpenKM 6.3.1 build 8235 bundle, MariaDB backend, with the extra software Tesseract, pdf2swf, ImageMagick, LibreOffice and wkhtmltopdf
- another strange thing is that there are no statistics (I tried to run the process by hand, but nothing happened). Screenshot below:
[Attachment: img2.png (117.2 KiB)]
- a second strange behavior: when I upload a ZIP file bigger than about 600 MB, the browser never shows that the "Process file..." step has finished (but according to the Tomcat log it has finished, and I can see all the files and search each of the new ones)
- in the Tomcat log I can't see any specific error. I only see some extraction errors like the following examples:
...
2015-12-07 20:50:00,539 [Thread-1193] WARN  com.openkm.dao.NodeDocumentDAO- There was a problem extracting text from '/okm:root/Progetti/Sito fondazionedonorione.org/sito_ORG/core_clone_20110403_150000/images/stories/editoriali/jsn_addiocrocifisso/007_addio_crocifisso.jpg': /opt/openkm/temp/okm1160755861317877225.txt (No such file or directory)
..
2015-12-06 10:20:06,758 [Thread-2010] WARN  org.apache.jackrabbit.extractor.MsWordTextExtractor- Failed to extract Word text content
org.apache.poi.hwpf.OldWordFileFormatException: The document is too old - Word 95 or older. Try HWPFOldDocument instead?
...
Thank you in advance.
 #41020  by jllort
 
When you upload documents, they go into a pending indexing queue. Every 5 minutes (if you have not changed anything) documents are taken from the queue and processed to extract their content (text) for indexing. There can be several reasons why a process goes to 100% and does not stop (usually a complex file, such as an xls or similar, is blocking the queue for some reason). You should investigate whether a file in the queue never finishes (Administration -> Stats -> Pending stat queue).

I suggest increasing the Tomcat memory to at least 4 GB.
Before huge uploads it is also a good idea to check each MIME type to be sure everything is configured correctly. Errors in the text extractors can have several causes (an older OpenOffice installed, an incompatible Word file, etc.). Running a couple of tests before starting is good practice if you are not sure everything is well configured.
About importing big files from the desktop: our suggestion is to put all the files on the server and import them from there with the administration import tool. About the process not finishing with files of 600 MB or more: check whether any error appears in catalina.log (are you always working on an intranet, or is it another scenario?).
We suggest Oracle JDK rather than OpenJDK, basically because it is the version we use in development and production (that does not mean it will not run with OpenJDK, but it is better to use the same one we use).
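To see which extractors and file types produce most of the warnings before a big import, a small script can tally the log. This is only a rough sketch: it assumes every log line has the same shape as the two excerpts quoted in the first post (timestamp, thread, level, logger class, then the message), and the function names are made up for illustration; they are not part of OpenKM.

```python
import posixpath
import re
from collections import Counter

# Assumed line shape, taken from the excerpts quoted in this thread:
# 2015-12-07 20:50:00,539 [Thread-1193] WARN  com.openkm.dao.NodeDocumentDAO- message
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} "
    r"\[[^\]]+\] "
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>[\w.$]+)\s*- ?(?P<msg>.*)$"
)
# Message text as it appears in the NodeDocumentDAO warning quoted above.
PATH_RE = re.compile(r"extracting text from '(?P<path>.+?)':")


def tally_warnings(lines):
    """Count WARN and ERROR lines per logger class (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("level") in ("WARN", "ERROR"):
            counts[m.group("logger")] += 1
    return counts


def failed_extension_counts(lines):
    """Count file extensions of documents whose text extraction failed."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # stack-trace lines and anything else without a timestamp
        p = PATH_RE.search(m.group("msg"))
        if p:
            ext = posixpath.splitext(p.group("path"))[1].lower()
            counts[ext or "(none)"] += 1
    return counts


# Usage (the log path is an assumption, adjust it to your installation):
# with open("/opt/openkm/tomcat/logs/catalina.log", errors="replace") as f:
#     print(tally_warnings(f).most_common(10))
```

If one extension or one extractor class dominates the counts, that is probably the MIME type to test first.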
 #41021  by stefbort
 
Hi jllort.
I changed Java from OpenJDK to Oracle JDK.
About the pending queue: I have around 164892 (!) extractions in the queue! I attach 2 screenshots. This is probably my problem!
[Attachment: img3.png (245.29 KiB)]
[Attachment: img4.png (102.83 KiB)]
How can I speed up the extraction process?

About my network configuration: it is a LAN (PC - switch - server). As client I am using Ubuntu with Firefox and Chromium, but this PC is my test and development machine. I have another OpenKM installation at a different site with around 10,000 files totalling 1 GB. The client PCs there run Windows with Internet Explorer and Firefox, with 10 users, and I haven't met any problems.
Maybe the problem is with my Linux client.

Thank you again.
 #41031  by jllort
 
You have a lot of files pending to be processed in the queue (this is done by the text extractor task in the crontab view). You should watch whether the number of documents in the queue decreases over time.
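A quick way to judge this is to note the pending count twice, some time apart, and work out the rate. The sketch below does that arithmetic; the function name is made up for illustration, the readings are typed in by hand from the stats page, and the second reading and both timestamps in the usage comment are invented values, not measurements from this installation.

```python
from datetime import datetime, timedelta


def drain_estimate(t0, pending0, t1, pending1):
    """Return (documents_per_hour, eta) from two manual queue readings.

    eta is None when the queue is not shrinking between the readings.
    """
    hours = (t1 - t0).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("the second reading must be later than the first")
    rate = (pending0 - pending1) / hours
    if rate <= 0:
        return rate, None  # stalled or growing: no finish time can be estimated
    return rate, t1 + timedelta(hours=pending1 / rate)


# Usage with made-up readings (only 164892 comes from this thread):
# first = datetime(2015, 12, 8, 9, 0)
# second = datetime(2015, 12, 8, 10, 0)
# rate, eta = drain_estimate(first, 164892, second, 163892)
# print(rate, eta)
```

If the rate is zero or negative over several readings, a single document is probably blocking the queue, which matches the stuck-file case described earlier in the thread.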
 #41070  by jllort
 
When OpenKM processes PDF files (as an example) it needs a lot of CPU (usually 100%). Try going to Administration > Crontab and disabling the task named "text extractor worker". Wait about 10 minutes and then check the CPU usage.
