• Re-indexing not starting

 #48901  by openkm_user
 
Hello,

We have been using the re-indexing feature for a while now. Recently we upgraded to OpenKM version 6.3.8 and tried to run a Lucene re-indexing, but every time it starts and doesn't go through.
[Screenshot: Untitled.png — re-index log output]
Only the above is logged in the re-index log file, and then there is no further progress.
 #48913  by openkm_user
 
Do we have to enable tracing in the logs? Are we missing something? Please help!
 #48926  by jllort
 
If you want to take a look at the source code (the luceneIndexesFlushToIndexes method is executed from the RebuildIndexesServlet class):
https://github.com/openkm/document-mana ... .java#L175

In this branch I have added extra detail to the rebuild indexes process (compile and update with the code from this branch):
https://github.com/openkm/document-mana ... /issue/200
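
Regarding your question about tracing: as a sketch only (the exact configuration file and logger setup depend on your installation's log4j configuration), you could raise the log level for the rebuild servlet, whose class name appears in your stack traces:
Code: Select all
log4j.logger.com.openkm.servlet.admin.RebuildIndexesServlet=DEBUG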

Also, in setenv.sh or setenv.bat, check whether you have the heap dump option enabled (in case of a critical error the JVM creates a dump):
Code: Select all
JAVA_OPTS="$JAVA_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$CATALINA_HOME"
You can check here for JVM parameters:
https://www.oracle.com/technetwork/java ... 40102.html
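
For illustration only, a minimal memory configuration in setenv.sh could look like this (the heap sizes below are example values, not a recommendation; size them to your server's RAM):
Code: Select all
JAVA_OPTS="$JAVA_OPTS -Xms2g -Xmx4g"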

The Glowroot tool might help you a lot in monitoring the JVM heap in real time:
https://glowroot.org/
It requires an extra parameter:
Code: Select all
JAVA_OPTS="$JAVA_OPTS -javaagent:$CATALINA_HOME/bin/glowroot.jar"
If you are on Linux, can you share your current setenv.sh? (If you are running OpenKM as a Windows service, take some screenshots of the openkmw.exe tool so we can see your current memory configuration.)

Are you using JDK 1.8?
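You can confirm the JVM in use with:
Code: Select all
java -version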
 #48953  by openkm_user
 
Deleting the folders inside the index directory did start re-indexing, but it stopped after a while, throwing the following exception:
Code: Select all
Exception in thread "Hibernate Search: indexwriter-6" Exception in thread "Hibernate Search: indexwriter-1" java.lang.IllegalArgumentException
	at java.nio.Buffer.position(Unknown Source)
	at java.nio.HeapByteBuffer.put(Unknown Source)
	at org.apache.tomcat.util.net.SocketWrapperBase.transfer(SocketWrapperBase.java:1044)
	at org.apache.tomcat.util.net.SocketWrapperBase.writeBlocking(SocketWrapperBase.java:448)
	at org.apache.tomcat.util.net.SocketWrapperBase.write(SocketWrapperBase.java:388)
	at org.apache.coyote.http11.Http11OutputBuffer$SocketOutputBuffer.doWrite(Http11OutputBuffer.java:644)
	at org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(ChunkedOutputFilter.java:123)
	at org.apache.coyote.http11.Http11OutputBuffer.doWrite(Http11OutputBuffer.java:235)
	at org.apache.coyote.Response.doWrite(Response.java:541)
	at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:351)
	at org.apache.catalina.connector.OutputBuffer.flushByteBuffer(OutputBuffer.java:815)
	at org.apache.catalina.connector.OutputBuffer.realWriteChars(OutputBuffer.java:456)
	at org.apache.catalina.connector.OutputBuffer.flushCharBuffer(OutputBuffer.java:820)
	at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:307)
	at org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:284)
	at org.apache.catalina.connector.CoyoteWriter.flush(CoyoteWriter.java:94)
	at org.springframework.security.web.context.OnCommittedResponseWrapper$SaveContextPrintWriter.flush(OnCommittedResponseWrapper.java:231)
	at com.openkm.servlet.admin.RebuildIndexesServlet$ProgressMonitor.documentsAdded(RebuildIndexesServlet.java:463)
	at org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate.logWorkDone(AddWorkDelegate.java:117)
	at org.hibernate.search.backend.impl.batchlucene.DirectoryProviderWorkspace.doWorkInSync(DirectoryProviderWorkspace.java:97)
	at org.hibernate.search.backend.impl.batchlucene.DirectoryProviderWorkspace$AsyncIndexRunnable.run(DirectoryProviderWorkspace.java:144)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
java.nio.BufferOverflowException
	at java.nio.CharBuffer.put(Unknown Source)
	at org.apache.catalina.connector.OutputBuffer.transfer(OutputBuffer.java:860)
	at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:521)
	at org.apache.catalina.connector.CoyoteWriter.write(CoyoteWriter.java:170)
	at org.apache.catalina.connector.CoyoteWriter.write(CoyoteWriter.java:180)
	at org.apache.catalina.connector.CoyoteWriter.print(CoyoteWriter.java:238)
	at org.springframework.security.web.context.OnCommittedResponseWrapper$SaveContextPrintWriter.print(OnCommittedResponseWrapper.java:317)
	at com.openkm.servlet.admin.RebuildIndexesServlet$ProgressMonitor.documentsAdded(RebuildIndexesServlet.java:457)
	at org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate.logWorkDone(AddWorkDelegate.java:117)
	at org.hibernate.search.backend.impl.batchlucene.DirectoryProviderWorkspace.doWorkInSync(DirectoryProviderWorkspace.java:97)
	at org.hibernate.search.backend.impl.batchlucene.DirectoryProviderWorkspace$AsyncIndexRunnable.run(DirectoryProviderWorkspace.java:144)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
What could be going wrong?
 #48969  by jllort
 
Did you download and compile the branch I indicated?
The java.nio.BufferOverflowException error does not look good.

Try the following steps:
1- Execute this query (it will clean the previously extracted text from the database):
Code: Select all
UPDATE OKM_NODE_DOCUMENT SET NDC_TEXT=''
2- Now try to rebuild the indexes.
3- Set all the documents in the queue to have their content extracted again (I suspect you have a big document that is killing the rebuild index process). After the rebuild of the indexes is complete, execute:
Code: Select all
UPDATE OKM_NODE_DOCUMENT SET NDC_TEXT_EXTRACTED='F'
The queue of documents to be indexed by content will then be full, and the analysis process will start from the beginning. Until the process is finished (some hours, depending on the number and type of your documents) you will not be able to find documents by content.
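
If you want to follow the progress of that queue, a query along these lines should work (assuming the same OKM_NODE_DOCUMENT table as above; it simply counts the documents still waiting for extraction):
Code: Select all
SELECT COUNT(*) FROM OKM_NODE_DOCUMENT WHERE NDC_TEXT_EXTRACTED='F'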
