Re-indexing not starting

openkm_user
Expert Boarder
Posts: 130
Joined: Thu Dec 17, 2015 7:38 am

Re-indexing not starting

Post by openkm_user » Tue Sep 17, 2019 4:33 pm

Hello,

We have been using the re-indexing feature for a while now. We recently upgraded to OpenKM version 6.3.8 and tried to run a Lucene re-indexing, but every time it starts and then does not go through.
Untitled.png
Only the above is logged in the re-index log file, and then there is no further progress.
Production Configuration: OpenKM Community 6.3.8 | Windows Server 2016 | Intel (R) Xeon (R) CPU E5-2670 2.60GHz | RAM: 80GB | 16 Core | MySQL 5.7.14 | Virtualized

openkm_user
Expert Boarder
Posts: 130
Joined: Thu Dec 17, 2015 7:38 am

Re: Re-indexing not starting

Post by openkm_user » Thu Sep 19, 2019 10:25 am

Do we have to enable tracing in the logs? Are we missing something? Please help!

jllort
Moderator
Posts: 10868
Joined: Fri Dec 21, 2007 11:23 am
Location: Sineu - ( Illes Balears ) - Spain
Contact:

Re: Re-indexing not starting

Post by jllort » Sat Sep 21, 2019 4:03 pm

If you want to take a look at the source code ( the method luceneIndexesFlushToIndexes is executed from the RebuildIndexesServlet class ):
https://github.com/openkm/document-mana ... .java#L175

In this branch I have added extra detail to the rebuild indexes process ( compile and update with the code in this branch ):
https://github.com/openkm/document-mana ... /issue/200

Also, check in setenv.sh or setenv.bat whether you have the heap dump option enabled ( in case of a critical error the JVM creates a dump ):

Code: Select all

JAVA_OPTS="$JAVA_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$CATALINA_HOME"
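On Windows the same option would go in setenv.bat instead; a sketch, assuming the standard Tomcat batch-file variable syntax ( adjust the dump path to your installation ):

```shell
rem Sketch for setenv.bat: create a heap dump when the JVM runs out of memory
set "JAVA_OPTS=%JAVA_OPTS% -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=%CATALINA_HOME%"
```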
You can check the JVM parameters here:
https://www.oracle.com/technetwork/java ... 40102.html

The Glowroot tool might help you a lot in monitoring the JVM heap in real time:
https://glowroot.org/
It requires an extra parameter:

Code: Select all

JAVA_OPTS="$JAVA_OPTS -javaagent:$CATALINA_HOME/bin/glowroot.jar"
If you are on Linux, can you share your current setenv.sh? ( If you are on Windows, take some screenshots of the openkmw.exe service tool so we can see your current memory configuration. )
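For reference, a setenv.sh combining the options discussed above might look like the following; this is only a sketch, and the heap sizes are illustrative placeholders that you must adjust to your own server:

```shell
#!/bin/sh
# Illustrative setenv.sh sketch; -Xms/-Xmx values are placeholders, tune them to your hardware
JAVA_OPTS="$JAVA_OPTS -Xms2g -Xmx8g"
# Dump the heap on OutOfMemoryError so the crash can be analyzed afterwards
JAVA_OPTS="$JAVA_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$CATALINA_HOME"
# Optional: attach Glowroot for real-time JVM monitoring
JAVA_OPTS="$JAVA_OPTS -javaagent:$CATALINA_HOME/bin/glowroot.jar"
export JAVA_OPTS
```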

Are you using JDK 1.8?

openkm_user
Expert Boarder
Posts: 130
Joined: Thu Dec 17, 2015 7:38 am

Re: Re-indexing not starting

Post by openkm_user » Thu Sep 26, 2019 2:34 pm

Deleting the folders inside the index directory did start the re-indexing, but it stopped after a while, throwing the following exception:

Code: Select all

Exception in thread "Hibernate Search: indexwriter-6" Exception in thread "Hibernate Search: indexwriter-1" java.lang.IllegalArgumentException
	at java.nio.Buffer.position(Unknown Source)
	at java.nio.HeapByteBuffer.put(Unknown Source)
	at org.apache.tomcat.util.net.SocketWrapperBase.transfer(SocketWrapperBase.java:1044)
	at org.apache.tomcat.util.net.SocketWrapperBase.writeBlocking(SocketWrapperBase.java:448)
	at org.apache.tomcat.util.net.SocketWrapperBase.write(SocketWrapperBase.java:388)
	at org.apache.coyote.http11.Http11OutputBuffer$SocketOutputBuffer.doWrite(Http11OutputBuffer.java:644)
	at org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(ChunkedOutputFilter.java:123)
	at org.apache.coyote.http11.Http11OutputBuffer.doWrite(Http11OutputBuffer.java:235)
	at org.apache.coyote.Response.doWrite(Response.java:541)
	at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:351)
	at org.apache.catalina.connector.OutputBuffer.flushByteBuffer(OutputBuffer.java:815)
	at org.apache.catalina.connector.OutputBuffer.realWriteChars(OutputBuffer.java:456)
	at org.apache.catalina.connector.OutputBuffer.flushCharBuffer(OutputBuffer.java:820)
	at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:307)
	at org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:284)
	at org.apache.catalina.connector.CoyoteWriter.flush(CoyoteWriter.java:94)
	at org.springframework.security.web.context.OnCommittedResponseWrapper$SaveContextPrintWriter.flush(OnCommittedResponseWrapper.java:231)
	at com.openkm.servlet.admin.RebuildIndexesServlet$ProgressMonitor.documentsAdded(RebuildIndexesServlet.java:463)
	at org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate.logWorkDone(AddWorkDelegate.java:117)
	at org.hibernate.search.backend.impl.batchlucene.DirectoryProviderWorkspace.doWorkInSync(DirectoryProviderWorkspace.java:97)
	at org.hibernate.search.backend.impl.batchlucene.DirectoryProviderWorkspace$AsyncIndexRunnable.run(DirectoryProviderWorkspace.java:144)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
java.nio.BufferOverflowException
	at java.nio.CharBuffer.put(Unknown Source)
	at org.apache.catalina.connector.OutputBuffer.transfer(OutputBuffer.java:860)
	at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:521)
	at org.apache.catalina.connector.CoyoteWriter.write(CoyoteWriter.java:170)
	at org.apache.catalina.connector.CoyoteWriter.write(CoyoteWriter.java:180)
	at org.apache.catalina.connector.CoyoteWriter.print(CoyoteWriter.java:238)
	at org.springframework.security.web.context.OnCommittedResponseWrapper$SaveContextPrintWriter.print(OnCommittedResponseWrapper.java:317)
	at com.openkm.servlet.admin.RebuildIndexesServlet$ProgressMonitor.documentsAdded(RebuildIndexesServlet.java:457)
	at org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate.logWorkDone(AddWorkDelegate.java:117)
	at org.hibernate.search.backend.impl.batchlucene.DirectoryProviderWorkspace.doWorkInSync(DirectoryProviderWorkspace.java:97)
	at org.hibernate.search.backend.impl.batchlucene.DirectoryProviderWorkspace$AsyncIndexRunnable.run(DirectoryProviderWorkspace.java:144)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
What could be going wrong?

jllort
Moderator
Posts: 10868
Joined: Fri Dec 21, 2007 11:23 am
Location: Sineu - ( Illes Balears ) - Spain
Contact:

Re: Re-indexing not starting

Post by jllort » Fri Sep 27, 2019 12:01 pm

Did you download and compile the branch I indicated?
The java.nio.BufferOverflowException error does not look good.

Try the following steps:
1. Execute this query ( it will clear the text previously extracted into the database ):

Code: Select all

UPDATE OKM_NODE_DOCUMENT SET NDC_TEXT=''
2. Now try to rebuild the indexes.
3. Queue all the documents for content extraction again ( I suspect you have a big document that is killing the rebuild index process ). After the rebuild of the indexes has completed, execute:

Code: Select all

UPDATE OKM_NODE_DOCUMENT SET NDC_TEXT_EXTRACTED='F'
The queue of documents pending content indexing will fill up, and the analysis process will start from the beginning. Until the process has finished ( some hours, depending on the number and type of your documents ) you will not be able to find documents by content.
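To watch the progress of that extraction queue, a query along these lines should work; this assumes the same OKM_NODE_DOCUMENT table and NDC_TEXT_EXTRACTED column used in the queries above, with 'F' meaning pending and 'T' meaning done:

```sql
-- Sketch: count documents by extraction state ('F' = pending, per the UPDATE above)
SELECT NDC_TEXT_EXTRACTED, COUNT(*)
FROM OKM_NODE_DOCUMENT
GROUP BY NDC_TEXT_EXTRACTED;
```

Re-running it periodically shows the pending count shrinking as the extractors work through the queue.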
