How does searching work?

Posted: Wed Sep 21, 2016 6:14 pm
by Bummibaer
Hi,
how can I see/monitor the state of the search?
So far I have imported a large batch of PDFs.
In catalina.log I see:
2016-09-18 20:00:00,579 [Thread-295] INFO com.openkm.extractor.TextExtractorWorker- processSerial.Working on {docUuid=244baacb-c423-4fe7-8175-5efa80f555cd, docPath=/okm:personal/steffen/Uploaded/cfoo.pdf, docVerUuid=f31d2702-c446-4ab4-bda0-caf738222e10, date=Sun Sep 18 19:06:16 CEST 2016}
I can preview this document.
I've searched for words that are contained in almost every one of the documents,
but the search didn't return any results!?
Why? How can I debug this?
I searched the log further and found two things.
First, I shut down the server before indexing had finished. Is this dangerous?
And if so, how can I, as a user, see the state of the server?
Second, I found this:
com.openkm.util.ExecutionUtils- Unable to read jar: C:\openkm-6.3.1-community\tomcat\stop.jar

regards,
a bloody newbie

Re: How does searching work?

Posted: Fri Sep 23, 2016 9:52 am
by jllort
It is not dangerous to shut down the application as long as you always do it with the correct command (which seems to be your case).

When documents are created, they go into the "text extractor pending" queue (you can check it at Administration > Stats). The queue is processed in the background (you can configure a more or less aggressive method); depending on how many documents you have uploaded, this can take minutes, hours or days.

In the database, the column NDC_TEXT_EXTRACTED (values T or F) in the table OKM_NODE_DOCUMENT indicates whether the document has already been processed or not.
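
A minimal sketch of how that column could be used to watch extraction progress. Only the table name OKM_NODE_DOCUMENT and the column NDC_TEXT_EXTRACTED with its T/F values come from the post above. The function name, the DOC_ID column and the in-memory SQLite database are assumptions made purely so the example runs on its own; against a real installation you would pass in a connection to whatever database your OpenKM instance actually uses.

```python
import sqlite3


def extraction_progress(conn):
    """Return (extracted, pending) document counts from OKM_NODE_DOCUMENT."""
    cur = conn.cursor()
    cur.execute(
        "SELECT NDC_TEXT_EXTRACTED, COUNT(*) "
        "FROM OKM_NODE_DOCUMENT GROUP BY NDC_TEXT_EXTRACTED"
    )
    counts = dict(cur.fetchall())
    # 'T' = text already extracted, 'F' = still waiting in the queue
    return counts.get("T", 0), counts.get("F", 0)


# Stand-in database (assumption): an in-memory SQLite table that only mimics
# the one column described above. DOC_ID is a made-up placeholder column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OKM_NODE_DOCUMENT (DOC_ID TEXT, NDC_TEXT_EXTRACTED TEXT)"
)
conn.executemany(
    "INSERT INTO OKM_NODE_DOCUMENT VALUES (?, ?)",
    [("doc-1", "T"), ("doc-2", "F"), ("doc-3", "F")],
)

extracted, pending = extraction_progress(conn)
print(f"extracted: {extracted}, pending: {pending}")
```

If the pending count stays above zero, the text extractor has not finished yet and those documents cannot be found by their content.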

Re: How does searching work?

Posted: Sun Oct 30, 2016 4:20 pm
by Bummibaer
Hello,

I've reinstalled everything and still don't get any results.
If I open the document in the preview, the search field inside it finds "foo". But in the Search tab nothing is displayed.
I also rebuilt the indexes from Admin (text extraction, Lucene index) with no errors reported.
"Check text extraction" does work.

Am I doing something wrong?

I know documentation is difficult, but is there a good tutorial for beginners?

regards,
Steffen

Re: How does searching work?

Posted: Mon Oct 31, 2016 10:57 pm
by jllort
Can you upload the document here so we can make some checks?

Re: How does searching work?

Posted: Wed Nov 02, 2016 7:07 pm
by Bummibaer
Rather not :) It's my financial data...
But here is my solr Example.pdf

I cannot attach a log file, so I'm inserting it here instead.

regards
Steffen
Code:
2016-11-02 19:53:38,892 [localhost-startStop-1] INFO  com.openkm.util.DocConverter- *** Build Office Manager ***
2016-11-02 19:53:38,892 [localhost-startStop-1] INFO  com.openkm.util.DocConverter- system.openoffice.path=C:\Program Files\LibreOffice 5
2016-11-02 19:53:38,892 [localhost-startStop-1] INFO  com.openkm.util.DocConverter- system.openoffice.tasks=200
2016-11-02 19:53:38,892 [localhost-startStop-1] INFO  com.openkm.util.DocConverter- system.openoffice.port=2002
2016-11-02 19:53:38,936 [localhost-startStop-1] INFO  org.artofsolving.jodconverter.office.ProcessPoolOfficeManager- ProcessManager implementation is SigarProcessManager
2016-11-02 19:53:39,008 [localhost-startStop-1] WARN  com.openkm.servlet.RepositoryStartupServlet- failed to start and connect
org.artofsolving.jodconverter.office.OfficeException: failed to start and connect
	at org.artofsolving.jodconverter.office.ManagedOfficeProcess.startAndWait(ManagedOfficeProcess.java:64)
	at org.artofsolving.jodconverter.office.PooledOfficeManager.start(PooledOfficeManager.java:101)
	at org.artofsolving.jodconverter.office.ProcessPoolOfficeManager.start(ProcessPoolOfficeManager.java:62)
	at com.openkm.util.DocConverter.start(DocConverter.java:190)
	at com.openkm.servlet.RepositoryStartupServlet.start(RepositoryStartupServlet.java:279)
	at com.openkm.servlet.RepositoryStartupServlet.init(RepositoryStartupServlet.java:127)
	at javax.servlet.GenericServlet.init(GenericServlet.java:158)
	at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1284)
	at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1197)
	at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1087)
	at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:5266)
	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5554)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
	at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
	at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
	at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:652)
	at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:1090)
	at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1900)
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
Caused by: java.util.concurrent.ExecutionException: java.lang.UnsatisfiedLinkError: org.hyperic.sigar.ptql.SigarProcessQuery.create(Ljava/lang/String;)V
	at java.util.concurrent.FutureTask.report(Unknown Source)
	at java.util.concurrent.FutureTask.get(Unknown Source)
	at org.artofsolving.jodconverter.office.ManagedOfficeProcess.startAndWait(ManagedOfficeProcess.java:62)
	... 22 more
Caused by: java.lang.UnsatisfiedLinkError: org.hyperic.sigar.ptql.SigarProcessQuery.create(Ljava/lang/String;)V
	at org.hyperic.sigar.ptql.SigarProcessQuery.create(Native Method)
	at org.hyperic.sigar.ptql.ProcessQueryFactory.getQuery(ProcessQueryFactory.java:66)
	at org.hyperic.sigar.ptql.ProcessFinder.find(ProcessFinder.java:68)
	at org.hyperic.sigar.ptql.ProcessFinder.find(ProcessFinder.java:56)
	at org.artofsolving.jodconverter.process.SigarProcessManager.findPid(SigarProcessManager.java:42)
	at org.artofsolving.jodconverter.office.OfficeProcess.start(OfficeProcess.java:65)
	at org.artofsolving.jodconverter.office.OfficeProcess.start(OfficeProcess.java:60)
	at org.artofsolving.jodconverter.office.ManagedOfficeProcess.doStartProcessAndConnect(ManagedOfficeProcess.java:119)
	at org.artofsolving.jodconverter.office.ManagedOfficeProcess.access$000(ManagedOfficeProcess.java:31)
	at org.artofsolving.jodconverter.office.ManagedOfficeProcess$1.run(ManagedOfficeProcess.java:58)
	... 5 more
2016-11-02 19:53:39,020 [localhost-startStop-1] INFO  com.openkm.extension.core.ExtensionManager- Initialize and load plugins...
2016-11-02 19:53:39,097 [localhost-startStop-1] INFO  com.openkm.servlet.RepositoryStartupServlet- *** Execute start script ***
2016-11-02 19:53:39,099 [localhost-startStop-1] WARN  com.openkm.util.ExecutionUtils- Unable to read script: C:\openkm-6.3.1-community\tomcat\start.bsh
2016-11-02 19:53:39,099 [localhost-startStop-1] WARN  com.openkm.util.ExecutionUtils- Unable to read jar: C:\openkm-6.3.1-community\tomcat\start.jar
2016-11-02 19:53:39,099 [localhost-startStop-1] INFO  com.openkm.servlet.RepositoryStartupServlet- *** Execute start SQL ***
2016-11-02 19:53:39,099 [localhost-startStop-1] WARN  com.openkm.servlet.RepositoryStartupServlet- Unable to read sql: C:\openkm-6.3.1-community\tomcat\start.sql
2016-11-02 19:53:39,121 [localhost-startStop-1] WARN  org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet- CMIS version is not defined! Setting it to CMIS 1.0.
2016-11-02 19:53:39,189 [localhost-startStop-1] INFO  org.apache.catalina.startup.HostConfig- Deployment of web application archive C:\openkm-6.3.1-community\tomcat\webapps\OpenKM.war has finished in 42,472 ms
2016-11-02 19:53:39,190 [localhost-startStop-1] INFO  org.apache.catalina.startup.HostConfig- Deploying web application directory C:\openkm-6.3.1-community\tomcat\webapps\ROOT
2016-11-02 19:53:40,424 [localhost-startStop-1] INFO  org.apache.catalina.startup.HostConfig- Deployment of web application directory C:\openkm-6.3.1-community\tomcat\webapps\ROOT has finished in 1,234 ms
2016-11-02 19:53:40,425 [main] INFO  org.apache.coyote.http11.Http11AprProtocol- Starting ProtocolHandler ["http-apr-0.0.0.0-8080"]
2016-11-02 19:53:40,435 [main] INFO  org.apache.coyote.ajp.AjpAprProtocol- Starting ProtocolHandler ["ajp-apr-127.0.0.1-8009"]
2016-11-02 19:53:40,436 [main] INFO  org.apache.catalina.startup.Catalina- Server startup in 43808 ms
2016-11-02 19:54:14,837 [http-apr-0.0.0.0-8080-exec-7] INFO  com.openkm.module.common.CommonAuthModule- PrincipalAdapter: com.openkm.principal.DatabasePrincipalAdapter
2016-11-02 19:54:14,960 [http-apr-0.0.0.0-8080-exec-6] INFO  org.dozer.config.GlobalSettings- Trying to find Dozer configuration file: dozer.properties
2016-11-02 19:54:14,985 [http-apr-0.0.0.0-8080-exec-6] WARN  org.dozer.config.GlobalSettings- Dozer configuration file not found: dozer.properties.  Using defaults for all Dozer global properties.
2016-11-02 19:54:14,988 [http-apr-0.0.0.0-8080-exec-6] INFO  org.dozer.DozerInitializer- Initializing Dozer. Version: 5.3.2, Thread Name: http-apr-0.0.0.0-8080-exec-6
2016-11-02 19:54:14,991 [http-apr-0.0.0.0-8080-exec-6] INFO  org.dozer.jmx.JMXPlatformImpl- Dozer JMX MBean [org.dozer.jmx:type=DozerStatisticsController] auto registered with the Platform MBean Server
2016-11-02 19:54:14,992 [http-apr-0.0.0.0-8080-exec-6] INFO  org.dozer.jmx.JMXPlatformImpl- Dozer JMX MBean [org.dozer.jmx:type=DozerAdminController] auto registered with the Platform MBean Server
2016-11-02 19:54:14,997 [http-apr-0.0.0.0-8080-exec-6] INFO  org.dozer.DozerBeanMapper- Initializing a new instance of dozer bean mapper.
2016-11-02 19:54:15,015 [http-apr-0.0.0.0-8080-exec-6] INFO  org.dozer.loader.CustomMappingsLoader- Using the following xml files to load custom mappings for the bean mapper instance: [dozerBeanMapping.xml]
2016-11-02 19:54:15,016 [http-apr-0.0.0.0-8080-exec-6] INFO  org.dozer.loader.CustomMappingsLoader- Trying to find xml mapping file: dozerBeanMapping.xml
2016-11-02 19:54:15,018 [http-apr-0.0.0.0-8080-exec-6] INFO  org.dozer.loader.CustomMappingsLoader- Using URL [file:/C:/openkm-6.3.1-community/tomcat/webapps/OpenKM/WEB-INF/classes/dozerBeanMapping.xml] to load custom xml mappings
2016-11-02 19:54:15,035 [http-apr-0.0.0.0-8080-exec-6] INFO  org.dozer.loader.CustomMappingsLoader- Successfully loaded custom xml mappings from URL: [file:/C:/openkm-6.3.1-community/tomcat/webapps/OpenKM/WEB-INF/classes/dozerBeanMapping.xml]
2016-11-02 19:54:15,138 [http-apr-0.0.0.0-8080-exec-8] INFO  com.openkm.vernum.VersionNumerationFactory- VersionNumerationAdapter: com.openkm.vernum.MajorMinorVersionNumerationAdapter
2016-11-02 19:55:26,899 [Hibernate Search: Directory writer-1] ERROR org.hibernate.search.exception.impl.LogErrorHandler- Exception occurred org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@C:\openkm-6.3.1-community\tomcat\repository\index\com.openkm.dao.bean.NodeBase\write.lock
Primary Failure:
	Entity com.openkm.dao.bean.NodeDocument  Id 5801e5e2-b115-4b9d-98ae-84c02a7f277e  Work Type  org.hibernate.search.backend.AddLuceneWork

org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@C:\openkm-6.3.1-community\tomcat\repository\index\com.openkm.dao.bean.NodeBase\write.lock
	at org.apache.lucene.store.Lock.obtain(Lock.java:84)
	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1097)
	at org.hibernate.search.backend.Workspace.createNewIndexWriter(Workspace.java:202)
	at org.hibernate.search.backend.Workspace.getIndexWriter(Workspace.java:180)
	at org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor.run(PerDPQueueProcessor.java:103)
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
2016-11-02 19:55:26,900 [Hibernate Search: Directory writer-1] ERROR org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor- Unexpected error in Lucene Backend: 
java.lang.NullPointerException
	at org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate.performWork(AddWorkDelegate.java:76)
	at org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor.run(PerDPQueueProcessor.java:106)
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
2016-11-02 19:55:26,901 [Hibernate Search: Directory writer-1] ERROR org.hibernate.search.exception.impl.LogErrorHandler- Exception occurred java.lang.NullPointerException
Primary Failure:
	Entity com.openkm.dao.bean.NodeDocument  Id 5801e5e2-b115-4b9d-98ae-84c02a7f277e  Work Type  org.hibernate.search.backend.AddLuceneWork

java.lang.NullPointerException
	at org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate.performWork(AddWorkDelegate.java:76)
	at org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor.run(PerDPQueueProcessor.java:106)
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
2016-11-02 19:55:26,933 [Hibernate Search: Directory writer-1] WARN  org.hibernate.search.backend.Workspace- going to force release of the IndexWriter lock
2016-11-02 19:55:46,304 [http-apr-0.0.0.0-8080-exec-6] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 19:56:44,811 [http-apr-0.0.0.0-8080-exec-9] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 19:56:45,449 [http-apr-0.0.0.0-8080-exec-7] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 19:56:45,926 [http-apr-0.0.0.0-8080-exec-6] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 19:56:46,557 [http-apr-0.0.0.0-8080-exec-4] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 19:56:47,108 [http-apr-0.0.0.0-8080-exec-5] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 19:56:47,677 [http-apr-0.0.0.0-8080-exec-8] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 19:56:48,204 [http-apr-0.0.0.0-8080-exec-3] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 19:56:48,924 [http-apr-0.0.0.0-8080-exec-2] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 19:56:59,080 [http-apr-0.0.0.0-8080-exec-10] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 19:58:40,965 [Update Info] INFO  com.openkm.core.UpdateInfo- *** UpdateInfo activated ***
2016-11-02 19:58:41,120 [Update Info] INFO  com.openkm.util.Update- checkVersion: 
2016-11-02 19:59:49,624 [http-apr-0.0.0.0-8080-exec-6] INFO  com.openkm.dao.SearchDAO- buildFilter: userPermission:steffen rolePermission:ROLE_USER
2016-11-02 20:00:00,028 [Thread-20] INFO  com.openkm.core.UserMailImporter- *** User mail importer activated ***
2016-11-02 20:00:00,035 [Thread-21] INFO  com.openkm.extractor.TextExtractorWorker- processSerial.Working on {docUuid=5801e5e2-b115-4b9d-98ae-84c02a7f277e, docPath=/okm:personal/steffen/Uploads/solr-word.pdf, docVerUuid=bc296645-79b2-488c-b98c-1b0192b53a0b, date=Wed Nov 02 19:55:25 CET 2016}
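
A log this long is easier to debug when the interesting entries are pulled out automatically. The following is a minimal sketch, assuming the line layout seen in the log above (timestamp, thread name in brackets, level, logger name followed by "- ", message). The function name and both regular expressions are illustrative and not part of OpenKM. It collects the documents picked up by TextExtractorWorker and every ERROR entry, such as the Lucene write.lock timeout.

```python
import re

# Matches the TextExtractorWorker entries and captures docUuid and docPath.
WORKER_RE = re.compile(
    r"TextExtractorWorker- processSerial\.Working on "
    r"\{docUuid=([0-9a-f-]+), docPath=([^,]+),"
)
# Matches "<date> <time> [<thread>] ERROR <logger>- <message>".
ERROR_RE = re.compile(r"^\S+ \S+ \[[^\]]+\] ERROR\s+(\S+?)- (.*)$")


def scan_log(lines):
    """Return (extracted_docs, errors) found in an iterable of log lines."""
    extracted_docs = []
    errors = []
    for line in lines:
        match = WORKER_RE.search(line)
        if match:
            extracted_docs.append((match.group(1), match.group(2)))
            continue
        match = ERROR_RE.match(line)
        if match:
            errors.append((match.group(1), match.group(2).strip()))
    return extracted_docs, errors


# Two lines copied from the log above, used as sample input.
sample = [
    r"2016-11-02 19:55:26,899 [Hibernate Search: Directory writer-1] ERROR "
    r"org.hibernate.search.exception.impl.LogErrorHandler- Exception occurred "
    r"org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: "
    r"SimpleFSLock@C:\openkm-6.3.1-community\tomcat\repository\index"
    r"\com.openkm.dao.bean.NodeBase\write.lock",
    "2016-11-02 20:00:00,035 [Thread-21] INFO  "
    "com.openkm.extractor.TextExtractorWorker- processSerial.Working on "
    "{docUuid=5801e5e2-b115-4b9d-98ae-84c02a7f277e, "
    "docPath=/okm:personal/steffen/Uploads/solr-word.pdf, "
    "docVerUuid=bc296645-79b2-488c-b98c-1b0192b53a0b, "
    "date=Wed Nov 02 19:55:25 CET 2016}",
]

docs, errors = scan_log(sample)
for uuid, path in docs:
    print("text extraction started:", uuid, path)
for logger, message in errors:
    print("error:", logger, "->", message)
```

To run it against a real file, pass an open file object instead of the sample list, for example `scan_log(open("catalina.log", encoding="utf-8", errors="replace"))`. The file name and encoding here are assumptions.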