
Error in application/msword to PDF conversion

Posted: Mon Nov 22, 2010 5:11 pm
by Erno
Debian Lenny here, OpenKM 5.0 RC1, with the following packages:
Code:
ii  java-common                         0.30                        Base of all Java packages
ii  sun-java6-bin                       6-20-0lenny1                Sun Java(TM) Runtime Environment (JRE) 6 (ar
ii  sun-java6-jdk                       6-20-0lenny1                Sun Java(TM) Development Kit (JDK) 6
ii  sun-java6-jre                       6-20-0lenny1                Sun Java(TM) Runtime Environment (JRE) 6 (ar
ii  openoffice.org-common               1:3.2.1-6~bpo50+1           office productivity suite -- arch-independen
ii  openoffice.org-core                 1:3.2.1-6~bpo50+1+b1        office productivity suite -- arch-dependent
ii  openoffice.org-style-galaxy         1:3.2.1-6~bpo50+1           office productivity suite -- Galaxy (Default
ii  swftools                            0.9.0-0ubuntu1              Collection of utilities for SWF file manipul
PDF and JPG previews work fine. However, when I try to preview a Word document, I get the message "Document URL not provided or invalid".

I can see the following exception in the logs:
Code:
17:55:02,383 ERROR [DocConverter] Error in application/msword to PDF conversion
17:55:02,384 ERROR [OKMDownloadServlet] Error in application/msword to PDF conversion
java.io.IOException: Error in application/msword to PDF conversion
        at com.openkm.util.DocConverter.doc2pdf(DocConverter.java:190)
        at com.openkm.frontend.server.OKMDownloadServlet.service(OKMDownloadServlet.java:145)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
        at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:182)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:432)
        at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Error convertind document: could not load document: okm1659961237065297293.doc
        at com.openkm.util.DocConverter.convert(DocConverter.java:166)
        at com.openkm.util.DocConverter.doc2pdf(DocConverter.java:185)
        ... 21 more
Other OpenOffice-related lines generated during startup:
Code:
17:53:03,823 ERROR [STDERR] Nov 22, 2010 5:53:03 PM org.artofsolving.jodconverter.office.ProcessPoolOfficeManager <init>
INFO: ProcessManager implementation is UnixProcessManager
17:53:03,898 ERROR [STDERR] Nov 22, 2010 5:53:03 PM org.artofsolving.jodconverter.office.OfficeProcess start
INFO: starting process with acceptString 'socket,host=127.0.0.1,port=2002,tcpNoDelay=1' and profileDir '/tmp/.jodconverter_socket_host-127.0.0.1_port-2002'
17:53:03,988 ERROR [STDERR] Nov 22, 2010 5:53:03 PM org.artofsolving.jodconverter.office.OfficeProcess start
INFO: started process; pid = 31156
17:53:04,895 ERROR [STDERR] Nov 22, 2010 5:53:04 PM org.artofsolving.jodconverter.office.OfficeConnection connect
INFO: connected: 'socket,host=127.0.0.1,port=2002,tcpNoDelay=1'
The complete log is available here:
http://tequila.sda.bme.hu/~rsc/openkm.log

I would appreciate any hints.

(Developers, also note the typo in the exception message "convertind" ;)

Re: Error in application/msword to PDF conversion

Posted: Mon Nov 22, 2010 5:21 pm
by Erno
OpenKM.cfg:
Code:
# Default configuration values
#
# repository.config=repository.xml
# repository.home=repository
# system.user=system
# default.user.role=UserRol
# default.admin.role=AdminRol
# principal.adapter=com.openkm.core.UsersRolesPrincipalAdapter
# max.file.size=5
# max.search.results=25
# system.demo=off
restrict.file.mime=off
restrict.file.extension=*~,*.bak,._*
system.ocr=/usr/bin/tesseract
system.openoffice=/usr/lib/openoffice
system.img2pdf=/usr/bin/convert
system.pdf2swf=/usr/bin/pdf2swf
system.antivir=/usr/local/clamav/bin/clamscan
#hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
hibernate.dialect=org.hibernate.dialect.HSQLDialect
hibernate.hbm2ddl=none
application.url=http://dms2.foobar.hu:8080/OpenKM/com.openkm.frontend.Main/index.jsp
max.file.size=060520010
#principal.adapter=com.openkm.principal.DatabasePrincipalAdapter
#notify.twitter.user=openkm
#notify.twitter.password=****
#subscription.message.body=Prueba.vm
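
A minimal sketch for sanity-checking the paths in a configuration like the one above, assuming OpenKM.cfg can be read as a plain Java properties file. The class name ConfigPathCheck and the helper missingPaths are hypothetical and not part of OpenKM; only the property keys are taken from the listing.

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class ConfigPathCheck {
    /**
     * Returns the keys (from the given list) that are set in props but whose
     * value does not name an existing file or directory. Unset keys are skipped.
     */
    static List<String> missingPaths(Properties props, String... keys) {
        List<String> missing = new ArrayList<String>();
        for (String key : keys) {
            String value = props.getProperty(key);
            if (value != null && !new File(value.trim()).exists()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the path to OpenKM.cfg is passed as the first argument.
        Properties props = new Properties();
        try (Reader reader = new FileReader(args[0])) {
            props.load(reader); // lines starting with '#' are treated as comments
        }
        System.out.println("Configured paths that do not exist: " + missingPaths(props,
                "system.ocr", "system.openoffice", "system.img2pdf",
                "system.pdf2swf", "system.antivir"));
    }
}
```

An empty list only means that each configured path exists on disk; it does not show that the program behind it actually works.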

Re: Error in application/msword to PDF conversion

Posted: Thu Nov 25, 2010 9:09 pm
by pavila
Your configuration seems to be OK. Can you reproduce this issue in the online demo? http://demo.openkm.com

PS: Typo fixed in trunk :P

Re: Error in application/msword to PDF conversion

Posted: Thu Nov 25, 2010 9:22 pm
by Erno
Unfortunately I cannot; the online demo works just fine with the same doc.

Re: Error in application/msword to PDF conversion

Posted: Sun Nov 28, 2010 11:45 pm
by Erno
Could you please add more specific details to the exception message, then, so that we can finally debug this issue?

Re: Error in application/msword to PDF conversion

Posted: Mon Nov 29, 2010 6:08 pm
by pavila
There are also DEBUG-level messages, but by default they are not shown in the console. You can see them in the server/default/log/server.log file.

Re: Error in application/msword to PDF conversion

Posted: Tue Nov 30, 2010 9:57 am
by Erno
http://tequila.sda.bme.hu/~rsc/server.debug.log

I can't see anything more specific around "could not load document" :(

Re: Error in application/msword to PDF conversion

Posted: Tue Nov 30, 2010 10:47 am
by jllort
Has a folder called cache been created in JBoss?
Does the document okm7814016977585986709.doc exist? Can you download it? I suspect you cannot. I think okm7814016977585986709.doc is a temporary file generated by OpenKM for the conversion. Can the user running OpenKM read and write in the tmp folder?
Have you made any changes to the source code?

Re: Error in application/msword to PDF conversion

Posted: Wed Dec 01, 2010 5:47 pm
by Erno
OpenKM/JBoss runs as root, so it has write permission to everything, including /tmp/.jodconverter_socket_host-127.0.0.1_port-2002/. Each partition is mounted with the default mount options. I did not change anything in the sources.
I have a cache folder, and it has some pdf/swf previews generated for the other filetypes.

Any other hints?

Re: Error in application/msword to PDF conversion

Posted: Wed Dec 01, 2010 6:56 pm
by pavila
You should check that the "soffice" program has been launched properly (ps -ef | grep soffice) and that the path to the pdf2swf executable is the right one. Does this error appear with every document? For example, try to preview .pdf, .doc, and .odt documents and check which ones work and which fail.
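
Besides looking at the process list, one could also check whether anything is accepting connections on the socket that the startup log mentions (127.0.0.1, port 2002). The following is only a sketch: the class OfficePortCheck and its method isListening are hypothetical names, not part of OpenKM or JODConverter.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class OfficePortCheck {
    /**
     * Returns true if a TCP connection to host:port can be opened within
     * timeoutMillis, and false if the connection is refused or times out.
     */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port are taken from the acceptString in the startup log.
        boolean up = isListening("127.0.0.1", 2002, 2000);
        System.out.println("Something is listening on 127.0.0.1:2002: " + up);
    }
}
```

A successful connection only shows that some process accepts TCP connections on that port; it does not prove that the process is a healthy soffice instance or that it can load documents.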

Re: Error in application/msword to PDF conversion

Posted: Wed Dec 01, 2010 7:14 pm
by jllort
The code involved is:
Code:
// Create a temporary file to hold the document for conversion
File tmp = File.createTempFile("okm", ".doc");
FileOutputStream fos = null;

try {
	long start = System.currentTimeMillis();
	fos = new FileOutputStream(tmp);
	// Copy the input stream (is) into the temporary file
	IOUtils.copy(is, fos);
Those are the lines where the error happens. For some reason, I think the temp file is not being created on your system. I can say what I think is happening but not why; there are several possibilities, and reaching the maximum number of open files could be one of them. But, as I said, for some reason the temp file is not being created in the temporary folder.
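
To test that theory outside OpenKM, a small standalone program can repeat the same steps: create a temp file with the same prefix and suffix, write to it, and read it back. This is only a sketch; the class TempFileCheck and its method roundTrip are hypothetical, and it has to be run as the same user and with the same JVM as JBoss for the result to mean anything.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;

public class TempFileCheck {
    /**
     * Creates a temp file with prefix "okm" and suffix ".doc", writes the
     * payload, reads it back and deletes the file. Returns true only if the
     * bytes read match the bytes written; returns false on any I/O error.
     */
    static boolean roundTrip(byte[] payload) {
        File tmp = null;
        try {
            tmp = File.createTempFile("okm", ".doc");
            try (FileOutputStream fos = new FileOutputStream(tmp)) {
                fos.write(payload);
            }
            byte[] readBack = Files.readAllBytes(tmp.toPath());
            return Arrays.equals(payload, readBack);
        } catch (IOException e) {
            System.err.println("Temp file round trip failed: " + e.getMessage());
            return false;
        } finally {
            if (tmp != null) {
                tmp.delete();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));
        byte[] payload = "test".getBytes(StandardCharsets.UTF_8);
        System.out.println("Temp file round trip OK: " + roundTrip(payload));
    }
}
```

If this prints true, creating and writing the temp file is probably not the problem, and the failure is more likely on the side that has to load the file.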

Re: Error in application/msword to PDF conversion

Posted: Wed Dec 01, 2010 7:52 pm
by Erno
# ps -ef|grep soff
root 24582 24579 0 20:50 pts/1 00:00:00 grep soff
root 31156 31082 0 Nov22 ? 00:00:07 /usr/lib/openoffice/program/soffice.bin -accept=socket,host=127.0.0.1,port=2002;urp; -env:UserInstallation=file:///tmp/.jodconverter_socket_host-127.0.0.1_port-2002 -headless -nocrashreport -nodefault -nofirststartwizard -nolockcheck -nologo -norestore

As I mentioned in the very first post, .pdf preview works fine, which means pdf2swf path is correct.

# mount |grep tmp
/dev/md5 on /tmp type xfs (rw)

Re: Error in application/msword to PDF conversion

Posted: Thu Dec 02, 2010 7:01 am
by jllort
No, no, no, no. Now I think I understand what's happening.

From version 5.x on, OpenKM starts OpenOffice automatically. You must not start it as a service, because OpenKM runs it for you. If the service is already running, OpenKM cannot start its own instance, and I think that is the problem. You can preview PDFs because no OpenOffice conversion is needed, only pdf2swf. But you cannot preview other documents, such as .doc or .xls, because in those cases we convert to PDF first, and that is not possible without OpenOffice.

Stop JBoss and the soffice service, then restart only JBoss, and tell me if that solves the problem.

Re: Error in application/msword to PDF conversion

Posted: Thu Dec 02, 2010 2:26 pm
by Erno
No, neither I nor Debian started an OpenOffice service; it was started by OpenKM.
Code:
# /etc/init.d/openkm stop
(few minutes later)

# ps aux|grep java
root     32027  0.0  0.0   5164   780 pts/1    R+   15:24   0:00 grep java
# ps aux|grep soff
root     32030  0.0  0.0   5164   784 pts/1    S+   15:24   0:00 grep soff

# /etc/init.d/openkm start
(few minutes later)
# ps aux|grep java
root     32058  106 14.0 1547848 561148 pts/1  Sl   15:24   0:49 java -Dprogram.name=run.sh -server -Xms256m -Xmx1024m -XX:PermSize=64m -XX:MaxPermSize=128m -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -Djava.endorsed.dirs=/web/web/host/foobar.hu/openkm50rc1/lib/endorsed -classpath /web/web/host/foobar.hu/openkm50rc1/bin/run.jar org.jboss.Main -b 0.0.0.0
# ps aux|grep soff
root     32167 18.0  1.0 277392 43116 pts/1    Sl   15:25   0:00 /usr/lib/openoffice/program/soffice.bin -accept=socket,host=127.0.0.1,port=2002;urp; -env:UserInstallation=file:///tmp/.jodconverter_socket_host-127.0.0.1_port-2002 -headless -nocrashreport -nodefault -nofirststartwizard -nolockcheck -nologo -norestore
The logs also show that soffice was started by OpenKM.

(EDIT: stripped hostname)

Re: Error in application/msword to PDF conversion

Posted: Thu Dec 02, 2010 11:18 pm
by pavila
So the problem is the OpenOffice conversion? I mean, can you convert a .doc or .odt document to PDF?