					
				PDF text extraction
				Posted: Tue Jan 14, 2020 3:08 pm
				by gcosta
				Good afternoon. We are using Community version 6.3.8 and have found that it does not extract text from PDF files.
If we run the text extractor test, we get the following error:
Code:
org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectForm cannot be cast to org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage
 
What are we missing in the configuration?
Thanks.
 
			 
			
					
				Re: PDF text extraction
				Posted: Sat Jan 18, 2020 10:47 am
				by jllort
				Can you share a PDF file that fails so we can run a test on our side?
And, if possible, the full stack trace of the error (the catalina.log file).
			 
			
					
				Re: PDF text extraction
				Posted: Thu Jan 23, 2020 4:00 pm
				by gcosta
				Good afternoon, and thanks for the reply. Below is the log output produced when running the text extractor test.
As for the file, if you have a private place where I can upload it, please let me know.
Thanks.
Code:
StdErr: 
2020-01-23 16:54:28,866 [http-nio-0.0.0.0-8020-exec-8] [] WARN  com.openkm.util.ReportUtils - Report '7' has no params.xml file
2020-01-23 16:54:57,812 [http-nio-0.0.0.0-8020-exec-2] [] WARN  c.openkm.extractor.PdfTextExtractor - PDF does not contains text layer
2020-01-23 16:54:57,814 [http-nio-0.0.0.0-8020-exec-2] [] WARN  c.openkm.extractor.PdfTextExtractor - Failed to extract PDF text content
java.lang.ClassCastException: org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectForm cannot be cast to org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage
	at com.openkm.extractor.PdfTextExtractor.extractText(PdfTextExtractor.java:145) ~[classes/:6.3.8]
	at com.openkm.extractor.RegisteredExtractors.getText(RegisteredExtractors.java:164) [classes/:6.3.8]
	at com.openkm.servlet.admin.CheckTextExtractionServlet.doPost(CheckTextExtractionServlet.java:133) [classes/:6.3.8]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:661) [servlet-api.jar:na]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:742) [servlet-api.jar:na]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) [catalina.jar:8.5.24]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [catalina.jar:8.5.24]
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) [tomcat-websocket.jar:8.5.24]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [catalina.jar:8.5.24]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [catalina.jar:8.5.24]
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:118) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:84) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:113) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:103) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:113) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:154) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:45) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:199) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:50) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:106) [spring-web-3.2.18.RELEASE.jar:3.2.18.RELEASE]
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:87) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:192) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:160) [spring-security-web-3.2.10.RELEASE.jar:na]
	at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:343) [spring-web-3.2.18.RELEASE.jar:3.2.18.RELEASE]
	at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:260) [spring-web-3.2.18.RELEASE.jar:3.2.18.RELEASE]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [catalina.jar:8.5.24]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [catalina.jar:8.5.24]
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198) [catalina.jar:8.5.24]
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96) [catalina.jar:8.5.24]
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:504) [catalina.jar:8.5.24]
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140) [catalina.jar:8.5.24]
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81) [catalina.jar:8.5.24]
	at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:650) [catalina.jar:8.5.24]
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87) [catalina.jar:8.5.24]
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342) [catalina.jar:8.5.24]
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:803) [tomcat-coyote.jar:8.5.24]
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66) [tomcat-coyote.jar:8.5.24]
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:790) [tomcat-coyote.jar:8.5.24]
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1459) [tomcat-coyote.jar:8.5.24]
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) [tomcat-coyote.jar:8.5.24]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_71]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_71]
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-util.jar:8.5.24]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_71]
2020-01-23 16:55:00,060 [Thread-11759] [] WARN  com.openkm.core.Cron - Crontab task mail address is empty: Return: null
<hr/>
StdOut: 
<hr/>
StdErr: 
2020-01-23 16:55:00,079 [Thread-11760] [] WARN  com.openkm.core.Cron - Crontab task mail address is empty: Return: null
<hr/>
StdOut: 
<hr/>
StdErr: 
2020-01-23 16:55:03,083 [Thread-11758] [] WARN  com.openkm.core.Cron - Crontab task mail address is empty: Return: null
<hr/>
StdOut: 
<hr/>
StdErr: 
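The trace shows the cast failing inside PdfTextExtractor.extractText right after the "PDF does not contains text layer" warning. That suggests (this is an assumption, since the OpenKM source is not shown in this thread) that the extractor walks the page's XObjects looking for images and casts each one without checking its type. A PDF page can hold form XObjects as well as image XObjects, so an unchecked cast fails on the first form it meets. The sketch below illustrates the guarded pattern in plain Java. The classes XObject, FormXObject and ImageXObject and the method imagesOnly are hypothetical stand-ins, not the real PDFBox or OpenKM classes:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for an XObject hierarchy; NOT the real PDFBox classes.
abstract class XObject {}

class FormXObject extends XObject {}

class ImageXObject extends XObject {
    final String name;

    ImageXObject(String name) {
        this.name = name;
    }
}

class XObjectCastSketch {
    // Collects only the image XObjects, skipping forms instead of casting blindly.
    static List<ImageXObject> imagesOnly(Map<String, XObject> xobjects) {
        List<ImageXObject> images = new ArrayList<>();
        for (XObject xo : xobjects.values()) {
            // An unchecked (ImageXObject) cast here would throw
            // ClassCastException as soon as a FormXObject is encountered.
            if (xo instanceof ImageXObject) {
                images.add((ImageXObject) xo);
            }
        }
        return images;
    }

    public static void main(String[] args) {
        Map<String, XObject> resources = new LinkedHashMap<>();
        resources.put("Fm0", new FormXObject());
        resources.put("Im0", new ImageXObject("Im0"));
        System.out.println("images found: " + imagesOnly(resources).size());
    }
}
```

If the real extractor follows this shape, a type check like the one above would let it skip form XObjects rather than abort the whole extraction. Whether that matches the actual fix in later OpenKM releases is not stated in this thread.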
  
			 
			
					
				Re: PDF text extraction
				Posted: Sat Jan 25, 2020 11:10 am
				by jllort
				Contact us through the contact form, including the URL of this forum thread (but without the leading http:, or the form will not let you submit the request), and we will get in touch with you.
https://www.openkm.com/es/contacto.html 
			 
			
					
				Re: PDF text extraction
				Posted: Mon Jan 27, 2020 9:37 am
				by gcosta
				OK, sent.
Thanks.
			 
			
					
				Re: PDF text extraction
				Posted: Thu Jan 30, 2020 6:53 pm
				by jllort
				If what we sent you directly by email does not get it working, tell me which operating system you are using.
			 
			
					
				Re: PDF text extraction
				Posted: Fri Jan 31, 2020 3:13 pm
				by jllort
				I suggest upgrading to the latest version (which will be released next week). Also, which JDK version are you using?
			 
			
					
				Re: PDF text extraction
				Posted: Fri Jan 31, 2020 3:55 pm
				by gcosta
				OK, I will upgrade next week.
As for the Java version, it is 1.8.0_71. It was updated not long ago.
Thanks.
			 
			
					
				Re: PDF text extraction
				Posted: Sat Feb 01, 2020 10:25 am
				by jllort
				Well, that version is ancient; it must be more than 1-2 years old for sure. I recommend installing OpenJDK (on Linux). After Oracle's change to the JDK licensing we moved to OpenJDK in all our environments (in fact, anticipating that change, we started migrating more than a year ago).
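
To confirm which Java build and vendor a JVM is actually running, the standard system properties java.version and java.vendor can be read at runtime. The sketch below is a minimal illustration, not part of OpenKM: the class JavaVersionSketch and its helper methods are hypothetical names. It also shows how a legacy version string such as 1.8.0_71 breaks down into a feature release and an update number.

```java
class JavaVersionSketch {
    // Returns the feature release: 8 for "1.8.0_71", 11 for "11.0.2".
    static int featureVersion(String version) {
        String[] parts = version.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        // Legacy scheme (up to Java 8) starts with "1."; the second field is the release.
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    // Returns the update number of a legacy "1.x.0_NN" string, or -1 if there is none.
    static int legacyUpdate(String version) {
        int idx = version.indexOf('_');
        if (idx < 0) {
            return -1;
        }
        String tail = version.substring(idx + 1);
        int end = 0;
        while (end < tail.length() && Character.isDigit(tail.charAt(end))) {
            end++;
        }
        return end == 0 ? -1 : Integer.parseInt(tail.substring(0, end));
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("java.version = " + version);
        System.out.println("java.vendor  = " + System.getProperty("java.vendor"));
        System.out.println("feature release = " + featureVersion(version));
        System.out.println("legacy update   = " + legacyUpdate(version));
    }
}
```

Run standalone, this reports the JVM found on the command-line path, which may differ from the one Tomcat uses; the values that matter are those of the JVM that runs OpenKM.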