XML enable indexing text in windows server

Posted: Sat Sep 21, 2013 10:33 am
by kknd
Hi!

My XML files are not being indexed. How do I enable XML indexing?

I noticed this line in the settings:

"org.apache.jackrabbit.extractor.XMLTextExtractor"

but it does not work. I am running OpenKM on Windows.

See:
Image

Part of the original XML file:
Code:
    <transp>
      <modFrete>0</modFrete>
      <transporta>
        <CNPJ>9917123400191</CNPJ>
        <xNome>Distribuidora.</xNome>
        <IE>171999999119</IE>
        <xEnder>rua do centro</xEnder>
        <xMun>SAO PAULO</xMun>
        <UF>SP</UF>
      </transporta>
      <veicTransp>
        <placa>BXI1717</placa>
        <UF>SP</UF>
        <RNTC>123123789</RNTC>
      </veicTransp>
      <reboque>
        <placa>BXI112318</placa>
        <UF>SP</UF>
        <RNTC>12123789</RNTC>
      </reboque>
      <vol>
        <qVol>10000</qVol>
        <esp>CASDA</esp>
        <marca>LIASYA</marca>
        <nVol>500</nVol>
        <pesoL>1000000000.000</pesoL>
        <pesoB>1200000000.000</pesoB>
        <lacres>
          <nLacre>XYZ23423486</nLacre>
        </lacres>
      </vol>
    </transp>
    <infAdic>
      <infAdFisco>de exemplo</infAdFisco>
    </infAdic>

Re: XML enable indexing text in windows server

Posted: Sun Sep 22, 2013 9:21 am
by jllort
Which version of OpenKM are you using?

Re: XML enable indexing text in windows server

Posted: Sun Sep 22, 2013 9:09 pm
by kknd
openkm-6.2.4-community-windows

=]

Re: XML enable indexing text in windows server

Posted: Tue Sep 24, 2013 6:13 am
by kknd
This is what I have in registered.text.extractors:
Code:
org.apache.jackrabbit.extractor.PlainTextExtractor
org.apache.jackrabbit.extractor.MsWordTextExtractor
org.apache.jackrabbit.extractor.MsExcelTextExtractor
org.apache.jackrabbit.extractor.MsPowerPointTextExtractor
org.apache.jackrabbit.extractor.OpenOfficeTextExtractor
org.apache.jackrabbit.extractor.RTFTextExtractor
org.apache.jackrabbit.extractor.HTMLTextExtractor
org.apache.jackrabbit.extractor.XMLTextExtractor
org.apache.jackrabbit.extractor.PngTextExtractor
org.apache.jackrabbit.extractor.MsOutlookTextExtractor
com.openkm.extractor.PdfTextExtractor
com.openkm.extractor.AudioTextExtractor
com.openkm.extractor.ExifTextExtractor
com.openkm.extractor.CuneiformTextExtractor
com.openkm.extractor.SourceCodeTextExtractor
com.openkm.extractor.MsOffice2007TextExtractor 

Re: XML enable indexing text in windows server

Posted: Fri Sep 27, 2013 3:53 am
by kknd
Please help me :/

This is the message in the log:

Image

Re: XML enable indexing text in windows server

Posted: Sat Sep 28, 2013 3:02 pm
by jllort
Can you test in our online demo whether the problem happens there too? If it does, tell me the file path. I would like to look at the contents to see whether there is some reason the parser does not like the file.

Re: XML enable indexing text in windows server

Posted: Sat Sep 28, 2013 10:54 pm
by kknd

Re: XML enable indexing text in windows server

Posted: Sun Sep 29, 2013 7:57 am
by jllort
I've tested with another XML file and it seems to work correctly; you can see it at http://demo.openkm.com/OpenKM/index.jsp ... 59da2612f3

I took a look at your XML and it seems to be a signed document. I suspect there is some error in the XML, so that when the parser walks the document and hits the error, the extractor throws an exception. Basically, the XML extractor goes through all the nodes and extracts only the values and attributes, not the XML tags. If it finds an error, it stops. I think that is what is happening.
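
As a rough illustration of that behaviour, here is a minimal Python sketch. It is not the actual Jackrabbit XMLTextExtractor code; the names TextCollector and extract_text are made up for this example. It collects only element text and attribute values, and returns nothing at all if the parser hits an error:

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Collects element text and attribute values, ignoring tag names."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._buffer = []

    def _flush(self):
        # Text can arrive in several chunks, so join before stripping.
        text = "".join(self._buffer).strip()
        if text:
            self.parts.append(text)
        self._buffer = []

    def startElement(self, name, attrs):
        self._flush()
        self.parts.extend(v for v in attrs.values() if v.strip())

    def endElement(self, name):
        self._flush()

    def characters(self, content):
        self._buffer.append(content)


def extract_text(xml_bytes):
    """Return the indexable text, or "" if the XML is not well formed."""
    handler = TextCollector()
    try:
        xml.sax.parseString(xml_bytes, handler)
    except xml.sax.SAXParseException:
        # One parse error is enough to lose the whole document.
        return ""
    return " ".join(handler.parts)
```

With a sketch like this, a single malformed spot anywhere in the file means no text is extracted, which would match a document that uploads fine but never shows up in search.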

Re: XML enable indexing text in windows server

Posted: Mon Sep 30, 2013 9:11 am
by pavila
I have found a couple of errors in the NFe_falhaSchema.xml document:

- It has several spaces before "<?xml version="1.0" encoding="utf-8"?>" and the XML parser does not like that.
- The closing "</NFe>" tag is missing, so the XML parser cannot parse the document to the end.
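
Both problems can be checked before uploading a file. The following is a minimal Python sketch using the standard library parser; the function name check_well_formed and the tiny sample documents are assumptions for illustration only, not the real NFe file:

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_bytes):
    """Return None if the document parses, else the parser's error message."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Hypothetical stand-ins for the real document.
ok = b'<?xml version="1.0" encoding="utf-8"?><NFe><infAdFisco>de exemplo</infAdFisco></NFe>'
leading_spaces = b"   " + ok                 # spaces before the XML declaration
missing_close = ok.replace(b"</NFe>", b"")   # closing tag removed

for label, doc in [("ok", ok),
                   ("leading spaces", leading_spaces),
                   ("missing </NFe>", missing_close)]:
    print(label, "->", check_well_formed(doc))
```

Stripping the leading whitespace is enough to fix the first problem; the missing closing tag has to be fixed wherever the file is generated.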