Backing up the repository directory
Posted: Wed Jun 23, 2010 1:00 am
by Flogeras
Hello
I use GNU tar to back up the repository directory. I have been trying to use the --listed-incremental option with weekly snapshots (we have fairly low traffic on our OpenKM install), but something seems to be changing the mtime and ctime of the files each night at around midnight. I can understand that the system may index periodically, but why is it changing the mtime/ctime? This makes incremental backups impossible for me. Is there something in my configuration that might be wrong?
Thanks,
Dave
Re: Backing up the repository directory
Posted: Wed Jun 23, 2010 9:20 am
by jllort
If the Lucene index files are a problem for you, simply do not include them in the backup. After a restore, the Lucene index is recreated automatically. If you decide to do this, though, please try restoring this type of backup to make sure everything works.
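A minimal sketch of the exclusion suggested above, using Python's standard `tarfile` module instead of GNU tar. It assumes the Lucene index lives in directories named `index`, which should be checked against the `SearchIndex` path in your own repository.xml. The function names and paths are illustrative, not part of OpenKM.

```python
import os
import tarfile


def skip_index_dirs(tarinfo):
    """Drop any entry that has a path component named 'index'.

    Assumption: the Lucene index is stored in directories called 'index'.
    Note that a regular file named 'index' would be skipped as well.
    """
    parts = tarinfo.name.split("/")
    return None if "index" in parts else tarinfo


def backup_without_index(repo_dir, archive_path):
    """Write a gzip-compressed tar of repo_dir, leaving out 'index' directories."""
    top_level = os.path.basename(os.path.normpath(repo_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        # Returning None from the filter excludes the entry; for a
        # directory, its contents are not visited either.
        tar.add(repo_dir, arcname=top_level, filter=skip_index_dirs)
```

As the reply says, a backup made this way is only useful if a test restore confirms that the index is rebuilt correctly.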
Re: Backing up the repository directory
Posted: Wed Jun 23, 2010 11:05 am
by Flogeras
I guess it is more a question of why it modifies the files when it indexes them. Should I contact the Lucene people to investigate whether this is a bug?
Re: Backing up the repository directory
Posted: Mon Jun 28, 2010 7:44 am
by pavila
Which files are modified? It also depends on your repository.xml configuration, so please post it.
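One way to answer the question of which files are modified is to snapshot the tree before and after midnight and compare the results. The sketch below records mtime, ctime and a SHA-256 digest per file, then lists files whose timestamps moved while their content stayed the same. The function names are hypothetical, and this is a diagnostic aid rather than anything shipped with OpenKM.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map each file's path relative to root to (mtime, ctime, digest)."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            rel = os.path.relpath(path, root)
            result[rel] = (st.st_mtime, st.st_ctime, file_digest(path))
    return result


def touched_but_unchanged(before, after):
    """List paths whose mtime or ctime changed while the content did not."""
    touched = []
    for path, (mtime, ctime, digest) in after.items():
        if path not in before:
            continue  # new file, not a timestamp-only change
        old_mtime, old_ctime, old_digest = before[path]
        if digest == old_digest and (mtime != old_mtime or ctime != old_ctime):
            touched.append(path)
    return sorted(touched)
```

The first snapshot has to be saved to disk between the two runs. If the resulting list covers the whole datastore, the files are being touched rather than rewritten.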
Re: Backing up the repository directory
Posted: Mon Jun 28, 2010 12:04 pm
by Flogeras
It seems that each and every file within the repository/repository/datastore directory is modified at (or just after) midnight every day.
Here is my repository.xml (from OpenKM 4.1). It should be pretty standard, with the exception of using Postgres for the metadata.
Code:
<?xml version="1.0"?>
<!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 1.4//EN"
    "http://jackrabbit.apache.org/dtd/repository-1.4.dtd">
<Repository>
  <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${rep.home}/repository"/>
  </FileSystem>
  <Security appName="OpenKM">
    <AccessManager class="es.git.openkm.core.OKMAccessManager"/>
    <!-- <AccessManager class="org.apache.jackrabbit.core.security.SimpleAccessManager"/> -->
  </Security>
  <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default" />
  <Workspace name="${wsp.name}">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${wsp.home}"/>
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.PostgreSQLPersistenceManager">
      <param name="driver" value="org.postgresql.Driver"/>
      <param name="url" value="jdbc:postgresql://localhost:5432/openkm?autoReconnect=true"/>
      <param name="schema" value="postgresql"/>
      <param name="user" value="openkm"/>
      <param name="password" value=""/>
      <param name="schemaObjectPrefix" value="${wsp.name}_"/>
      <param name="externalBLOBs" value="false"/>
    </PersistenceManager>
    <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
      <param name="path" value="${wsp.home}/index"/>
      <param name="resultFetchSize" value="100"/>
      <param name="useCompoundFile" value="true"/>
      <param name="minMergeDocs" value="100"/>
      <param name="volatileIdleTime" value="3"/>
      <param name="maxMergeDocs" value="100000"/>
      <param name="mergeFactor" value="10"/>
      <param name="maxFieldLength" value="50000"/>
      <!-- Disable extractor pool -->
      <!-- <param name="extractorPoolSize" value="0"/> -->
      <param name="extractorTimeout" value="1000"/>
      <param name="bufferSize" value="10"/>
      <param name="cacheSize" value="1000"/>
      <param name="forceConsistencyCheck" value="false"/>
      <!-- <param name="consistencyCheck" value="true"/> -->
      <!-- <param name="consistencyFix" value="true"/> -->
      <param name="autoRepair" value="true"/>
      <!-- <param name="analyzer" value="es.git.openkm.analysis.SpanishAnalyzer"/> -->
      <param name="respectDocumentOrder" value="false"/>
      <param name="indexingConfiguration" value="${wsp.home}/../../../indexing_configuration.xml"/>
      <param name="textFilterClasses" value="
        org.apache.jackrabbit.extractor.PlainTextExtractor,
        org.apache.jackrabbit.extractor.PdfTextExtractor,
        org.apache.jackrabbit.extractor.HTMLTextExtractor,
        org.apache.jackrabbit.extractor.XMLTextExtractor,
        org.apache.jackrabbit.extractor.RTFTextExtractor,
        org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,
        es.git.openkm.extractor.MsExcelTextExtractor,
        es.git.openkm.extractor.MsPowerPointTextExtractor,
        es.git.openkm.extractor.MsWordTextExtractor,
        es.git.openkm.extractor.MsOffice2007TextExtractor,
        es.git.openkm.extractor.ExifTextExtractor,
        es.git.openkm.extractor.TiffTextExtractor,
        es.git.openkm.extractor.AudioTextExtractor" />
    </SearchIndex>
  </Workspace>
  <Versioning rootPath="${rep.home}/version">
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${rep.home}/version"/>
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.PostgreSQLPersistenceManager">
      <param name="driver" value="org.postgresql.Driver"/>
      <param name="url" value="jdbc:postgresql://localhost:5432/openkm?autoReconnect=true"/>
      <param name="schema" value="postgresql"/>
      <param name="user" value="openkm"/>
      <param name="password" value=""/>
      <param name="schemaObjectPrefix" value="version_"/>
      <param name="externalBLOBs" value="false"/>
    </PersistenceManager>
  </Versioning>
  <!-- Also see DatabaseDataStore-->
  <DataStore class="org.apache.jackrabbit.core.data.FileDataStore"/>
</Repository>
Re: Backing up the repository directory
Posted: Sat May 07, 2011 5:34 pm
by snowman
Hello,
Is there any update on this issue? I observe the same behavior. Even when I have no external traffic for several days, all my repository files have an mtime of midnight. That rules out incremental backups.
Best regards,
Snowman
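While the cause of the timestamp changes remains open, one possible workaround is to decide what to back up from file content instead of mtime/ctime. The sketch below is not GNU tar's --listed-incremental mechanism. It keeps a JSON manifest of SHA-256 digests and copies only files whose digest differs from the previous run. The function names and the manifest layout are assumptions made for illustration.

```python
import hashlib
import json
import os
import shutil


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def incremental_copy(src_root, dest_root, manifest_path):
    """Copy files whose content changed since the last run; return their paths.

    The manifest maps relative paths to digests and should be kept
    outside src_root so that it is not backed up itself.
    """
    try:
        with open(manifest_path) as fh:
            manifest = json.load(fh)
    except FileNotFoundError:
        manifest = {}  # first run: everything counts as changed

    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            digest = sha256_of(src)
            if manifest.get(rel) == digest:
                continue  # same content, whatever the timestamps say
            dest = os.path.join(dest_root, rel)
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.copy2(src, dest)
            manifest[rel] = digest
            copied.append(rel)

    with open(manifest_path, "w") as fh:
        json.dump(manifest, fh, indent=2)
    return sorted(copied)
```

The trade-off is that every file is read on every run, so this saves backup space but not I/O. Deleted files are not tracked in this sketch.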
Re: Backing up the repository directory
Posted: Sun May 08, 2011 9:32 am
by pavila
Please try a more recent OpenKM version, for example OpenKM 5.0.4 or OpenKM 5.1.3, because 4.1 is only supported for our customers.