• mtime spoils incremental backup of repository

 #12407  by snowman
 
Hello,

I am referring to a thread from 2010 which has died.
I automatically back up the repository and database (OpenKM 5.1.7). I tried to implement incremental backups of the repository, but the full repository is always backed up.
I found out that every file in the repository gets an updated atime and mtime. I do not understand what modifies mtime, or why, since all files in the repo are PDFs.

Does anybody know what touches mtime in the repo every night, and maybe why?
More importantly, how can I turn it off?

Best regards,
Snowman
 #12429  by pavila
 
This needs to be studied in depth because it is related to Jackrabbit, which is the repository OpenKM uses to store the documents. In any case, we perform incremental backups with rdiff-backup and rsync under Linux with no problems.
 #12473  by snowman
 
How can I help investigate? I have no idea how Jackrabbit works or how I could ask about this in their forum.

The service is running but has not been touched for several days now. No user interaction, no import.
I have 495 files in the datastore, counted with "find datastore -type f | wc -w", and all 495 have been modified less than 24 hours ago: "find datastore -type f -mtime 0 | wc -w".
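
For anyone who wants to repeat this check without find, here is a minimal Python sketch of the same count. The function name and the demo files are made up for illustration; the real datastore path depends on the installation. It treats a file as recent when its mtime is less than 24 hours old, which is what "-mtime 0" selects.

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 24 * 60 * 60


def count_recently_modified(root, max_age_seconds=SECONDS_PER_DAY, now=None):
    """Return (total_files, recently_modified) for regular files under root."""
    now = time.time() if now is None else now
    total = 0
    recent = 0
    for path in Path(root).rglob("*"):
        # Note: is_file() follows symlinks, unlike "find -type f".
        if not path.is_file():
            continue
        total += 1
        # st_mtime is the "Modify" timestamp printed by stat.
        if now - path.stat().st_mtime < max_age_seconds:
            recent += 1
    return total, recent


if __name__ == "__main__":
    # Demo on a throwaway directory; pass the real datastore path instead.
    with tempfile.TemporaryDirectory() as tmp:
        old_file = Path(tmp, "old.bin")
        new_file = Path(tmp, "new.bin")
        old_file.write_bytes(b"old")
        new_file.write_bytes(b"new")
        # Backdate one file so that only the other counts as recent.
        three_days_ago = time.time() - 3 * SECONDS_PER_DAY
        os.utime(old_file, (three_days_ago, three_days_ago))
        print(count_recently_modified(tmp))  # prints (2, 1)
```

If both numbers are equal on an idle system, as in my case, something is rewriting mtime on every file.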

When I run "stat" on an example file, it returns:
Code:
  File: `03/fa/b4/03fab4f1ad8268d2921b743b8b90562dfb4908a9'
  Size: 69761           Blocks: 144        IO Block: 4096   regular file
Device: fd11h/64785d    Inode: 399189      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2011-10-05 01:05:31.419387817 +0200
Modify: 2011-10-05 00:00:07.030000000 +0200
Change: 2011-10-05 00:00:05.028873604 +0200
 Birth: -
The access time was changed by my backup system, which is Bacula.
The file is originally a PDF.
I have not knowingly changed the default configuration of the repo.
 #12475  by snowman
 
Today I did another stat on the same file:
Code:
  File: `03/fa/b4/03fab4f1ad8268d2921b743b8b90562dfb4908a9'
  Size: 69761           Blocks: 144        IO Block: 4096   regular file
Device: fd11h/64785d    Inode: 399189      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2011-10-06 01:29:40.506123084 +0200
Modify: 2011-10-06 00:00:08.507000000 +0200
Change: 2011-10-06 00:00:06.505535372 +0200
 Birth: -
The access time comes from my backup. No other interaction took place.
I also tried rsync. It behaves as one would expect given the changed mtime: every file is synced again the next day.

Can anyone tell me where the Jackrabbit configuration files are?
 #12550  by pavila
 
This seems to be a side effect of the DataGarbageCollector daemon, which removes orphan files from the DataStore. I will try to find the reason. I have created this issue: http://issues.openkm.com/view.php?id=1831

With rsync you can bypass the mod-time check using the --checksum option. Also take a look at the --size-only option.

More info on these options can be found in "Rsync difference between --checksum and --ignore-times options".
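
To illustrate the idea behind comparing by content rather than by mtime, here is a minimal Python sketch. It is not how rsync is implemented: rsync uses its own checksum algorithm and transfer protocol, and SHA-256 is an arbitrary choice here. The function names and file names are hypothetical.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file's content in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_transfer(source, destination):
    """Decide by size and content, never by mtime, whether source must be copied."""
    if not os.path.exists(destination):
        return True
    # A cheap size comparison first; hashing is only needed when sizes match.
    if os.path.getsize(source) != os.path.getsize(destination):
        return True
    return file_digest(source) != file_digest(destination)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "src.pdf")
        dst = Path(tmp, "dst.pdf")
        src.write_bytes(b"%PDF-1.4 same")
        dst.write_bytes(b"%PDF-1.4 same")
        # Give the copy a very different mtime; the content is still identical.
        os.utime(dst, (0, 0))
        print(needs_transfer(src, dst))  # prints False
```

The trade-off is that every file has to be read in full on each run, so a content comparison costs much more I/O than a timestamp check, while a size-only comparison is cheap but cannot detect a change that keeps the file length the same.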
