• OpenKM/Catalina not stopping

 #45644  by dferguson
 
I am having a recurring issue where OpenKM isn't stopping as it should during backups, and this morning I found the site was down. A simple service tomcat stop/start would not reset the instance. I had to "kill -9 PID" the Java processes holding the ports used by OpenKM.

I am not sure where to look for clues, but here is what I am seeing in the syslog at the time of backup...
Code: Select all
Apr 11 00:00:01 RDDocMan CRON[1891]: (root) CMD (/root/backup.sh | tee /root/logs/backup.$(date +%Y.%m.%d_%H.%M.%S).log >/dev/null 2>&1)
Apr 11 00:00:02 RDDocMan systemd[1]: Started Session 4 of user openkm.
When I observed that the site was down this morning I tried to stop the service and this was in the syslog for that...
Code: Select all
Apr 11 08:28:00 RDDocMan systemd[1]: Stopping LSB: Start and stop Apache Tomcat...
Apr 11 08:28:00 RDDocMan systemd[1]: Started Session c6 of user openkm.
Apr 11 08:28:00 RDDocMan tomcat[15115]: Stopping TomcatPID file found but no matching process was found. Stop aborted.
Apr 11 08:28:05 RDDocMan AptDaemon.Worker: INFO: Finished transaction /org/debian/apt/transaction/98792cb976224b8babab9f99305a8cec
Apr 11 08:28:05 RDDocMan org.debian.apt[828]: 08:28:05 AptDaemon.Worker [INFO]: Finished transaction /org/debian/apt/transaction/98792cb976224b8babab9$
Apr 11 08:32:46 RDDocMan tomcat[15115]: ..........................................................
Apr 11 08:32:46 RDDocMan systemd[1]: Stopped LSB: Start and stop Apache Tomcat.
I have learned to kill the PIDs holding the Java ports used by OpenKM, so at this point I searched for the PIDs using 'netstat -tupln' and killed the Java PID holding two of the ports, followed by a 'service tomcat start'.

The startup after that, in the syslog...
Code: Select all
Apr 11 08:33:05 RDDocMan systemd[1]: Starting LSB: Start and stop Apache Tomcat...
Apr 11 08:33:05 RDDocMan systemd[1]: Started Session c7 of user openkm.
Apr 11 08:33:05 RDDocMan tomcat[16129]: Starting Tomcat.
Apr 11 08:33:05 RDDocMan systemd[1]: Started LSB: Start and stop Apache Tomcat.
Apr 11 08:33:05 RDDocMan tomcat[16129]: Tomcat started.
For reference the backup log...
Code: Select all
### BEGIN: 04/11/2018 12:00:01 AM ###

Stopping TomcatPID file found but no matching process was found. Stop aborted.
......................................................................................................................................................$
* Backuping MySQL data from okmdb...
-------------------------------------

Number of files: 69,246 (reg: 24,851, dir: 44,395)
Number of created files: 4,010 (reg: 3,623, dir: 387)
Number of deleted files: 231 (reg: 168, dir: 63)
Number of regular files transferred: 3,632
Total file size: 8.61G bytes
Total transferred file size: 130.55M bytes
Literal data: 130.55M bytes
Matched data: 0 bytes
File list size: 196.59K
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 47.52M
Total bytes received: 145.08K

sent 47.52M bytes  received 145.08K bytes  274.73K bytes/sec
total size is 8.61G  speedup is 180.67
Clean Tomcat temporal files.
Starting Tomcat.

### END: 04/11/2018 08:35:59 AM ###
Calling the tomcat_backup.sh script
Existing PID file found during start.
/home/openkm/ABTopenkm/
/home/openkm/ABTopenkm/db/
/home/openkm/ABTopenkm/db/mysql_okmdb.sql
Tomcat appears to still be running with PID 16145. Start aborted.
/home/openkm/ABTopenkm/home/
/home/openkm/ABTopenkm/home/openkm/
/home/openkm/ABTopenkm/home/openkm/tomcat-7.0.61/
/home/openkm/ABTopenkm/home/openkm/tomcat-7.0.61/OPENKM-README.txt
/home/openkm/ABTopenkm/home/openkm/tomcat-7.0.61/webapps/
/home/openkm/ABTopenkm/home/openkm/tomcat-7.0.61/webapps/ROOT/
Here are my backup scripts...

backup.sh called by cron...
Code: Select all
   
#!/bin/bash
#
## BEGIN CONFIG ##
HOST=$(uname -n)
DATABASE_PASS="abt511"
OPENKM_DB="okmdb"
OPENKM_HOME="/home/openkm"
TOMCAT_HOME="$OPENKM_HOME/tomcat-7.0.61"
DATABASE_EXP="$OPENKM_HOME/ABTopenkm/db"
BACKUP_DIR="/home/openkm/ABTopenkm"
RSYNC_OPTS="-apzhR --stats --delete --exclude=*~ --delete-excluded"
## END CONFIG ##

# Check root user 
if [ "$(id -u)" != 0 ]; then echo "You should run this script as root"; exit 1; fi

# Delete older local database backup 
echo -e "### BEGIN: $(date +"%x %X") ###\n"
rm -rf "$DATABASE_EXP"
mkdir -p "$DATABASE_EXP"
 
# Mount disk
#if mount | grep "$BACKUP_DIR type" > /dev/null; then
#  echo "$BACKUP_DIR already mounted";
#else
#  mount "$BACKUP_DIR";
#  
#  if mount | grep "$BACKUP_DIR type" > /dev/null; then
#    echo "$BACKUP_DIR mounted";
#  else
#    echo "$BACKUP_DIR error mounting";
#    exit -1;
#  fi
#fi
#
# Stop Tomcat
/etc/init.d/tomcat stop
#
# Backup database
if [ -n "$DATABASE_PASS" ]; then
  echo "* Backuping MySQL data from $OPENKM_DB..."
  mysqldump -h localhost -u root -p$DATABASE_PASS $OPENKM_DB > $DATABASE_EXP/mysql_$OPENKM_DB.sql
  echo "-------------------------------------";
fi
#
# Create backup
rsync $RSYNC_OPTS $TOMCAT_HOME $BACKUP_DIR/
#
# Clean logs
echo "Clean Tomcat temporal files."
rm -rf $TOMCAT_HOME/logs/*
rm -rf $TOMCAT_HOME/temp/*
rm -rf $TOMCAT_HOME/work/Catalina/localhost
#
# Start Tomcat
/etc/init.d/tomcat start
echo -e "\n### END: $(date +"%x %X") ###"
#
# Umount disk
#sync
#umount "$BACKUP_DIR"
#
# run tomcat_backup.sh
echo "Calling the tomcat_backup.sh script"
sh /home/openkm/tomcat-7.0.61/tomcat_backup.sh
echo "tomcat_backup.sh complete"
#

tomcat_backup.sh...
Code: Select all
#!/bin/bash
#
#
# tar up tomcat application folder
#
cd /home/openkm/
#tar -czvf /home/openkm/DatabaseArchives/tomcat-$(date +%Y%m%d).tar.gz tomcat-7.0.61
tar -czvf /home/openkm/DatabaseArchives/ABTopenkmBackup-$(date +%Y%m%d).tar.gz /home/openkm/ABTopenkm
#
# mount USB drive
#
sudo mount /dev/sdb1 /mnt/backup
echo "backup mounted"
sudo cp /home/openkm/DatabaseArchives/ABTopenkmBackup-$(date +%Y%m%d).tar.gz /mnt/backup
echo "move sql dump and app folder tar"
find /mnt/backup -mtime +20 -mtime -100 -type f -exec rm -f {} \;
echo "purge old files on backup drive"
sudo sync
sudo umount /mnt/backup
echo "unmount backup"
#
# mount abt server
#
sudo mount -t cifs -o username=username,password=******** //192.168.255.3/"ABT Server"/"Project Management"/DatabaseBackups /mnt/DatabaseBackups
echo "mount ABT server"
sudo rm -rf /mnt/DatabaseBackups/*
echo "purge old file"
sudo cp /home/openkm/DatabaseArchives/ABTopenkmBackup-$(date +%Y%m%d).tar.gz /mnt/DatabaseBackups
echo "copy database tar"
sudo sync
sudo umount /mnt/DatabaseBackups
echo "unmount ABT server"
#
# delete backups older than 5 days
find /home/openkm/DatabaseArchives -mtime +5 -mtime -100 -type f -exec rm -f {} \;
echo "purge old files locally"
#
#
 #45650  by jllort
 
Is your start & stop script similar to this one https://docs.openkm.com/kcenter/view/ok ... asaservice ?

Did you take a look at the backup scripts section in our documentation:
https://docs.openkm.com/kcenter/view/ok ... -tool.html
https://docs.openkm.com/kcenter/view/ok ... ackup.html

It is quite strange to execute two scripts for the backup; consider doing everything in a single one.
 #45656  by dferguson
 
Here is my tomcat script...
Code: Select all
#!/bin/sh
 
### BEGIN INIT INFO
# Provides:          tomcat
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start and stop Apache Tomcat
# Description:       Enable Apache Tomcat service provided by daemon.
### END INIT INFO
 
ECHO=/bin/echo
TEST=/usr/bin/test
TOMCAT_USER=openkm
TOMCAT_HOME=/home/openkm/tomcat-7.0.61
TOMCAT_START_SCRIPT=$TOMCAT_HOME/bin/startup.sh
TOMCAT_STOP_SCRIPT=$TOMCAT_HOME/bin/shutdown.sh
 
$TEST -x $TOMCAT_START_SCRIPT || exit 0
$TEST -x $TOMCAT_STOP_SCRIPT || exit 0
 
start() {
    $ECHO -n "Starting Tomcat"
    su - $TOMCAT_USER -c "$TOMCAT_START_SCRIPT &"
    $ECHO "."
}
 
stop() {
    $ECHO -n "Stopping Tomcat"
    su - $TOMCAT_USER -c "$TOMCAT_STOP_SCRIPT 60 -force &"
    while [ "$(ps -fu $TOMCAT_USER | grep java | grep tomcat | wc -l)" -gt "0" ]; do
        sleep 5; $ECHO -n "."
    done
    $ECHO "."
}
 
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        sleep 30
        start
        ;;
    *)
        $ECHO "Usage: tomcat {start|stop|restart}"
        exit 1
esac
exit 0
I have plans to condense both backup scripts into one. However, I have not had a chance to validate that, so I am keeping the second in use for the short term. How could that affect the start/stop of Catalina? Am I missing something?
 #45675  by jllort
 
In the first script, you stop and start Tomcat. In the second script (I do not know when it is called; it seems you are copying Tomcat files). Well, I do not know why you have two scripts; my suggestion is to merge the second script into the first, before starting Tomcat again.
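The ordering suggested above can be sketched as a single script. This is only a minimal sketch, not the poster's script: the function names (stop_tomcat, dump_database, sync_files, archive_and_copy, start_tomcat) are hypothetical placeholders that just print what they would do, and would have to be filled with the real commands from backup.sh and tomcat_backup.sh.

```shell
#!/bin/bash
# Hypothetical placeholders: each stands in for the real commands from
# backup.sh / tomcat_backup.sh and only prints what it would do.
stop_tomcat()      { echo "stop tomcat"; }
dump_database()    { echo "dump database"; }
sync_files()       { echo "rsync tomcat home"; }
archive_and_copy() { echo "tar and copy archive"; }
start_tomcat()     { echo "start tomcat"; }

run_backup() {
    local status=0
    # Do not touch any files unless Tomcat really stopped.
    stop_tomcat || return 1
    # Run every backup step while Tomcat is down; remember the first failure.
    { dump_database && sync_files && archive_and_copy; } || status=$?
    # Start Tomcat again whether or not the backup steps succeeded.
    start_tomcat
    return "$status"
}
```

Calling run_backup at the end of the script would run the steps in that order. Keeping the start in one place means a failed copy step cannot leave the site down, at the cost of a longer downtime window while the archive is written.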

You can also try to stop and start Tomcat from the terminal to check whether it stops correctly or really has a problem. There is a timeout in the script: once 60 seconds have passed, it tries to force the shutdown. In some cases a zombie process (soffice, tesseract, or image conversion) might be the reason why it is not stopping. I suggest that at the end of your working day you try stopping the application yourself to verify that everything is going right.
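The "wait, then force" behaviour described above can be sketched as follows. Note that the while loop in the posted init script appears to have no upper bound of its own, which would be consistent with the backup log running from 12:00:01 AM to 08:35:59 AM. This is a minimal sketch under assumptions: the function names are hypothetical, it works on a single known PID rather than on a ps/grep match, and the zombie check relies on Linux's /proc.

```shell
#!/bin/bash
# Return 0 while the process is really running. A zombie still answers
# "kill -0" although it has already exited, so on Linux also check /proc.
is_alive() {
    kill -0 "$1" 2>/dev/null || return 1
    if [ -r "/proc/$1/status" ] &&
       grep -q '^State:[[:space:]]*Z' "/proc/$1/status" 2>/dev/null; then
        return 1
    fi
    return 0
}

# Poll once per second; give up after $2 seconds.
wait_for_exit() {
    local pid="$1" timeout="$2" waited=0
    while is_alive "$pid"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# SIGTERM first, then SIGKILL if the grace period (default 60 s) runs out.
stop_with_timeout() {
    local pid="$1" grace="${2:-60}"
    kill "$pid" 2>/dev/null || true
    if wait_for_exit "$pid" "$grace"; then
        echo "Process $pid stopped within ${grace}s."
        return 0
    fi
    echo "Process $pid still alive after ${grace}s; sending SIGKILL."
    kill -9 "$pid" 2>/dev/null || true
    wait_for_exit "$pid" 5
}
```

For example, `stop_with_timeout 1068 60` would give the process one minute before escalating. Signalling a process owned by another user requires running as root, which the backup script already checks for.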
 #46311  by dferguson
 
This has started to happen again after a long period of proper execution. What I have found is that the Catalina PID file says the PID is 1069, but the PID listed by netstat -tupln is 1068. I assume this is why the backup script says it found the PID file (1069) but can't find the process, because it is actually running as 1068. Why would this happen?
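The mismatch described above can be checked automatically before a stop is attempted. This is a minimal sketch, not part of OpenKM or Tomcat: pid_file_matches is a hypothetical helper, and the PID file path and the pgrep pattern in the commented example are assumptions that would need to match the actual installation.

```shell
#!/bin/bash
# Succeed if the PID recorded in file $1 is one of the PIDs listed in $2
# (whitespace-separated); fail otherwise, including when the file is missing.
pid_file_matches() {
    local recorded pid
    recorded=$(cat "$1" 2>/dev/null) || return 1
    for pid in $2; do
        [ "$pid" = "$recorded" ] && return 0
    done
    return 1
}

# Example call (commented out; the PID file path and pattern are assumptions):
# running=$(pgrep -u openkm -f org.apache.catalina.startup.Bootstrap)
# if ! pid_file_matches /home/openkm/tomcat-7.0.61/bin/catalina.pid "$running"; then
#     echo "PID file does not match the running Tomcat JVM(s): $running"
# fi
```

With the values from this post, a file containing 1069 compared against a running PID of 1068 would be reported as a mismatch. The sketch only detects the condition; it does not explain why the recorded PID differs.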
 #46333  by dferguson
 
If the PID has swapped, stopping Tomcat by hand takes forever. When it finally finishes, I manually delete the PID file, and a restart puts everything back to normal. I have found that if I don't delete the PID file before restarting, the behavior gets unreliable. However, after manually deleting the PID file and restarting Tomcat, it seems to operate normally until the next time the PID swaps.
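The manual cleanup described above could be turned into a guard that runs before Tomcat is started. This is a minimal sketch under assumptions: clear_stale_pid_file is a hypothetical helper, the PID file location has to be supplied by the caller, and the check only tells whether some process with the recorded PID exists, not whether that process is Tomcat.

```shell
#!/bin/bash
# Remove a PID file when no live process matches the PID it records.
# Returns 0 if the file was stale and removed (or did not exist),
# 1 if the recorded process is still alive.
clear_stale_pid_file() {
    local pid_file="$1" pid
    [ -f "$pid_file" ] || return 0
    pid=$(cat "$pid_file")
    # kill -0 sends no signal; it only checks that the process exists and
    # that we may signal it (run as root to check another user's process).
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "Process $pid is still running; leaving $pid_file in place."
        return 1
    fi
    echo "Removing stale PID file $pid_file (recorded PID: ${pid:-none})."
    rm -f "$pid_file"
}
```

A start routine could call it as `clear_stale_pid_file "$PID_FILE" && start_tomcat`, where PID_FILE and start_tomcat are placeholders, so that a start is only attempted once the old file is gone or confirmed to belong to a dead process.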
