Mochabomb

Web Design and Technical notes


Script to monitor Linux server processes

May 6th, 2008 10:52 am · 2 Comments

Ever had a process die and not know it until you tried to use it? Last year dovecot kept dying, and the only way to find out was to run “service dovecot status” by hand. This script was born to handle that for many processes: it checks each listed service, emails an alert when one is down, and tries to restart it. It's a work in progress, so please share any ways to make it better. Cron it to run every 15 minutes.

Note: this was written for RHEL systems, so minor adjustments will be needed for other systems and daemons.
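For the 15-minute schedule, a crontab entry along these lines should do. This is only a sketch: the install path /usr/local/bin/servercheck is an assumption (the post does not say where the script lives), and the entry belongs in root's crontab because the script calls /sbin/service.

```
# min  hour  dom  mon  dow  command
*/15   *     *    *    *    /usr/local/bin/servercheck
```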

The script is available as a text file here, and the config file (/usr/local/servercheck.conf) looks like:

#
email: user@example.com
pager: 4155551212@messaging.sprintpcs.com
# service: commented_out
service: httpd
service: dovecot
service: mysqld
service: postfix
service: sshd
service: MailScanner
service: proftpd
service: syslog
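To show how a file in this format can be read, here is a minimal sketch of a lookup helper. The name conf_values is hypothetical and not part of the script; it anchors the match at the start of the line, so a commented-out entry such as "# service: commented_out" is skipped.

```shell
#!/bin/sh
# Sketch only: conf_values is a hypothetical helper, not part of servercheck.
# It prints every value stored under one key in a servercheck-style config file.
conf_values() {
    # $1 = key (email, pager or service), $2 = path to the config file.
    # The ^ anchor means lines starting with "#" never match.
    sed -n "s/^$1:[[:space:]]*//p" "$2"
}

# Try it against a throwaway copy of the sample config.
conf=$(mktemp)
cat > "$conf" <<'EOF'
#
email: user@example.com
pager: 4155551212@messaging.sprintpcs.com
# service: commented_out
service: httpd
service: dovecot
EOF

conf_values email "$conf"     # the alert address
conf_values service "$conf"   # one active service per line
rm -f "$conf"
```

With the sample above, the second call lists httpd and dovecot only; the commented-out line is ignored.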


Tags: Linux

2 responses so far ↓

  • 1 gregg // May 7, 2008 at 7:06 pm

    The Script itself

    #!/bin/ksh
    #
    # Servercheck - that's all it does -
    # Script to check the server health for various processes
    #
    # Written from scratch by Gregg Lain 10/13/2007 gregg@mochabomb.com
    #
    ###########################################################################################################
    #
    # Variables
    #
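    # Alert addresses come from the config file (pager is read but not used yet)
    email=`grep email /usr/local/servercheck.conf | sed 's/email: //'`
    pager=`grep pager /usr/local/servercheck.conf | sed 's/pager: //'`
    timestamp=`date '+%m-%d-%y-%H:%M:%S'`
    logdir="/var/log"
    # Per-run scratch file, named after the timestamp
    tmpfile="$logdir/$timestamp"
    touch "$tmpfile"
    Server=`hostname`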
    #
    ##########################################################################################################
    #
    # Initialize the incident flag - since we run with a cron, this is not in a loop…
    incidentflag=0
    #

    ##########################################################################################################
    #
    # 1. Check out a process function
    #
    for procstatus in `grep service /usr/local/servercheck.conf | egrep -v '^#' | sed 's/service: //'`; do
        status=`/sbin/service $procstatus status`
        echo "$status" | egrep 'running|OK' 1> /dev/null
        if [ $? -ne 0 ]; then # something is not running
            # incidentflag=$(($incidentflag+1))
            incidentflag=1
            echo "(servercheck) $Server: $procstatus not running: $timestamp" >> "$tmpfile"
            mail -s "$Server: $procstatus not running" "$email" < "$tmpfile"
            # Try a restart and keep its exit status
            /sbin/service $procstatus restart
            restart=$?
            servicePID=`ps -ef | grep $procstatus | egrep -v grep | head -1 | awk '{print $2}'`
            if [ $restart -ne 0 ]; then # something cannot be started
                echo "Alert! $Server $procstatus cannot be started" > "$tmpfile"
                echo "Restart $procstatus via shell or webmin" >> "$tmpfile"
                mail -s "** Alert ** $Server: $procstatus cannot be started" "$email" < "$tmpfile"
            else # something was restarted
                echo "$Server $procstatus re-started" > "$tmpfile"
                echo "$procstatus running with PID of $servicePID" >> "$tmpfile"
                mail -s "$Server $procstatus restarted successfully" "$email" < "$tmpfile"
            fi
        fi
    done
    #
    #
    ########################################################################################################
    #
    # 2. Garbage collection - append this run's messages to the log, then clean up
    #
    cat "$tmpfile" >> /var/log/servercheck
    rm "$tmpfile"
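One caveat with matching on 'running|OK': a status line such as "foo is not running" also contains the word "running", so a stopped service could be counted as up. A possible alternative, sketched below, is to trust the exit status of the status command instead. This assumes the init scripts follow the LSB convention of exiting 0 from "status" only when the daemon is up, which not every script honours. The names is_running and SERVICE_CMD are hypothetical and not part of the script above.

```shell
#!/bin/sh
# Sketch only: is_running and SERVICE_CMD are hypothetical names.
# SERVICE_CMD is an assumed override so the check can be tried without root.
SERVICE_CMD=${SERVICE_CMD:-/sbin/service}

is_running() {
    # $1 = service name. Succeeds only when "status" exits 0,
    # which LSB-style init scripts do when the daemon is up.
    "$SERVICE_CMD" "$1" status >/dev/null 2>&1
}

# Usage: report the state of a couple of services.
for svc in sshd httpd; do
    if is_running "$svc"; then
        echo "$svc: running"
    else
        echo "$svc: not running"
    fi
done
```

Swapping this in would replace the egrep test inside the loop; the alerting and restart logic would stay the same.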

  • 2 gregg // May 7, 2008 at 7:08 pm
