Monitoring Linux services

1. Monitoring Linux services

BBC
Posted Nov 24, 2011 11:30 AM

Hi everyone,

Does anyone have a hint on how to monitor Linux services in a way similar to what is possible for Windows services? I have been searching around but have not found anything on the topic that helps.

Thanks for any advice.

-BBC
2. RE: Monitoring Linux services
Best Answer

Migration User
Posted Nov 28, 2011 09:29 AM

We are using 7.0 and take a targeted approach to monitoring Linux services, as opposed to our current method of monitoring all Windows services that are set to 'Automatic'.

It's a two-part process: we gather a custom inventory of servers that are running daemons we want to monitor, and then we have a metric in place that watches only those services to ensure that they are running, and running with the same command-line arguments as when the inventory was taken (this will alert if someone has changed parameters).

The first part of the process is to gather the custom inventory. Here is the custom inventory gathering script we use:

. `aex-helper info path -s INVENTORY`/lib/helpers/custominv_inc.sh
#
# Sample script for custom inventory
# The first line of code should be always included at the begin of the script
# Actual script for collecting inventory data begins after the following label:
# SCRIPT_BEGINS_HERE
#!/usr/bin/python
import os
daemonlist = ['LLAWP','ntpd','named','ndsd','crond','xinetd','avagent','postfix','syslog']
try:
    os.makedirs('/opt/altiris/data')
except OSError:
    if os.path.exists('/opt/altiris/data'):
        pass
    else:
        raise
monitoringfile = open('/opt/altiris/data/ci-daemons.mon', 'w')
ps = os.popen("ps axwwl").read()
processes = ps.split('\n')
nfields = len(processes[0].split()) - 1
print "CI_DAEMONS_LINUX" #Put the name of the CI table here...duh.
print "Delimiters=\"+\" "
print "string64 string256" #put the real field values in here when we're ready to roll
print "Application Command_Line"
for row in processes[0:len(processes)-1]:
    proc = row.split(None, nfields)
    if proc[3] == "1": #Check to see if PPID is 1
        executable = proc[-1].split(None)[0].split("/")[-1] #split the command output of ps on spaces, then split the first item on /, the last item should be the executable
        for daemon in daemonlist:
            if executable.find(daemon) > -1:
                print "%s+%s" % (executable, proc[-1])
                monitoringfile.write(executable+"+"+proc[-1]+"\n")
monitoringfile.close()

This will load the custom inventory table with the process name found on the server, as well as the entire command line. In addition to loading the CI table, it also leaves behind a file '/opt/altiris/data/ci-daemons.mon' which we will use to monitor each server. You can see that the only daemons we are inventorying and monitoring are included in the list variable 'daemonlist'.
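To make the parsing step easier to follow, here is a minimal sketch of the same filtering logic as a standalone function that works on captured `ps axwwl` text instead of a live process table. The function name `inventory_lines` and the sample `ps` output are made up for illustration and are not part of our production script; it only shows how a row is split and how each 'executable+command line' entry is formed.

```python
def inventory_lines(ps_output, daemonlist):
    """Return 'executable+command line' entries for watched daemons with PPID 1.

    Hypothetical helper for illustration; it mirrors the row handling of the
    inventory script above but takes the ps text as an argument.
    """
    rows = ps_output.splitlines()
    if not rows:
        return []
    # The header row tells us how many columns precede the command column.
    nfields = len(rows[0].split()) - 1
    entries = []
    for row in rows[1:]:
        # Split at most nfields times so the full command line stays in one piece.
        proc = row.split(None, nfields)
        if len(proc) <= nfields or proc[3] != "1":  # column 4 of 'ps axwwl' is PPID
            continue
        command = proc[-1]
        # First word of the command, with any directory part removed.
        executable = command.split()[0].split("/")[-1]
        if any(daemon in executable for daemon in daemonlist):
            entries.append("%s+%s" % (executable, command))
    return entries


# Invented sample output, shaped like 'ps axwwl' on a typical Linux box.
SAMPLE = """\
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0     1     0  20   0  19232  1500 -      Ss   ?          0:01 /sbin/init
5    38  1234     1  20   0  30000  2000 -      Ss   ?          0:00 /usr/sbin/ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
0   500  2222  2000  20   0  10000   900 -      S+   pts/0      0:00 grep ntpd
"""

print(inventory_lines(SAMPLE, ["ntpd", "crond"]))
```

Only the `ntpd` row is kept: `init` has a PPID of 0 and the `grep` row is a child of a shell, so neither passes the PPID check.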

The metric we use to monitor the inventoried daemons is a command metric. It reads the custom inventory file that we left behind on the server and checks that each daemon is running with the same parameters as when the inventory was taken. It looks like this:

while read -r i;do q=`echo $i|awk -F'+' '{print $2}'`;process=`echo $i|awk -F'+' '{print $1}'`;check="/var/run/${process}_restart";if [ -f $check ];then echo "Restarting $process";else p=`ps -ewwwwww -o ppid -o cmd | grep "$q"|grep -v grep|awk ' { if($1=="1") print $0} '|uniq|wc -l`;if [ $process = "LLAWP" ];then q=`echo $q|sed s/\ -a//g`;p=`ps -ewwwwww -o ppid -o cmd |sed s/\ -a//g| grep "$q"|grep -v grep|awk ' { if($1=="1") print $0} '|uniq|wc -l`;fi;fullcmd=`echo $q|tr [:blank:] "_"`;if [ $p -ne 1 ];then echo "Error $fullcmd";else echo "Running $fullcmd";fi;fi;done < /opt/altiris/data/ci-daemons.mon

There are a number of unique features that we have included in this script for our environment. The first is that we check for the presence of a file '/var/run/${process}_restart'. We have modified our init scripts for a number of daemons to touch a file in /var/run when they perform a restart so that we do not falsely report that they are down during a scheduled restart time. We also included a provision for the LLAWP daemon (Siteminder) because it will sometimes restart itself and include a '-a' runtime parameter that was causing false alerts.
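Since the one-liner is hard to read, here is a rough Python sketch of the same decision logic. It is an approximation for explanation only, not what the metric actually runs: the function name `check_daemons` is invented, the process table is passed in as `(ppid, cmd)` tuples rather than read from `ps`, plain substring matching stands in for `grep`, and a set of distinct commands stands in for `uniq`.

```python
import os
import re


def check_daemons(mon_lines, ps_rows, restart_marker_exists=os.path.isfile):
    """Return one status string per inventoried daemon.

    mon_lines: 'executable+command line' strings from ci-daemons.mon.
    ps_rows:   (ppid, cmd) string tuples describing the running processes.
    Hypothetical helper; the real metric is the shell one-liner above.
    """
    statuses = []
    for line in mon_lines:
        # Like awk -F'+', only the first two '+'-separated fields are used.
        fields = line.rstrip("\n").split("+")
        if len(fields) < 2:
            continue  # assumption: skip blank or malformed lines
        process, wanted = fields[0], fields[1]
        # A marker file touched by the init script means a planned restart.
        if restart_marker_exists("/var/run/%s_restart" % process):
            statuses.append("Restarting %s" % process)
            continue
        # Only processes whose parent is init (PPID 1) count as daemons.
        cmds = [cmd for ppid, cmd in ps_rows if ppid == "1"]
        if process == "LLAWP":
            # LLAWP may restart itself with an extra ' -a'; ignore it on both sides.
            wanted = wanted.replace(" -a", "")
            cmds = [cmd.replace(" -a", "") for cmd in cmds]
        matches = set(cmd for cmd in cmds if wanted in cmd)
        fullcmd = re.sub(r"[ \t]", "_", wanted)
        # Exactly one matching process is healthy; zero or several is an error.
        if len(matches) == 1:
            statuses.append("Running %s" % fullcmd)
        else:
            statuses.append("Error %s" % fullcmd)
    return statuses


# Invented example data: ntpd is running under init, crond is not.
mon = ["ntpd+/usr/sbin/ntpd -u ntp:ntp -g", "crond+crond"]
ps_rows = [("1", "/usr/sbin/ntpd -u ntp:ntp -g"), ("2000", "crond")]
print(check_daemons(mon, ps_rows, restart_marker_exists=lambda path: False))
```

Passing `restart_marker_exists` as a parameter is only there so the logic can be tried without touching /var/run; by default it checks the real filesystem.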

This monitoring approach presupposes that daemons will be running during the time that we take our daily custom inventory, and also that those daemons should be running at all times.
3. RE: Monitoring Linux services

BBC
Posted Dec 02, 2011 12:03 PM

Hi Scott,

Thanks very much for the post; it was very helpful!

-BBC

Server Management Suite
