
Application agent falsely detects NetWorker process as offline even when the process is running properly

Created: 15 January 2013 • Updated: 15 January 2013 | 3 comments
omiot

Hi,

I have a problem with NetWorker in a VCS cluster. VCS kills the process and restarts it on the second node. I've turned on debugging for the Application agent, and in its log I can see that the monitor returns state: Offline.

2013/01/15 14:33:18 VCS DBG_2 V-16-50-0 Application:nw_server:monitor:Command prepared for getting pid is </bin/ps --cols=100000 --User=root -o pid,args | /bin/egrep '/usr/sbin/nsrd -k clusterFQDN\.domain\.com' | /bin/egrep -v /bin/grep | /usr/bin/tr -s " " " " | /bin/sed -e 's/^ //' | /bin/cut -f1 -d" ">.
        Application.C:processExists[583]
2013/01/15 14:33:19 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:Process:/usr/sbin/nsrd -k clusterFQDN.domain.com; return state: Offline.
        Application.C:application_monitor[300]
2013/01/15 14:38:09 VCS DBG_1 V-16-50-0 Application:nw_server:monitor:UseSUDash:<0>.
        Application.C:application_monitor[163]
2013/01/15 14:38:09 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:User Shell is other than csh, returning 0
        Application.C:getuserinfo[1198]
2013/01/15 14:38:19 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:MonitorProgram returned state:110.
        Application.C:monitorState[920]
2013/01/15 14:38:19 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:return state:STATE_TRUE
        Application.C:monitorState[974]
2013/01/15 14:38:19 VCS DBG_1 V-16-50-0 Application:nw_server:monitor:Total number of Pid Files specified:0.
        Application.C:application_monitor[231]
2013/01/15 14:38:19 VCS DBG_1 V-16-50-0 Application:nw_server:monitor:Total number of Processes specified:<1>.
        Application.C:application_monitor[272]
2013/01/15 14:38:19 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:Process:</usr/sbin/nsrd -k clusterFQDN.domain.com>; User:<root>.
        Application.C:processExists[479]
2013/01/15 14:38:19 VCS DBG_2 V-16-50-0 Application:nw_server:monitor:Command prepared for getting pid is </bin/ps --cols=100000 --User=root -o pid,args | /bin/egrep '/usr/sbin/nsrd -k clusterFQDN\.domain\.com' | /bin/egrep -v /bin/grep | /usr/bin/tr -s " " " " | /bin/sed -e 's/^ //' | /bin/cut -f1 -d" ">.
        Application.C:processExists[583]
2013/01/15 14:38:20 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:Process:/usr/sbin/nsrd -k clusterFQDN.domain.com; return state: Offline.
 
I'm using Storage Foundation for HA ver 5.1 SP1 RP3 on RHEL 5.5.
 
Regards
Pawel

Comments

Marianne

Please post main.cf section for this service group.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

omiot

Hi,

Thanks for your reply. I've attached the relevant part of my main.cf.

Regards.

Pawel

Attachment  Size
main.zip  1.18 KB
Marianne

Please double-check your documentation for the MonitorProcesses attribute:

MonitorProcesses = { "/usr/sbin/nsrd -k clusterFQDN" }

Should clusterFQDN possibly be the virtual hostname?

What does 'ps -ef |grep nsrd' show?
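
To compare the configured string with what is actually running, the filter chain from the debug log can be tried by hand. The following is only a sketch: the function name match_pids, the PID 4211, and both sample ps lines are made up for illustration, and grep -E stands in for the /bin/egrep calls the agent logged. A mismatched host name is one possible cause, not something confirmed in this thread.

```shell
#!/bin/sh
# Apply the same filter chain the agent logged to ps-style "PID ARGS"
# lines read from stdin. $1 is the extended regex that the agent built
# from the MonitorProcesses entry.
match_pids() {
    grep -E -- "$1" \
        | grep -E -v /bin/grep \
        | tr -s ' ' ' ' \
        | sed -e 's/^ //' \
        | cut -f1 -d' '
}

# Pattern exactly as it appears in the debug log.
pattern='/usr/sbin/nsrd -k clusterFQDN\.domain\.com'

# Hypothetical ps lines (illustrative only, not taken from the thread).
sample_full=' 4211 /usr/sbin/nsrd -k clusterFQDN.domain.com'
sample_short=' 4211 /usr/sbin/nsrd -k clusterFQDN'

# Arguments match the pattern: a PID comes back, so the monitor would
# find the process.
printf '%s\n' "$sample_full" | match_pids "$pattern"    # prints 4211

# Daemon started with a different host name: nothing comes back, which
# the monitor would report as Offline.
printf '%s\n' "$sample_short" | match_pids "$pattern"   # prints nothing
```

On the node where nsrd is running, the live check would be to pipe the ps command from the log into the same function: /bin/ps --cols=100000 --User=root -o pid,args | match_pids "$pattern". If that prints nothing while 'ps -ef | grep nsrd' shows the daemon, the MonitorProcesses string does not match the real command line.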
