Application agent falsely detects NetWorker process as offline even when the process is running properly
Created: 15 Jan 2013 | Updated: 15 Jan 2013 | 3 comments
Hi,
I have a problem with NetWorker in a VCS cluster: VCS kills the process and restarts it on the second node. I've turned on debugging for the Application agent, and in the log I can see that the monitor process returns state: Offline.
2013/01/15 14:33:18 VCS DBG_2 V-16-50-0 Application:nw_server:monitor:Command prepared for getting pid is </bin/ps --cols=100000 --User=root -o pid,args | /bin/egrep '/usr/sbin/nsrd -k clusterFQDN\.domain\.com' | /bin/egrep -v /bin/grep | /usr/bin/tr -s " " " " | /bin/sed -e 's/^ //' | /bin/cut -f1 -d" ">.
Application.C:processExists[583]
2013/01/15 14:33:19 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:Process:/usr/sbin/nsrd -k clusterFQDN.domain.com; return state: Offline.
Application.C:application_monitor[300]
2013/01/15 14:38:09 VCS DBG_1 V-16-50-0 Application:nw_server:monitor:UseSUDash:<0>.
Application.C:application_monitor[163]
2013/01/15 14:38:09 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:User Shell is other than csh, returning 0
Application.C:getuserinfo[1198]
2013/01/15 14:38:19 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:MonitorProgram returned state:110.
Application.C:monitorState[920]
2013/01/15 14:38:19 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:return state:STATE_TRUE
Application.C:monitorState[974]
2013/01/15 14:38:19 VCS DBG_1 V-16-50-0 Application:nw_server:monitor:Total number of Pid Files specified:0.
Application.C:application_monitor[231]
2013/01/15 14:38:19 VCS DBG_1 V-16-50-0 Application:nw_server:monitor:Total number of Processes specified:<1>.
Application.C:application_monitor[272]
2013/01/15 14:38:19 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:Process:</usr/sbin/nsrd -k clusterFQDN.domain.com>; User:<root>.
Application.C:processExists[479]
2013/01/15 14:38:19 VCS DBG_2 V-16-50-0 Application:nw_server:monitor:Command prepared for getting pid is </bin/ps --cols=100000 --User=root -o pid,args | /bin/egrep '/usr/sbin/nsrd -k clusterFQDN\.domain\.com' | /bin/egrep -v /bin/grep | /usr/bin/tr -s " " " " | /bin/sed -e 's/^ //' | /bin/cut -f1 -d" ">.
Application.C:processExists[583]
2013/01/15 14:38:20 VCS DBG_4 V-16-50-0 Application:nw_server:monitor:Process:/usr/sbin/nsrd -k clusterFQDN.domain.com; return state: Offline.
I'm using Storage Foundation for HA ver 5.1 SP1 RP3 on RHEL 5.5.
Regards
Pawel
3 Comments
Please post main.cf section for this service group.
Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links
Hi,
Thanks for your reply. I've attached the relevant part of my main.cf.
Regards.
Pawel
Please double-check your documentation for the MonitorProcesses attribute:
MonitorProcesses = { "/usr/sbin/nsrd -k clusterFQDN" }
Should clusterFQDN possibly be the virtual hostname?
What does 'ps -ef |grep nsrd' show?
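It may also help to rerun the agent's own pipeline by hand and compare the result with the plain ps output. The sketch below is only an approximation of what the debug log shows: the pattern is copied from the "Command prepared for getting pid" line, find_pids is a hypothetical helper name, and grep -E stands in for the /bin/egrep calls in the log. If the argument string of the running nsrd differs in any way from the MonitorProcesses entry, the pipeline prints nothing, which would be consistent with the agent reporting Offline.

```shell
#!/bin/sh
# Pattern copied from the agent's debug log (dots escaped exactly as logged).
PATTERN='/usr/sbin/nsrd -k clusterFQDN\.domain\.com'

# find_pids (hypothetical helper): reads "pid args" lines on stdin and prints
# the PID of every line whose argument string matches the pattern. The stages
# mirror the pipeline shown in the log; grep -E replaces /bin/egrep.
find_pids() {
    grep -E "$1" \
        | grep -E -v '/bin/grep' \
        | tr -s " " " " \
        | sed -e 's/^ //' \
        | cut -f1 -d" "
}

# On the cluster node (Linux ps, options as they appear in the log):
#   ps --cols=100000 --User=root -o pid,args | find_pids "$PATTERN"
```

An empty result while 'ps -ef | grep nsrd' still lists the daemon would suggest that the process was started with different arguments than the ones configured in MonitorProcesses.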