VCS INFO V-16-20002-211 (t5220-52) Netlsnr failure
Hi all,
I had a netlsnr resource failure this afternoon and cannot identify the root cause:
2011/06/15 11:37:15 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/06/20 08:45:38 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/06/21 19:19:46 VCS ERROR V-16-20002-204 (t5220-52) Oracle:ora_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/06/21 19:19:49 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/06/25 06:00:03 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/07/10 19:55:02 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCP:monitor:Invalid owner oraWPCP for Oracle executables was specified
2011/07/10 23:05:47 VCS ERROR V-16-20002-204 (t5220-52) Oracle:ora_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/07/10 23:05:50 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/07/23 21:21:26 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/07/31 04:34:25 VCS ERROR V-16-20002-204 (t5220-52) Oracle:ora_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/08/02 13:35:12 VCS INFO V-16-20002-211 (t5220-52) Netlsnr:listner_WPCS:monitor:Monitor procedure /opt/VRTSvcs/bin/Netlsnr/LsnrTest.pl returned the output: su: Unknown id: oraWPCS
2011/08/02 13:35:12 VCS ERROR V-16-2-13067 (t5220-52) Agent is calling clean for resource(listner_WPCS) because the resource became OFFLINE unexpectedly, on its own.
2011/08/02 13:35:12 VCS NOTICE V-16-20002-42 (t5220-52) Netlsnr:listner_WPCS:clean:Listener(LISTENER) kill TERM 7297
2011/08/02 13:35:23 VCS INFO V-16-2-13068 (t5220-52) Resource(listner_WPCS) - clean completed successfully.
The user oraWPCS is a local user, so I don't understand why VCS couldn't identify it. There is a Symantec technote describing similar behaviour (http://www.symantec.com/docs/TECH146700), but that technote relates specifically to non-global zones, whereas this issue occurred within a global zone.
We have noticed that the Netlsnr resource has the 'ContainerName' attribute set to 'Global' and we wonder if this might be affecting it in some way:
# hares -display listner_WPCS
#Resource    Attribute        System    Value
listner_WPCS Group            global    adams_prep_sg
listner_WPCS Type             global    Netlsnr
listner_WPCS AutoStart        global    1
listner_WPCS Critical         global    0
listner_WPCS Enabled          global    1
listner_WPCS LastOnline       global    t5220-52
listner_WPCS MonitorOnly      global    0
listner_WPCS ResourceOwner    global    unknown
listner_WPCS TriggerEvent     global    0
listner_WPCS ArgListValues    t5220-52  oraWPCS /WPCS/u01/app/oracle/product/10.2/db_1 /WPCS/u01/app/oracle/product/10.2/db_1/network/admin LISTENER "" ./bin/Netlsnr/LsnrTest.pl "" 0 ""
listner_WPCS ConfidenceLevel  t5220-52  100
listner_WPCS Flags            t5220-52
listner_WPCS IState           t5220-52  not waiting
listner_WPCS Probed           t5220-52  1
listner_WPCS Start            t5220-52  1
listner_WPCS State            t5220-52  ONLINE
listner_WPCS AgentDebug       global    0
listner_WPCS ComputeStats     global    0
listner_WPCS ContainerName    global
listner_WPCS Encoding         global
listner_WPCS EnvFile          global
listner_WPCS Home             global    /WPCS/u01/app/oracle/product/10.2/db_1
listner_WPCS Listener         global    LISTENER
listner_WPCS LsnrPwd          global
listner_WPCS MonScript        global    ./bin/Netlsnr/LsnrTest.pl
listner_WPCS Owner            global    oraWPCS
listner_WPCS ResourceInfo     global    State Stale Msg TS
listner_WPCS TnsAdmin         global    /WPCS/u01/app/oracle/product/10.2/db_1/network/admin
listner_WPCS MonitorTimeStats t5220-52  Avg 0 TS
Any ideas are much appreciated.
Comments
Does oraWPCS exist in /etc/passwd on all cluster nodes with same user id?
Please also mention OS and VCS version.
Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows.
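A quick way to answer the question above is to compare what the name service returns for the user with what /etc/passwd records, and to run the same check on each node. The following is only a sketch, assuming bash or ksh; the function names lookup_uid and file_uid are made up for illustration and are not part of VCS.

```shell
# Print the numeric UID for a user as the name service resolves it
# (files, then ldap, per nsswitch.conf). Returns non-zero if the lookup fails.
lookup_uid() {
    entry=$(getent passwd "$1") || return 1
    printf '%s\n' "$entry" | cut -d: -f3
}

# Print the UID recorded for a user in a passwd-format file
# (defaults to /etc/passwd). Prints nothing if there is no entry.
file_uid() {
    grep "^$1:" "${2:-/etc/passwd}" | cut -d: -f3
}

# Example, run on each cluster node and compare the results:
#   lookup_uid oraWPCS
#   file_uid oraWPCS
```

If the two values differ on a node, or differ between nodes, that is worth fixing before looking any further.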
What version of VCS are you using? If it is 5.1, you have the wrong types file, because there should be no container attribute on the resource in 5.1. It is fine for 5.0.
Mike
UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows
If this post has helped you, please vote or mark as solution
Sorry guys, the OS is Solaris 10 and VCS is 5.0.
I'm a UNIX nut!
Forgot to mention that adams_prep_sg is only configured to come online on t5220-52, and there is an entry in /etc/passwd on t5220-52 for user oraWPCS. Does the second node (t5220-51) matter if the service group is not set to come online on that node?
Is there only a single node name in SystemList for this service group?
Resources need to be probed on all systems in SystemList, not only the one where it must be onlined.
Correct, there is only t5220-52 in the SystemList for this service group.
If t5220-52 is only system
UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows
su - oraWPCS works fine
I also ran getent passwd | grep oraWPCS, which returned the entry successfully. It has been suggested that this is a timing issue with LDAP, although /etc/nsswitch.conf lists files before ldap for passwd:
# cat /etc/nsswitch.conf
#
# Copyright 2006 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.
#
# ident "@(#)nsswitch.files 1.14 06/05/03 SMI"
#
# /etc/nsswitch.files:
#
# An example file that could be copied over to /etc/nsswitch.conf; it
# does not use any naming service.
#
# "hosts:" and "services:" in this file are used only if the
# /etc/netconfig file has a "-" for nametoaddr_libs of "inet" transports.
passwd: files ldap
group: files ldap
hosts: files dns
ipnodes: files
networks: files
protocols: files
rpc: files
ethers: files
netmasks: files
bootparams: files
publickey: files
# At present there isn't a 'files' backend for netgroup; the system will
# figure it out pretty quickly, and won't use netgroups at all.
netgroup: ldap
automount: files ldap
aliases: files
services: files
printers: user files
auth_attr: ldap files
prof_attr: ldap files
project: files ldap
tnrhtp: files
tnrhdb: files
#
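With "passwd: files ldap" and the default search criteria, a user found in /etc/passwd should be returned without LDAP being consulted, so a one-off getent test passing does not rule an intermittent failure in or out. One way to look for it is to repeat the lookup for a while and record only the failures. This is a sketch under the assumption that bash or ksh is available; check_user_once and poll_user are hypothetical helper names.

```shell
# Run one lookup. Print a timestamped line only when it fails.
# Returns 0 if the user resolved, 1 otherwise.
check_user_once() {
    if getent passwd "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$(date '+%Y/%m/%d %H:%M:%S') lookup FAILED for $1"
    return 1
}

# Poll COUNT times, INTERVAL seconds apart, then print a failure summary.
poll_user() {
    user=$1
    interval=$2
    count=$3
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        check_user_once "$user" || failures=$((failures + 1))
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    echo "failures=$failures of $count"
}

# Example: one lookup every 5 seconds for an hour.
#   poll_user oraWPCS 5 720
```

Any timestamps it prints can then be lined up against the V-16-20002-204 entries in the engine log.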
Looks like a timing issue, since the engine log contains many messages like this:
VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
When this message is logged, the resource goes into the UNKNOWN state. The next monitor cycle probably identifies the resource as ONLINE again. The logs suggest this has been happening intermittently, but the Perl test caught it only once, and that is what triggered the fault.
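To see how often this has been happening, and for which resources, the V-16-20002-204 entries can be counted per resource. A minimal sketch, assuming the engine log is in its default location (/var/VRTSvcs/log/engine_A.log) and that the log lines have the same layout as the ones posted above; count_invalid_owner is a hypothetical helper name, and on Solaris nawk may be needed in place of awk.

```shell
# Count "Invalid owner" errors (V-16-20002-204) per Type:resource.
# Field 7 of each log line looks like "Netlsnr:listner_WPCS:monitor:Invalid".
count_invalid_owner() {
    grep 'V-16-20002-204' "$1" |
        awk '{ split($7, a, ":"); print a[1] ":" a[2] }' |
        sort | uniq -c
}

# Example:
#   count_invalid_owner /var/VRTSvcs/log/engine_A.log
```

Each output line is a count followed by the agent type and resource name, which makes it easy to see whether the Oracle and Netlsnr resources fail together.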
There was a memory leak in the Netlsnr agent that you might be running into. Run 'pmap <pid>' against the Netlsnr agent process. If it's larger than 10MB, then you probably have a memory leak. Consider installing patch 146872-01 which corrects this issue.
-Seann
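The pmap check can be scripted roughly as follows. This is a sketch only: it assumes the last line of the pmap output is a summary of the form " total   12345K", which may vary between pmap versions, and the helper names total_kb_from_pmap and over_10mb are hypothetical. On Solaris, nawk may be needed in place of awk.

```shell
# Read pmap output on stdin and print the total size in KB.
# Assumes a summary line whose first word is "total" and whose
# second word is the size with a trailing K.
total_kb_from_pmap() {
    awk '$1 == "total" { sub(/[Kk]$/, "", $2); print $2 }'
}

# Return 0 if the given size in KB is larger than 10 MB (10240 KB).
over_10mb() {
    [ "$1" -gt 10240 ]
}

# Example, with <pid> replaced by the Netlsnr agent's process ID:
#   kb=$(pmap <pid> | total_kb_from_pmap)
#   over_10mb "$kb" && echo "agent is using ${kb} KB"
```

Running it a few times over several days shows whether the agent's footprint keeps growing, which is a stronger sign of a leak than a single reading.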
Is the oraWPCS user an LDAP account? If so, did you create a "shadowaccount" for the LDAP users? The shadowaccount is usually what allows non-root users to su to an LDAP account. The VCS agents usually run as root, but it would not hurt to check and test this.
Ref: http://compgroups.net/comp.sys.sun.admin/Solaris-LDAP-and-su-Unknown-id
Memory leak patch
Hi Seann,
I am having trouble locating the memory leak patch 146872-01 or any details about it.
Can you provide me a link or some directions?
Regards,