
VCS INFO V-16-20002-211 (t5220-52) Netlsnr failure

Created: 02 Aug 2011 | 12 comments
Posted by jonny_smith

Hi all,

I had a netlsnr resource failure this afternoon and cannot identify the root cause:

 

2011/06/15 11:37:15 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/06/20 08:45:38 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/06/21 19:19:46 VCS ERROR V-16-20002-204 (t5220-52) Oracle:ora_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/06/21 19:19:49 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/06/25 06:00:03 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/07/10 19:55:02 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCP:monitor:Invalid owner oraWPCP for Oracle executables was specified
2011/07/10 23:05:47 VCS ERROR V-16-20002-204 (t5220-52) Oracle:ora_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/07/10 23:05:50 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/07/23 21:21:26 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/07/31 04:34:25 VCS ERROR V-16-20002-204 (t5220-52) Oracle:ora_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/08/02 13:35:12 VCS INFO V-16-20002-211 (t5220-52) Netlsnr:listner_WPCS:monitor:Monitor procedure /opt/VRTSvcs/bin/Netlsnr/LsnrTest.pl returned the output: su: Unknown id: oraWPCS

2011/08/02 13:35:12 VCS ERROR V-16-2-13067 (t5220-52) Agent is calling clean for resource(listner_WPCS) because the resource became OFFLINE unexpectedly, on its own.

2011/08/02 13:35:12 VCS NOTICE V-16-20002-42 (t5220-52) Netlsnr:listner_WPCS:clean:Listener(LISTENER) kill TERM  7297
2011/08/02 13:35:23 VCS INFO V-16-2-13068 (t5220-52) Resource(listner_WPCS) - clean completed successfully.

The user oraWPCS is a local user, so I don't understand why VCS couldn't identify the user. There is a Symantec TechNote describing similar behaviour: http://www.symantec.com/docs/TECH146700. However, that TechNote relates specifically to non-global zones, whereas this issue occurred within a global zone.

We have noticed that the Netlsnr resource has the 'ContainerName' attribute set to 'Global' and we wonder if this might be affecting it in some way:

# hares -display listner_WPCS
#Resource    Attribute        System     Value
listner_WPCS Group            global     adams_prep_sg
listner_WPCS Type             global     Netlsnr
listner_WPCS AutoStart        global     1
listner_WPCS Critical         global     0
listner_WPCS Enabled          global     1
listner_WPCS LastOnline       global     t5220-52
listner_WPCS MonitorOnly      global     0
listner_WPCS ResourceOwner    global     unknown
listner_WPCS TriggerEvent     global     0
listner_WPCS ArgListValues    t5220-52   oraWPCS        /WPCS/u01/app/oracle/product/10.2/db_1  /WPCS/u01/app/oracle/product/10.2/db_1/network/admin    LISTENER        ""      ./bin/Netlsnr/LsnrTest.pl       ""      0       ""
listner_WPCS ConfidenceLevel  t5220-52   100
listner_WPCS Flags            t5220-52
listner_WPCS IState           t5220-52   not waiting
listner_WPCS Probed           t5220-52   1
listner_WPCS Start            t5220-52   1
listner_WPCS State            t5220-52   ONLINE
listner_WPCS AgentDebug       global     0
listner_WPCS ComputeStats     global     0
listner_WPCS ContainerName    global
listner_WPCS Encoding         global
listner_WPCS EnvFile          global
listner_WPCS Home             global     /WPCS/u01/app/oracle/product/10.2/db_1
listner_WPCS Listener         global     LISTENER
listner_WPCS LsnrPwd          global
listner_WPCS MonScript        global     ./bin/Netlsnr/LsnrTest.pl
listner_WPCS Owner            global     oraWPCS
listner_WPCS ResourceInfo     global     State  Stale   Msg             TS
listner_WPCS TnsAdmin         global     /WPCS/u01/app/oracle/product/10.2/db_1/network/admin
listner_WPCS MonitorTimeStats t5220-52   Avg    0       TS

 

Any ideas are much appreciated.

Comments

Marianne van den Berg, 02 Aug 2011

Does oraWPCS exist in /etc/passwd on all cluster nodes with same user id?

Please also mention OS and VCS version.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows.

mikebounds, 02 Aug 2011

What version of VCS are you using? If it is 5.1, then you have the wrong types file, as there should be no container attribute in 5.1. It is fine for 5.0, though.

Mike

UK Symantec Consultant in VCS, GCO, SF, VVR, VxAT on Solaris, AIX, HP-ux, Linux & Windows


jonny_smith, 02 Aug 2011

Sorry guys, the OS is Solaris 10 and VCS is 5.0.

I'm a UNIX nut!

jonny_smith, 02 Aug 2011

Forgot to mention that adams_prep_sg is only configured to come online on t5220-52, and there is an entry in /etc/passwd on t5220-52 for user oraWPCS. Does the second node (t5220-51) matter if the service group is not set to come online on that node?


Marianne van den Berg, 02 Aug 2011

Is there only a single node name in SystemList for this service group?

Resources need to be probed on all systems in SystemList, not only the one where it must be onlined.


jonny_smith, 03 Aug 2011

Correct, there is only t5220-52 in the SystemList for this service group.


mikebounds, 03 Aug 2011

If t5220-52 is the only system in the SystemList, then it is the only node that needs to have user oraWPCS.
It looks as though "su: Unknown id: oraWPCS" is the output of su, not VCS, so try running su - oraWPCS, copying the username from the error in the engine log in case there is a typo. You could also look in /opt/VRTSvcs/bin/Netlsnr/LsnrTest.pl, which is the script calling su (use "grep -i su", as the command is probably "$SU" in uppercase).
I think the zone issue is probably a red herring: ContainerName is not set to "global". Its value is blank, and "global" is its scope, i.e. it is set globally across systems as opposed to per system (the attribute IState, for example, is set per system, not globally).
 
Mike
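
The scope-versus-value point above can be checked mechanically. The following is a minimal sketch, not part of the original thread: it assumes the four-column layout shown in the posted hares -display output (Resource, Attribute, System, Value) and that the Value column may be empty. The function name parse_hares_display is hypothetical.

```python
def parse_hares_display(text):
    """Split `hares -display` rows into (attribute, scope, value) tuples."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#Resource ..." header line.
        if not line or line.startswith("#"):
            continue
        # Assumed columns: Resource, Attribute, System (scope), Value.
        # The Value column may be missing entirely when the value is blank.
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        value = parts[3] if len(parts) > 3 else ""
        rows.append((parts[1], parts[2], value))
    return rows


# Rows copied from the output posted earlier in the thread.
sample = """\
#Resource    Attribute        System     Value
listner_WPCS ContainerName    global
listner_WPCS Owner            global     oraWPCS
listner_WPCS IState           t5220-52   not waiting
"""

for attribute, scope, value in parse_hares_display(sample):
    print(f"{attribute}: scope={scope} value={value!r}")
```

Read this way, ContainerName has the scope "global" and an empty value, while IState is scoped to the system t5220-52.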


jonny_smith, 04 Aug 2011

su - oraWPCS works fine

I also ran getent passwd | grep oraWPCS, which returned the entry successfully. There is a suggestion that this is a timing issue with LDAP, although the entry in /etc/nsswitch.conf is:

 

# cat /etc/nsswitch.conf
#
# Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#
# ident "@(#)nsswitch.files     1.14    06/05/03 SMI"

#
# /etc/nsswitch.files:
#
# An example file that could be copied over to /etc/nsswitch.conf; it
# does not use any naming service.
#
# "hosts:" and "services:" in this file are used only if the
# /etc/netconfig file has a "-" for nametoaddr_libs of "inet" transports.

passwd: files ldap
group:  files ldap
hosts:      files dns
ipnodes:    files
networks:   files
protocols:  files
rpc:        files
ethers:     files
netmasks:   files
bootparams: files
publickey:  files
# At present there isn't a 'files' backend for netgroup;  the system will
#   figure it out pretty quickly, and won't use netgroups at all.
netgroup:      ldap
automount:      files ldap
aliases:    files
services:   files
printers:       user files

auth_attr:      ldap files
prof_attr:      ldap files
project:      files ldap

tnrhtp:     files
tnrhdb:     files
#
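
A single successful su or getent does not rule out an intermittent lookup failure. One way to test the timing theory is to repeat the lookup over a period of time and record any misses. The sketch below is an illustration only: it assumes Python is available on the node and that pwd.getpwnam goes through the same name service switch as su. The function probe_user and its parameters are hypothetical names.

```python
import pwd
import time


def probe_user(name, attempts=5, delay=1.0, lookup=pwd.getpwnam):
    """Look up `name` repeatedly; return the attempt numbers that failed."""
    failures = []
    for attempt in range(1, attempts + 1):
        try:
            lookup(name)
        except KeyError:
            # pwd.getpwnam raises KeyError when the user cannot be resolved.
            failures.append(attempt)
        if attempt < attempts:
            time.sleep(delay)
    return failures


if __name__ == "__main__":
    # Short run for demonstration; a real test would run far longer.
    failed = probe_user("oraWPCS", attempts=3, delay=0.5)
    print("failed attempts:", failed)
```

Left running at roughly the monitor interval, any non-empty result would point at the name service rather than at VCS.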


Satish K. Pagare, 11 Aug 2011

Looks like some timing issue, since there are many messages like this in the engine log:

VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified

When the above message is logged, the resource goes into the UNKNOWN state. Probably on the next monitor cycle the resource is identified as ONLINE again. It appears from the logs that this has been happening intermittently. However, the Perl test captured it only once, and that triggered the fault.
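
To see how often each resource hit the "Invalid owner" error, the engine log can be tallied by resource name. This is a rough sketch that assumes each entry sits on one line in the format quoted in this thread (date, time, VCS, severity, message code, host in parentheses, then "Type:resource:entry_point:text"). The names LINE_RE and tally are hypothetical.

```python
import re
from collections import Counter

# Assumed line layout, based on the engine log excerpts in this thread.
LINE_RE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"VCS (?P<level>\w+) (?P<code>V-\d+-\d+-\d+) \((?P<host>[^)]+)\) (?P<msg>.*)$"
)


def tally(log_text, code="V-16-20002-204"):
    """Count log entries with the given message code, keyed by resource."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group("code") == code:
            # Agent messages begin "Type:resource:entry_point:text".
            parts = m.group("msg").split(":", 3)
            if len(parts) == 4:
                counts[parts[1]] += 1
    return counts


# Entries taken from the log posted at the top of the thread.
sample = """\
2011/06/21 19:19:46 VCS ERROR V-16-20002-204 (t5220-52) Oracle:ora_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/06/21 19:19:49 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCS:monitor:Invalid owner oraWPCS for Oracle executables was specified
2011/07/10 19:55:02 VCS ERROR V-16-20002-204 (t5220-52) Netlsnr:listner_WPCP:monitor:Invalid owner oraWPCP for Oracle executables was specified
2011/08/02 13:35:23 VCS INFO V-16-2-13068 (t5220-52) Resource(listner_WPCS) - clean completed successfully.
"""

for resource, count in sorted(tally(sample).items()):
    print(resource, count)
```

Note that the log posted in this thread shows the same error for listner_WPCP and its owner oraWPCP, which suggests the lookup problem is not specific to one user.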

Seann Herdejurgen, 14 Aug 2011

There was a memory leak in the Netlsnr agent that you might be running into.  Run 'pmap <pid>' against the Netlsnr agent process.  If it's larger than 10MB, then you probably have a memory leak.  Consider installing patch 146872-01 which corrects this issue.

-Seann
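
The 10MB check suggested above can be scripted if pmap output is saved to a file or captured from a command. This sketch assumes the output ends with a summary line of the form "total <number>K"; check this against the actual pmap output on the system. The function name pmap_total_kb and the sample text are hypothetical, and the numbers in the sample are invented for illustration.

```python
import re


def pmap_total_kb(pmap_output):
    """Return the size in KB from the final 'total' line, or None."""
    for line in reversed(pmap_output.splitlines()):
        m = re.match(r"\s*total\s+(\d+)K\b", line)
        if m:
            return int(m.group(1))
    return None


# Illustrative text only, not real output from the Netlsnr agent.
sample = "12345:   example_process\n total    14336K\n"

total_kb = pmap_total_kb(sample)
threshold_kb = 10 * 1024  # the 10MB figure mentioned in the comment above
print(total_kb, total_kb is not None and total_kb > threshold_kb)
```

A size above the threshold is only a hint of a leak; comparing the total over several days would give a clearer picture.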

AlanTLR, 19 Sep 2011

Is the oraWPCS user an LDAP account? If so, did you create a "shadowaccount" for the LDAP users? The shadowaccount is usually needed so that non-root users can su to an LDAP account. VCS agents usually run as root, but it may not hurt to check/test.

Ref: http://compgroups.net/comp.sys.sun.admin/Solaris-LDAP-and-su-Unknown-id

jonny_smith, 30 Sep 2011

Memory leak patch

Hi Seann,

I am having trouble locating the memory leak patch 146872-01 or any details about it.

Can you provide me with a link or some directions?

Regards,

I'm a UNIX nut!