
Weblogic agent Monitor Issue VCS 6.0.1 Solaris 11

Created: 15 Apr 2013 • Updated: 16 Apr 2013 | 6 comments
SVGA's picture
This issue has been solved. See solution.

After configuring the agent, it starts the Node Manager successfully, but I keep receiving this warning message every minute:

 

V-16-55008-20146
(hot422v2) WebLogic:RESOURCE_NAME:monitor:<WebLogic::CheckWLSProcessID>:Pidfile [/tmp/.VRTSWebLogic/<RESOURCE_NAME>.pid] is stale.

then

V-16-55008-20148
(hot422v2) WebLogic:RESOURCE_NAME:monitor:<WebLogic::CheckWLSProcessID>: Updated PidFile [/tmp/.VRTSWebLogic/RESOURCE_NAME.pid] with [25081].   

 

This keeps repeating. I cannot find the warning V-16-55008-20146 documented anywhere. Does anyone have an idea?

 

 

 


Satish K. Pagare's picture

Does the repeating message show the same PID being written to the PidFile every time?

Satish K. Pagare's picture

Could you please enable TRACE debug mode for the WebLogic agent? You can do that with the command:

# hares -modify ResourceName ResLogLevel TRACE

Probe the resource a couple of times on the node where it is running and the PidFile warnings are appearing.

Then reset the debug level:

# hares -modify ResourceName ResLogLevel INFO

Now you can go through the /var/VRTSvcs/log/engine_A.log file to debug further. If that doesn't help, please post the relevant snippet of the monitor entry point with the debug logs.
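
To narrow the log down to the entries for one message ID, a small helper along these lines may save some scrolling. This is only a sketch: the function name find_vcs_msg is made up for illustration, and it assumes a POSIX shell and grep.

```shell
#!/bin/sh
# Hypothetical helper (not part of VCS): print the log lines that carry
# a given VCS message ID.
#   $1 = message ID, e.g. V-16-55008-20146
#   $2 = path to the log file to search
find_vcs_msg() {
    msg_id=$1
    log_file=$2
    # Return 2 if the log cannot be read, so callers can tell
    # "unreadable file" apart from "no matching lines" (grep returns 1).
    [ -r "$log_file" ] || { echo "cannot read $log_file" >&2; return 2; }
    # -F treats the ID as a fixed string, so the dashes are not a pattern.
    grep -F -- "$msg_id" "$log_file"
}
```

For example, find_vcs_msg V-16-55008-20146 /var/VRTSvcs/log/engine_A.log would list every stale-PidFile warning in the engine log.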

SVGA's picture

Thanks for the reply. I changed the level to TRACE, although the engine log stayed the same:

2013/04/16 07:53:10 VCS WARNING V-16-55008-20146 (hot441v3) WebLogic:XXXX_WL_NM:monitor:<WebLogic::CheckWLSProcessID>:Pidfile [/tmp/.VRTSWebLogic/maxpap1_WL_NM.pid] is stale.
2013/04/16 07:53:10 VCS NOTICE V-16-55008-20148 (hot441v3) WebLogic:XXXXX_WL_NM:monitor:<WebLogic::CheckWLSProcessID>: Updated PidFile [/tmp/.VRTSWebLogic/XXXX_WL_NM.pid] with [6150].

The WebLogic_A.log, however, had a very good trace of the problem.

What I found in the trace is that on Solaris 11 the command /usr/ucb/ps -ww pid, which the agent uses to monitor the Node Manager resource, fails with an "illegal option" error. I don't know for certain whether this is the cause of the issue, but most probably it is. I don't have a workaround for it. Do you have an idea?

The trace log is below:

2013/04/16 07:34:09 VCS INFO V-16-55000-10346 Resource(XXX_WL_NM) - (hot441v3:monitor) LogInt:SetDebugLevel:Information, Set Logging Level To [TRACE]
2013/04/16 07:34:09 VCS INFO V-16-55000-10267 Resource(XXX_WL_NM) - (hot441v3:monitor) VCSagentFW:SetupLogging:[monitor] Entered by resource instance [XXX_WL_NM]
2013/04/16 07:34:09 VCS INFO V-16-55008-20282 Resource(XXX_WL_NM) - (hot441v3:monitor) Resource found with State [2] and IState [0]
2013/04/16 07:34:09 VCS INFO V-16-55008-20308 Resource(XXX_WL_NM) - (hot441v3:monitor) VCSAG_GET_MONITOR_LEVEL ret=[0] level_one=[1] level_two=[0]
2013/04/16 07:34:09 VCS INFO V-16-55008-20001 Resource(XXX_WL_NM) - (hot441v3:monitor) Subroutine <WebLogic::ArgsValid> called with:
  EntryPointName           [monitor]
  AdminURL                 []
  BEA_HOME                 [/XXXXXX/app/oracle/product/Middleware]
  WlstScript               [/XXXXX/app/oracle/product/Middleware/oracle_common/common/bin/wlst.sh]
  DomainName               [<undef>]
  DomainDir                []
  ListenAddressPort        []
  MonitorProgram           []
  nmListenAddressPort      [XXX.YYY.com.kw:5556]
  nmType                   [ssl]
  nmHome                   [/XXX/app/oracle/product/Middleware/wlserver_10.3/common/nodemanager]
  ServerName               []
  ServerRole               [NodeManager]
  User                     [XXXXX]
  WLSUser                  [XXXXXXXXX]
  WLSPassword              [XXXXXXXXXXXXXXX]
  nmUser                   [XXXXXXXXX]
  nmPassword                [XXXXXXXXXXXXXXXX]
  ServerStartProgram       []
  ServerStopProgram       []
  ShutdownTimeout          [5000]
  RequireAdminServer       [0]
  AdminServerMaxWait       [60]
  SecondLevelMonitor       [0]
2013/04/16 07:34:09 VCS INFO V-16-55008-20002 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::ArgsValid>:time is [1366086849]
2013/04/16 07:34:09 VCS INFO V-16-55008-20005 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::ArgsValid>:This is NodeManager based configuration.
2013/04/16 07:34:09 VCS INFO V-16-55008-20283 Resource(XXX_WL_NM) - (hot441v3:monitor) All arguments validated successfully!
2013/04/16 07:34:09 VCS INFO V-16-55008-20286 Resource(XXX_WL_NM) - (hot441v3:monitor) Subroutine <main::MonitorEntryPoint> called
2013/04/16 07:34:09 VCS INFO V-16-55008-20186 Resource(XXX_WL_NM) - (hot441v3:monitor) Subroutine <WebLogic::GetEPTimeout> called with:
  EntryPointName [monitor]
  ResourceType   [WebLogic]
2013/04/16 07:34:09 VCS INFO V-16-55008-20187 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::GetEPTimeout> Using AEPTimeout [60] as EP timeout value
2013/04/16 07:34:09 VCS INFO V-16-55008-20094 Resource(XXX_WL_NM) - (hot441v3:monitor) Subroutine <WebLogic::GetResourceState> called with arguments:
  State                    [2]
  iState                   [0]
  EntryPointName           [monitor]
  AdminURL           []
  BEA_HOME                 [/XXXX/app/oracle/product/Middleware]
  WlstScript               [/XXXXX/app/oracle/product/Middleware/oracle_common/common/bin/wlst.sh]  
  ListenAddressPort        []
  nmListenAddressPort      [XXX.YYY.com.kw:5556]
  ServerName               []
  ServerRole               [NodeManager]
  DomainName               [<undef>]
  DomainDir                []
  nmType                   [ssl]
  WLSUser                  [XXX]
  WLSPassword              [XXXXXXXXXXXXXXXXX]
  nmUser                   [XXXXXXX]
  nmPassword               [XXXXXXXXXXXXXXXXX]
  User                     [XXXXXXXXXXX]
  SecondLevelMonitor       [0]
  MonitorProgramAndArgs    []
  ExpirationTime           [1366086906]
  level_one                [1]
  level_two                [0]
2013/04/16 07:34:09 VCS INFO V-16-55008-20112 Resource(XXX_WL_NM) - (hot441v3:monitor) Subroutine <WebLogic::GetWLSState> called with:
  BEA_HOME                 [/XXX/app/oracle/product/Middleware]
  WlstScript               [/XXX/app/oracle/product/Middleware/oracle_common/common/bin/wlst.sh]
  EntryPointName           [monitor]
  AdminURL                 []
  User                     [XXXXXX]
  WLSUser                  [XXXXXXXXXXX]
  WLSPassword              [XXXXXXXXXXXXXXXXX]
  ServerName               []
  ServerRole               [NodeManager]
  DomainName               [<undef>]
  DomainDir                []
  ListenAddressPort        []
  nmListenAddressPort      [XXXXXX.XXX.com.kw:5556]
  ExpirationTime           [1366086906]
  MonitorProgram           []
  State                    [2]
  IState                   [0]
2013/04/16 07:34:10 VCS INFO V-16-55008-20159 Resource(XXX_WL_NM) - (hot441v3:monitor) Subroutine <WebLogic::ExtractIP> called with: HostPort [XXX.XXX.com.kw:5556]
2013/04/16 07:34:10 VCS INFO V-16-55008-20172 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::ExtractIP>:Opened IP file [/tmp/.VRTSWebLogic/XXX_WL_NM.ip] successfully for read/write
2013/04/16 07:34:10 VCS INFO V-16-55008-20176 Resource(XXX_WL_NM) - (hot441v3:monitor) Subroutine <WebLogic::CheckIfIPExistsLocally> called with: IP [xx.xx.xx.xxx]
2013/04/16 07:34:10 VCS INFO V-16-55000-10138 Resource(XXX_WL_NM) - (hot441v3:monitor) Arch:GetOSType:Subroutine Arch::GetOSType is called
2013/04/16 07:34:10 VCS INFO V-16-55008-20177 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::CheckIfIPExistsLocally>:IP [xx.xx.xx.xxx] is plumbed.
2013/04/16 07:34:10 VCS INFO V-16-55000-10001 Resource(XXX_WL_NM) - (hot441v3:monitor) Comms:CheckPort:Subroutine <Comms::CheckPort> called with:
  Host        [xx.xx.xx.xxx]
  Port        [5556]
2013/04/16 07:34:10 VCS INFO V-16-55000-10003 Resource(XXX_WL_NM) - (hot441v3:monitor) Comms:CheckPort:Checking Host:[xx.xx.xx.xxx] on Port:[5556], TimeoutSeconds:[<undef>]
2013/04/16 07:34:10 VCS INFO V-16-55000-10004 Resource(XXX_WL_NM) - (hot441v3:monitor) Comms:CheckPort:Checking Host:[xx.xx.xx.xxx] on Port:[5556], TimeoutSeconds:[<undef>] (Connected OK)
2013/04/16 07:34:10 VCS INFO V-16-55008-20123 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::GetWLSState>:Opening PidFile [/tmp/.VRTSWebLogic/XXX_WL_NM.pid]
2013/04/16 07:34:10 VCS INFO V-16-55008-20124 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::GetWLSState>:Got pid [6150] from PidFile [/tmp/.VRTSWebLogic/XXX_WL_NM.pid]
2013/04/16 07:34:10 VCS INFO V-16-55008-20143 Resource(XXX_WL_NM) - (hot441v3:monitor) Subroutine <WebLogic::CheckWLSProcessID> called with
  BEA_HOME                 [/XXX/app/oracle/product/Middleware]
  WlstScript                  [/XXX/app/oracle/product/Middleware/oracle_common/common/bin/wlst.sh]
  EntryPointName           [monitor]
  AdminURL                 []
  DomainName               [<undef>]
  DomainDir                []
  ServerName               []
  ServerRole               [NodeManager]
  Host                     [<undef>]
  Port                     [<undef>]
  nmHost                   [XXX.XXX.com.kw]
  nmPort                   [5556]
  State                    [2]
  IState                   [0]
  PID                      [6150]
2013/04/16 07:34:10 VCS INFO V-16-55000-10138 Resource(XXX_WL_NM) - (hot441v3:monitor) Arch:GetOSType:Subroutine Arch::GetOSType is called
2013/04/16 07:34:10 VCS INFO V-16-55008-20144 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::CheckWLSProcessID>:Filter set to [java.*ListenAddress=XXX.XXX.com.kw\b.*ListenPort=5556\b]
2013/04/16 07:34:10 VCS INFO V-16-55000-10189 Resource(XXX_WL_NM) - (hot441v3:monitor) Sys:RunWithEnvCmdWithOutputWithTimeOut:Subroutine <Sys::RunWithEnvCmdWithOutputWithTimeOut> called with:
  EnvFile       []    
  Command       [/usr/ucb/ps]
  Arguments     [-ww 6150]
  User          [root]
  Timeout       [5]
  OutFile       [<undef>]
  FromDir       [<undef>]
2013/04/16 07:34:10 VCS INFO V-16-55000-10191 Resource(XXX_WL_NM) - (hot441v3:monitor) Sys:RunWithEnvCmdWithOutputWithTimeOut:Environment file not set
2013/04/16 07:34:10 VCS INFO V-16-55000-10199 Resource(XXX_WL_NM) - (hot441v3:monitor) Sys:RunWithEnvCmdWithOutputWithTimeOut:Going to run command line [/usr/ucb/ps -ww 6150], as User [root]
2013/04/16 07:34:10 VCS INFO V-16-55000-10209 Resource(XXX_WL_NM) - (hot441v3:monitor) Sys:RunWithEnvCmdWithOutputWithTimeOut:Command line [/usr/ucb/ps -ww 6150] provided a non-zero exit code -- this does not necessarily indicate a problem... (Perl's OS error variable prior to the command-pipe close was [], and after the close was [] )
2013/04/16 07:34:10 VCS INFO V-16-55000-10289 Resource(XXX_WL_NM) - (hot441v3:monitor) VCSagentFW:messageEngineLog:[/usr/ucb/ps: illegal option -- w
/usr/ucb/ps: illegal option -- w
usage: ps [ -aAdefHlcjLPyZ ] [ -o format ] [ -t termlist ]
    [ -u userlist ] [ -U userlist ] [ -G grouplist ]
    [ -p proclist ] [ -g pgrplist ] [ -s sidlist ] [ -z zonelist ] [-h lgrplist]
  'format' is one or more of:
    user ruser group rgroup uid ruid gid rgid pid ppid pgid sid taskid ctid
    pri opri pcpu pmem vsz rss osz nice class time etime stime zone zoneid
    f s c lwp nlwp psr tty addr wchan fname comm args projid project pset lgrp
]
2013/04/16 07:34:10 VCS INFO V-16-55008-20145 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::CheckWLSProcessID>:command [/usr/ucb/ps] with args [-ww 6150] failed with exit code [1]
2013/04/16 07:34:10 VCS INFO V-16-55008-20133 Resource(XXX_WL_NM) - (hot441v3:monitor) Subroutine <WebLogic::IsWLSProcessRunning> called with
  BEA_HOME                 [/XXX/app/oracle/product/Middleware]
  WlstScript                  [/XXX/app/oracle/product/Middleware/oracle_common/common/bin/wlst.sh]
  EntryPointName           [monitor]
  AdminURL                 []
  DomainName               [<undef>]
  DomainDir                []
  ServerName               []
  ServerRole               [NodeManager]
  Host                     [<undef>]
  Port                     [<undef>]
  nmHost                   [XXX.XXX.com.kw]
  nmPort                   [5556]
  State                    [2]
  IState                   [0]
2013/04/16 07:34:10 VCS INFO V-16-55000-10040 Resource(XXX_WL_NM) - (hot441v3:monitor) Proc:GetLongProcessListHash:Subroutine <Proc::GetLongProcessListHash> called with:
   rlhrProcs        [ARRAY(0x711df8)]
   EnvFlag          [0]
2013/04/16 07:34:10 VCS INFO V-16-55000-10138 Resource(XXX_WL_NM) - (hot441v3:monitor) Arch:GetOSType:Subroutine Arch::GetOSType is called
2013/04/16 07:34:10 VCS INFO V-16-55000-10042 Resource(XXX_WL_NM) - (hot441v3:monitor) Proc:GetLongProcessListHash:Calling ps command [/usr/ucb/ps] with options [axwwl]
2013/04/16 07:34:10 VCS INFO V-16-55000-10046 Resource(XXX_WL_NM) - (hot441v3:monitor) Proc:GetLongProcessListHash:Got [142] processes
2013/04/16 07:34:10 VCS INFO V-16-55000-10138 Resource(XXX_WL_NM) - (hot441v3:monitor) Arch:GetOSType:Subroutine Arch::GetOSType is called
2013/04/16 07:34:10 VCS INFO V-16-55000-10047 Resource(XXX_WL_NM) - (hot441v3:monitor) Proc:FilterProcs:Subroutine <Proc::FilterProcs> called with:
  Filter          [java.*ListenAddress=XXX.XXX.com.kw\b.*ListenPort=5556\b]
2013/04/16 07:34:10 VCS INFO V-16-55000-10048 Resource(XXX_WL_NM) - (hot441v3:monitor) Proc:FilterProcs:Matched command [/usr/jdk/instances/jdk1.6.0_34/jre/bin/sparcv9/java -classpath /usr/jdk/instances/jdk1.6.0_34/jre/lib/rt.jar:/usr/jdk/instances/jdk1.6.0_34/jre/lib/i18n.jar:/XXX/app/oracle/product/Middleware/patch_wls1036/profiles/default/sys_manifest_classpath/weblogic_patch.jar:/XXX/app/oracle/product/Middleware/patch_ocp371/profiles/default/sys_manifest_classpath/weblogic_patch.jar:/usr/jdk/instances/jdk1.6.0_34/lib/tools.jar:/XXX/app/oracle/product/Middleware/wlserver_10.3/server/lib/weblogic_sp.jar:/XXX/app/oracle/product/Middleware/wlserver_10.3/server/lib/weblogic.jar:/XXX/app/oracle/product/Middleware/modules/features/weblogic.server.modules_10.3.6.0.jar:/XXX/app/oracle/product/Middleware/wlserver_10.3/server/lib/webservices.jar:/XXX/app/oracle/product/Middleware/modules/org.apache.ant_1.7.1/lib/ant-all.jar:/XXX/app/oracle/product/Middleware/modules/net.sf.antcontrib_1.1.0.0_1-0b2/lib/ant-contrib.jar:/XXX/app/oracle/product/Middleware/oracle_common/modules/oracle.jrf_11.1.1/jrf-wlstman.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/lib/adf-share-mbeans-wlst.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/lib/adfscripting.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/lib/mdswlst.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/resources/auditwlst.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/resources/igfwlsthelp.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/resources/jps-wlst.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/resources/jrf-wlst.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/resources/oamAuthnProvider.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/resources/oamap_help.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/resources/ossoiap.jar:/XXX/app/oracle/product/Middlew
are/oracle_common/common/wlst/resources/ossoiap_help.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/resources/ovdwlsthelp.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/resources/sslconfigwlst.jar:/XXX/app/oracle/product/Middleware/oracle_common/common/wlst/resources/wsm-wlst.jar:/XXX/app/oracle/product/Middleware/utils/config/10.3/config-launch.jar:/XXX/app/oracle/product/Middleware/wlserver_10.3/common/derby/lib/derbynet.jar:/XXX/app/oracle/product/Middleware/wlserver_10.3/common/derby/lib/derbyclient.jar:/XXX/app/oracle/product/Middleware/wlserver_10.3/common/derby/lib/derbytools.jar -DListenAddress=XXX.XXX.com.kw -DNodeManagerHome=/XXX/app/oracle/product/Middleware/wlserver_10.3/common/nodemanager -DQuitEnabled=true -DListenPort=5556 weblogic.NodeManager -v]
2013/04/16 07:34:10 VCS INFO V-16-55000-10049 Resource(XXX_WL_NM) - (hot441v3:monitor) Proc:FilterProcs:Got [1] matches
2013/04/16 07:34:10 VCS INFO V-16-55008-20139 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::IsWLSProcessRunning>:Matches for processes with filter [java.*ListenAddress=XXX.XXX.com.kw\b.*ListenPort=5556\b] got [1] matches
2013/04/16 07:34:10 VCS INFO V-16-55008-20141 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::IsWLSProcessRunning>:Pushing pid [6150] for this WebLogic Server.
2013/04/16 07:34:10 VCS INFO V-16-55008-20142 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::IsWLSProcessRunning>:Found one or more match in the Process table for this WebLogic Server.
2013/04/16 07:34:10 VCS WARNING V-16-55008-20146 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::CheckWLSProcessID>:Pidfile [/tmp/.VRTSWebLogic/XXX_WL_NM.pid] is stale.
2013/04/16 07:34:10 VCS NOTICE V-16-55008-20148 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::CheckWLSProcessID>: Updated PidFile [/tmp/.VRTSWebLogic/XXX_WL_NM.pid] with [6150].
2013/04/16 07:34:10 VCS INFO V-16-55008-20127 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::GetWLSState>:Server process [6150] found running. PidFile was stale.
2013/04/16 07:34:10 VCS INFO V-16-55008-20132 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::GetWLSState>:Server IP [xx.xx.xx.xxx] is listening on [5556]
2013/04/16 07:34:10 VCS INFO V-16-55008-20096 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::GetResourceState>:Found WLS processes running for NodeManager
2013/04/16 07:34:10 VCS INFO V-16-55008-20111 Resource(XXX_WL_NM) - (hot441v3:monitor) <WebLogic::GetResourceState>:Returning [110]
2013/04/16 07:34:10 VCS INFO V-16-55000-10272 Resource(XXX_WL_NM) - (hot441v3:monitor) VCSagentFW:entryPointExit:Exiting entry point [monitor] with exit code [110]
 

SVGA's picture

Problem solved.

 

The issue was with /usr/ucb on Solaris 11: it is listed under End of Features (EOF) Planned for Future Releases of Oracle Solaris.

That leaves /usr/bin/ps, which does not accept the -ww option. With that ps command, you invoke the BSD behavior by dropping the leading hyphen (-) from the options, as in "ps ww".

To solve the problem for now, I found that Oracle still ships a package called compatibility/ucb, which installs /usr/ucb and ucblib/. That solves the issue for the moment, until Oracle removes it or Symantec provides a modified script for the WebLogic agent that uses /usr/bin/ps rather than /usr/ucb/ps.
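
A quick way to tell whether a node is affected is to run the same kind of call the trace shows the agent making (/usr/ucb/ps -ww with a PID) and look at the exit code. The sketch below does that; the function name ps_accepts_ww is hypothetical, and it probes with the shell's own PID rather than the Node Manager's.

```shell
#!/bin/sh
# Hypothetical check (not part of the agent): succeed if the given ps
# binary accepts "-ww <pid>", the invocation seen failing in the trace.
#   $1 = path to ps; defaults to /usr/ucb/ps as used by the agent
ps_accepts_ww() {
    ps_path=${1:-/usr/ucb/ps}
    # Return 2 when the binary is absent, to separate "missing" from
    # "present but rejects -ww" (which passes through ps's exit code).
    [ -x "$ps_path" ] || { echo "not executable: $ps_path" >&2; return 2; }
    # Probe with this shell's PID; only the exit status matters here.
    "$ps_path" -ww "$$" >/dev/null 2>&1
}
```

If ps_accepts_ww fails on a Solaris 11 node, installing the package named above (for example with pkg install compatibility/ucb, assuming a configured publisher carries it) and running the check again should show whether the monitor call will now succeed.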

 

Thanks, Satish K. Pagare, for the TRACE hint. Without it I would never have found this.

 

 

SOLUTION
Satish K. Pagare's picture

 

The agent uses /usr/ucb/ps for its monitoring function. On Solaris 11, the agent requires the pkg:/compatibility/ucb package to be installed on the system. I verified it with:
# pkg list compatibility/ucb
NAME (PUBLISHER)                                  VERSION                    IFO
compatibility/ucb                                 0.5.11-0.175.1.0.0.24.2    i--
 
Could you also confirm the exact version of the OS? (uname -a and the contents of /etc/release)