
V-16-1-17037, V-16-1-17013 -- Notifier resource faults at least once a month

Created: 02 Apr 2013 • Updated: 31 Jul 2013 | 10 comments
Zahid.Haseeb's picture
This issue has been solved. See solution.

Environment

OS = Solaris 10 (64bit)

HA/VCS cluster (without VxVM) = 6.0

Cluster Nodes = two

 

Problem

The Notifier resource faults and comes back online after a while. This happens at least once a month, but not at any specific time.

Notifier log

2013/03/08 18:08:45 VCS WARNING V-16-1-17037 Notifier:SMTP unknown error
2013/03/10 01:51:18 VCS WARNING V-16-1-17013 Notifier:IPM connection failed

Engine log
2013/03/04 16:14:49 VCS WARNING V-16-1-17049 Notifier:Message 64 is being deleted since message cannot be delivered for more than a hour


Daniel Matheus's picture

Hello Zahid,

 

These messages are usually seen under the following conditions:

 

1.    SMTP service is not running on SMTP server
2.    SMTP port is blocked (maybe due to firewall)
3.    SMTP service is running on port other than port 25

 

Can you provide the below details?

 

Is the SMTP server address resolved by a DNS server?

Is the SMTP server using port 25?

Can you telnet to this port? (e.g. telnet smtp.localserver 25)

Was there an ifconfig down or general network disruption on the NIC that is configured in the Notifier resource group?
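If telnet is not handy, the port check above can be roughly scripted. This is only a minimal sketch in Python: the function name `check_smtp` is made up for illustration, and `smtp.localserver` is the placeholder host name from the question above, not a real server. It only tests that a TCP connection opens and reads whatever greeting the server sends; it does not send any mail.

```python
import socket


def check_smtp(host, port=25, timeout=5.0):
    """Return (reachable, banner) for a plain TCP connect to host:port.

    reachable is False if the connection is refused, times out, is reset,
    or the host name cannot be resolved.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            try:
                # An SMTP server normally greets with a "220 ..." line.
                banner = sock.recv(512).decode("ascii", errors="replace").strip()
            except socket.timeout:
                banner = ""  # connected, but no greeting within the timeout
            return True, banner
    except OSError:
        return False, ""


# Example call (placeholder host name, replace with your SMTP server):
#   reachable, banner = check_smtp("smtp.localserver", 25)
#   print(reachable, banner)
```

Running this from both cluster nodes at intervals would show whether the SMTP port is intermittently unreachable, which a one-off telnet test cannot tell you.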

 

Thanks,
Dan

 

 

 

 

If this post has helped you, please vote or mark as solution

Zahid.Haseeb's picture

Thanks Daniel for your suggestion:

 

1.    SMTP service is not running on SMTP server
2.    SMTP port is blocked (maybe due to firewall)
3.    SMTP service is running on port other than port 25

Of course: the SMTP service is running, the SMTP port is not blocked, and the service is listening on port 25.

==============================================

Is the SMTP server address resolved by a DNS server?

I used the IP address of the SMTP server in the resource attribute, set through the Java console.

Is the SMTP server using port 25?

yes

Can you telnet to this port? (i.e. smtp.localserver 25)

Yes, I can telnet (ipaddress 25).

Was there an ifconfig down or general network disruption on the NIC that is configured in the Notifier resource group?

I could not find anything like that in /var/log/messages.

=============================================

main.cf

 

group ClusterService (
    SystemList = { XXXXXSEC = 0, XXXXXPRI = 1 }
    AutoStartList = { XXXXXPRI }
    Administrators = { XXXXX }
    Operators = { XXXXX, XXXXX, XXXXX }
    )

    NIC CE0 (
        Enabled = 0
        Device @XXXXXSEC = " ce0"
        )

    NotifierMngr NOTIFIER (
        SmtpServer = "172.XX.X.XX"
        SmtpServerVrfyOff = 1
        SmtpFromPath = "xxxxx@xxxxx.xxxxx.pk"
        SmtpRecipients = { "abc@xxxxx.xxxxx.pk.pk" = Warning }
        )

    NOTIFIER requires CE0

    // resource dependency tree
    //
    //    group ClusterService
    //    {
    //    NotifierMngr NOTIFIER
    //        {
    //        NIC CE0
    //        }
    //    }

Any comment will be appreciated. Mark as Solution if your query is resolved
__________________
Thanks in Advance
Zahid Haseeb

zahidhaseeb.wordpress.com

arangari's picture

Is the resource faulting, or is the message just not being sent? From the logs above, it is not obvious that the resource faulted.

 

Thanks and Warm Regards,

Amit Rangari

If this post helped you resolving the issue, please mark it as solution. _____________________________________________________________________________

Zahid.Haseeb's picture

Sorry arangari

I dug further and found something. The logs below show that the NOTIFIER resource went offline unexpectedly, the agent retried (the default is three retries), and the agent restarted the resource successfully.

ENGINE LOG

2013/03/14 13:22:42 VCS ERROR V-16-2-13067 (XXXXXSEC) Agent is calling clean for resource(NOTIFIER) because the resource became OFFLINE unexpectedly, on its own.
2013/03/14 13:22:42 VCS INFO V-16-2-13068 (XXXXXSEC) Resource(NOTIFIER) - clean completed successfully.
2013/03/14 13:22:42 VCS ERROR V-16-2-13073 (XXXXXSEC) Resource(NOTIFIER) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 3) the resource.
2013/03/14 13:22:42 VCS NOTICE V-16-2-13076 (XXXXXSEC) Agent has successfully restarted resource(NOTIFIER).
2013/03/14 13:22:42 VCS INFO V-16-1-55031 Resource NOTIFIER in online state received recurring online message on system XXXXXSEC
2013/03/14 14:38:42 VCS ERROR V-16-2-13067 (XXXXXSEC) Agent is calling clean for resource(NOTIFIER) because the resource became OFFLINE unexpectedly, on its own.
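Since the fault only shows up roughly once a month, it may help to pull every occurrence out of the engine log and look at the spacing. The following is a rough Python sketch, assuming the log lines follow the "date time VCS SEVERITY message-ID" layout seen in the excerpt above. The helper names `event_times` and `gaps` are invented for this example and are not part of VCS.

```python
import re
from datetime import datetime

# Matches the start of a line such as:
#   2013/03/14 13:22:42 VCS ERROR V-16-2-13067 (XXXXXSEC) Agent is calling clean ...
LINE_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) VCS (\w+) (V-\d+-\d+-\d+)\b"
)


def event_times(lines, message_id="V-16-2-13067"):
    """Return the timestamps of all log lines carrying the given message ID."""
    times = []
    for line in lines:
        match = LINE_RE.match(line)
        # Wrapped continuation lines do not match the pattern and are skipped.
        if match and match.group(3) == message_id:
            times.append(datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S"))
    return times


def gaps(times):
    """Return the time differences between consecutive events."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Example call (pass the path to your own engine log):
#   with open(engine_log_path) as fh:
#       faults = event_times(fh)
#   print(faults, gaps(faults))
```

A list of fault timestamps like this is easier to compare against cron jobs, backups, or other scheduled activity than the raw log.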
 


arangari's picture

Zahid,

I think we will need to dig into this further. Please raise an appropriate support case.

Thanks and Warm Regards,

Amit Rangari


Zahid.Haseeb's picture

Arangari, I have opened a case twice. On the second case I even enabled DEBUG logging for the Notifier manager, as instructed by the TSE, but nothing showed up in the logs. That case was eventually closed because a case cannot stay open for more than 10 days, and this problem occurs roughly every 30 days. A few days ago I opened the same case again and asked the engineer to keep it open.

My problems often get resolved faster on the forum, by people like you, than through support :) ... I am also sending you the case ID in case you want to see the case history.

 

Thanks

 


stib's picture

This issue usually occurs when NotifierMngr is unable to connect to the Notifier engine port (the had daemon manages this port).

The NotifierMngr resource then exits and creates a core dump in /opt/VRTSvcs/bin/NotifierMngr/ (Solaris 10; the path depends on your install location).

The issue can appear on systems where the had daemon has been running for a long time. Running 'hastop -local -force; hastart' usually resolves it, and you should then no longer receive the errors ;)
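To see whether core dumps were actually written around the time of a fault, one could list them with their modification times. A small Python sketch follows, assuming the core files land in the directory mentioned above and have names starting with "core". The real naming depends on the system's core file settings, and `find_core_files` is just an illustrative name.

```python
from datetime import datetime
from pathlib import Path


def find_core_files(directory="/opt/VRTSvcs/bin/NotifierMngr", pattern="core*"):
    """Return (file name, modification time) pairs, oldest first.

    The default directory is the one mentioned in this thread; the "core*"
    pattern is an assumption about how the core files are named.
    """
    base = Path(directory)
    if not base.is_dir():
        return []
    found = []
    for path in base.glob(pattern):
        if path.is_file():
            found.append((path.name, datetime.fromtimestamp(path.stat().st_mtime)))
    return sorted(found, key=lambda item: item[1])


# Example call:
#   for name, mtime in find_core_files():
#       print(mtime, name)
```

If the modification times line up with the V-16-2-13067 entries in the engine log, that would support the explanation above.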

S.

 

Zahid.Haseeb's picture

Thanks, all, for your kind words. A few days ago it was identified that a vulnerability scanner, which runs once a month, is doing something wrong.
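For anyone who wants to confirm a similar suspicion, here is a rough Python sketch of how fault times could be matched against scan start times. The scan schedule is not given in this thread, so the inputs, the two-hour window, and the function name `faults_near_scans` are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def faults_near_scans(fault_times, scan_starts, window=timedelta(hours=2)):
    """Map each scan start to the faults that fall within `window` after it.

    fault_times and scan_starts are lists of datetime objects. The window
    length is a guess and should be set to the real scan duration.
    """
    result = {}
    for scan in scan_starts:
        result[scan] = [t for t in fault_times if scan <= t <= scan + window]
    return result


# Example call (fill in your own fault timestamps and scan start times):
#   matches = faults_near_scans(fault_times, scan_starts)
#   for scan, faults in matches.items():
#       print(scan, len(faults))
```

If every fault falls inside a scan window and no scan passes without a fault, the scanner is a strong candidate for the trigger.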


SOLUTION
arangari's picture

Did you dig into this further? Do we know what kind of activity the scanner was performing?

Thanks and Warm Regards,

Amit Rangari


Zahid.Haseeb's picture

The person responsible is discussing this with the vendor. I will share the result.
