
Netbackup master server crashing frequently. VCS ERROR V-16-1-13067 Thread(4153408416) Agent is calling clean for resource(nbu_server) because the resource became OFFLINE unexpectedly, on its own.

Created: 11 Feb 2013 | 6 comments
Tanmoy's picture

The NetBackup master server is crashing frequently. The errors from the messages log are below. I need help finding the root cause and fixing it, please.

What are SQLAnywhere and SA_cache.dmp? Please help :(
 
Messages log:
Feb  8 18:26:52 NB-Master1 SQLAnywhere(NB-Master): Finished checkpoint of "NBDB" (NBDB.db) at Fri Feb 08 2013 18:26
Feb  8 18:28:00 NB-Master1 SQLAnywhere(NB-Master): Starting checkpoint of "NBDB" (NBDB.db) at Fri Feb 08 2013 18:28
Feb  8 18:28:00 NB-Master1 SQLAnywhere(NB-Master): Finished checkpoint of "NBDB" (NBDB.db) at Fri Feb 08 2013 18:28
Feb  8 18:28:51 NB-Master1 SQLAnywhere(NB-Master): Starting checkpoint of "NBDB" (NBDB.db) at Fri Feb 08 2013 18:28
Feb  8 18:28:52 NB-Master1 SQLAnywhere(NB-Master): Finished checkpoint of "NBDB" (NBDB.db) at Fri Feb 08 2013 18:28
Feb  8 18:29:18 NB-Master1 SQLAnywhere(NB-Master): Cache size adjusted to 510844K 
Feb  8 18:29:21 NB-Master1 SQLAnywhere(NB-Master): Starting dumping cache info to 'SA_cache.dmp.20638' ... 
Feb  8 18:29:23 NB-Master1 SQLAnywhere(NB-Master): Completed dumping cache info
Feb  8 18:29:23 NB-Master1 SQLAnywhere(NB-Master): Starting dumping cache info to 'SA_cache.dmp.20638' ... 
Feb  8 18:29:24 NB-Master1 SQLAnywhere(NB-Master): Completed dumping cache info
Feb  8 18:29:24 NB-Master1 SQLAnywhere(NB-Master): Starting dumping cache info to 'SA_cache.dmp.20638' ... 
Feb  8 18:29:26 NB-Master1 SQLAnywhere(NB-Master): Completed dumping cache info
Feb  8 18:29:26 NB-Master1 SQLAnywhere(NB-Master): Starting dumping cache info to 'SA_cache.dmp.20638' ... 
Feb  8 18:29:28 NB-Master1 SQLAnywhere(NB-Master): Completed dumping cache info
Feb  8 18:29:28 NB-Master1 SQLAnywhere(NB-Master): Starting dumping cache info to 'SA_cache.dmp.20638' ... 
Feb  8 18:29:29 NB-Master1 SQLAnywhere(NB-Master): Completed dumping cache info
Feb  8 18:29:29 NB-Master1 SQLAnywhere(NB-Master): Starting dumping cache info to 'SA_cache.dmp.20638' ... 
Feb  8 18:29:31 NB-Master1 SQLAnywhere(NB-Master): Completed dumping cache info
Feb  8 18:29:31 NB-Master1 SQLAnywhere(NB-Master): Starting dumping cache info to 'SA_cache.dmp.20638' ... 
Feb  8 18:29:33 NB-Master1 SQLAnywhere(NB-Master): Completed dumping cache info
Feb  8 18:29:33 NB-Master1 SQLAnywhere(NB-Master): Starting dumping cache info to 'SA_cache.dmp.20638' ... 
Feb  8 18:29:34 NB-Master1 SQLAnywhere(NB-Master): Completed dumping cache info
Feb  8 18:29:34 NB-Master1 SQLAnywhere(NB-Master): Starting dumping cache info to 'SA_cache.dmp.20638' ... 
Feb  8 18:29:34 NB-Master1 SQLAnywhere(NB-Master): Fatal error:  no free pages available in cache
Feb  8 18:30:01 NB-Master1 AgentFramework[26075]: VCS ERROR V-16-1-13067 Thread(4153408416) Agent is calling clean for resource(nbu_server) because the resource became OFFLINE unexpectedly, on its own. 
Feb  8 18:30:01 NB-Master1 Had[25478]: VCS ERROR V-16-1-13067 (NB-Master1) Agent is calling clean for resource(nbu_server) because the resource became OFFLINE unexpectedly, on its own. 
Feb  8 18:32:28 NB-Master1 tldd[23302]: Daemon has terminated due to signal (15)
Feb  8 18:32:28 NB-Master1 ltid[21234]: LTID terminating because it received a signal (15)
Feb  8 18:32:28 NB-Master1 vmd[21240]: volume daemon terminating because it received a signal (15)
Feb  8 18:32:28 NB-Master1 vmd[21240]: terminating - daemon terminated (7)
Feb  8 18:32:28 NB-Master1 avrd[23575]: Daemon has terminated due to signal (15)
Feb  8 18:32:28 NB-Master1 tldcd[23582]: Daemon has terminated due to signal (15)
Feb  8 18:35:02 NB-Master1 AgentFramework[26075]: VCS ERROR V-16-1-13006 Thread(4155911072) Resource(nbu_server): clean procedure did not complete within the expected time. 
Feb  8 18:35:02 NB-Master1 Had[25478]: VCS ERROR V-16-1-13006 (NB-Master1) Resource(nbu_server): clean procedure did not complete within the expected time. 
Feb  8 18:37:30 NB-Master1 ltid[21234]: Sending shutdown to tldcd daemon...
Feb  8 19:12:59 NB-Master1 Had[25478]: VCS ERROR V-16-1-13079 (NB-Master1) Resource(nbu_server): The last 10 invocations of the clean procedure have failed. 
Feb  8 19:49:21 NB-Master1 Had[25478]: VCS ERROR V-16-1-13079 (NB-Master1) Resource(nbu_server): The last 20 invocations of the clean procedure have failed. 
Feb  8 20:25:43 NB-Master1 Had[25478]: VCS ERROR V-16-1-13079 (NB-Master1) Resource(nbu_server): The last 30 invocations of the clean procedure have failed. 
Feb  8 21:02:04 NB-Master1 Had[25478]: VCS ERROR V-16-1-13079 (NB-Master1) Resource(nbu_server): The last 40 invocations of the clean procedure have failed. 
Feb  8 21:38:25 NB-Master1 Had[25478]: VCS ERROR V-16-1-13079 (NB-Master1) Resource(nbu_server): The last 50 invocations of the clean procedure have failed. 
Feb  8 22:08:01 NB-Master1 snmpd[24267]: Received SNMP packet(s) from 172.30.216.60 
Feb  8 22:08:40 NB-Master1 sshd(pam_unix)[2094]: check pass; user unknown
Feb  8 22:08:40 NB-Master1 sshd(pam_unix)[2094]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=gts-ins-mgr01.backup.local 
Feb  8 22:08:43 NB-Master1 sshd(pam_unix)[2370]: check pass; user unknown
Feb  8 22:08:43 NB-Master1 sshd(pam_unix)[2370]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=gts-ins-mgr01.backup.local 
Feb  8 22:08:45 NB-Master1 sshd(pam_unix)[2372]: check pass; user unknown
Feb  8 22:08:45 NB-Master1 sshd(pam_unix)[2372]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=gts-ins-mgr01.backup.local 
Feb  8 22:08:48 NB-Master1 sshd(pam_unix)[2374]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=gts-ins-mgr01.backup.local  user=root
Feb  8 22:08:50 NB-Master1 sshd(pam_unix)[2376]: check pass; user unknown
Feb  8 22:08:50 NB-Master1 sshd(pam_unix)[2376]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=gts-ins-mgr01.backup.local 
Feb  8 22:08:53 NB-Master1 sshd(pam_unix)[2577]: check pass; user unknown
Feb  8 22:08:53 NB-Master1 sshd(pam_unix)[2577]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=gts-ins-mgr01.backup.local 
Feb  8 22:08:55 NB-Master1 sshd(pam_unix)[2579]: check pass; user unknown
Feb  8 22:08:55 NB-Master1 sshd(pam_unix)[2579]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=gts-ins-mgr01.backup.local 
Feb  8 22:08:57 NB-Master1 sshd(pam_unix)[2665]: check pass; user unknown
Feb  8 22:08:57 NB-Master1 sshd(pam_unix)[2665]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=gts-ins-mgr01.backup.local 
Feb  8 22:09:00 NB-Master1 sshd(pam_unix)[2788]: check pass; user unknown
Feb  8 22:09:00 NB-Master1 sshd(pam_unix)[2788]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=gts-ins-mgr01.backup.local 
Feb  8 22:09:02 NB-Master1 sshd(pam_unix)[2810]: check pass; user unknown
Feb  8 22:09:02 NB-Master1 sshd(pam_unix)[2810]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=gts-ins-mgr01.backup.local 
Feb  8 22:14:47 NB-Master1 Had[25478]: VCS ERROR V-16-1-13079 (NB-Master1) Resource(nbu_server): The last 60 invocations of the clean procedure have failed. 
Feb  8 22:51:09 NB-Master1 Had[25478]: VCS ERROR V-16-1-13079 (NB-Master1) Resource(nbu_server): The last 70 invocations of the clean procedure have failed. 
Feb  8 22:52:10 NB-Master1 kernel: ide-cd: cmd 0x3 timed out
Feb  8 22:52:10 NB-Master1 kernel: hda: irq timeout: status=0xd0 { Busy }
Feb  8 22:52:10 NB-Master1 kernel: hda: irq timeout: error=0x00
Feb  8 22:52:10 NB-Master1 kernel: hda: ATAPI reset complete
Feb  8 23:27:32 NB-Master1 Had[25478]: VCS ERROR V-16-1-13079 (NB-Master1) Resource(nbu_server): The last 80 invocations of the clean procedure have failed. 
Feb  9 00:03:54 NB-Master1 Had[25478]: VCS ERROR V-16-1-13079 (NB-Master1) Resource(nbu_server): The last 90 invocations of the clean procedure have failed. 
Feb  9 00:33:00 NB-Master1 login(pam_unix)[30840]: authentication failure; logname= uid=0 euid=0 tty= ruser= rhost=  user=user1
Feb  9 00:33:00 NB-Master1 bpjava-msvc[30840]: pam_krb5[30840]: authentication succeeds for 'user1' (user1@UK.abc.LOCAL)
Feb  9 00:36:50 NB-Master1 login(pam_unix)[4288]: authentication failure; logname= uid=0 euid=0 tty= ruser= rhost=  user=user1
Feb  9 00:36:50 NB-Master1 bpjava-msvc[4288]: pam_krb5[4288]: authentication succeeds for 'user1' (user1@UK.EXPERIAN.LOCAL)
Feb  9 00:40:17 NB-Master1 Had[25478]: VCS ERROR V-16-1-13079 (NB-Master1) Resource(nbu_server): The last 100 invocations of the clean procedure have failed. 
 
 

6 Comments

RamNagalla's picture

Feb  8 18:29:34 NB-Master1 SQLAnywhere(NB-Master): Fatal error:  no free pages available in cache

You need to increase the memory available to the database cache.

You can check this from dbadm, under option 2) Database Space and Memory Management.

http://www.symantec.com/business/support/index?pag...
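
To see how the cache message relates to the rest of the log, a small script can pull out just the relevant events in order. The following is a minimal sketch, not a Symantec tool: the function name `summarize` and the `SAMPLE` list are made up for illustration, and the three sample lines are copied from the log in the original post. It shows that the cache had grown to 510844K (about 499 MiB) shortly before the fatal error, and that the first VCS error follows the database failure rather than preceding it. If the `-ch` (maximum cache) value in your server.conf is around 500M, which I believe is a common default but is an assumption you should verify on your own system, that would be consistent with the cache hitting its ceiling.

```python
import re

# Patterns for the three kinds of syslog lines of interest in the posted log.
CACHE_RE = re.compile(r"Cache size adjusted to (\d+)K")
FATAL_RE = re.compile(r"SQLAnywhere\(.*?\): Fatal error:\s+(.*)")
VCS_RE = re.compile(r"VCS ERROR (V-\d+-\d+-\d+)")

# Three lines copied from the messages log in the original post.
SAMPLE = [
    "Feb  8 18:29:18 NB-Master1 SQLAnywhere(NB-Master): Cache size adjusted to 510844K ",
    "Feb  8 18:29:34 NB-Master1 SQLAnywhere(NB-Master): Fatal error:  no free pages available in cache",
    "Feb  8 18:30:01 NB-Master1 AgentFramework[26075]: VCS ERROR V-16-1-13067 Thread(4153408416) "
    "Agent is calling clean for resource(nbu_server) because the resource became OFFLINE unexpectedly, on its own. ",
]


def summarize(lines):
    """Return (timestamp, kind, detail) tuples in the order they appear."""
    events = []
    for line in lines:
        stamp = line[:15]  # classic syslog prefix, e.g. "Feb  8 18:29:18"
        m = CACHE_RE.search(line)
        if m:
            mib = int(m.group(1)) / 1024  # log reports kilobytes
            events.append((stamp, "cache", f"{mib:.1f} MiB"))
            continue
        m = FATAL_RE.search(line)
        if m:
            events.append((stamp, "fatal", m.group(1).strip()))
            continue
        m = VCS_RE.search(line)
        if m:
            events.append((stamp, "vcs", m.group(1)))
    return events


if __name__ == "__main__":
    for stamp, kind, detail in summarize(SAMPLE):
        print(stamp, kind, detail)
```

To run it against the real log, replace `SAMPLE` with the lines of your messages file, for example `open("/var/log/messages").read().splitlines()` (the path is the usual Linux location, adjust as needed).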

Marianne's picture

First of all - tell us more about your clustered master server:

NBU version and patch level
OS on master server nodes
SF/HA version on master server nodes

Your issue seems identical to this post:

https://www-secure.symantec.com/connect/forums/unexpected-nebackup-restart-cluster-environment 

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

Tanmoy's picture

Thanks Nagalla... I have already checked the link you gave, thanks again. Is there any way to determine how much memory will be sufficient for me, please?

Thanks Marianne... please find the details below. Do you think this is an issue with VCS rather than NBU? Please let me know if you need any further logs. Thanks again.

NBU version: 7.1.0.2

OS: Red Hat Linux 5.1

SF/HA : VCS 

FYI: This environment is a very large one, with more than 3,400 servers (Win/Linux mixed) and more than 50 to 60 TB of differential backups every night.
RamNagalla's picture

FYI: This environment is a very large one, with more than 3,400 servers (Win/Linux mixed) and more than 50 to 60 TB of differential backups every night.

I would suggest reducing the load on the master server by migrating clients to another master server.

Marianne's picture

In addition to the excellent TN in Nagalla's post, this white paper contains recommendations for a master server in a busy environment:
 https://www-secure.symantec.com/connect/articles/whitepaper-netbackup-architecture-overview

Another doc (about nbrb tuning): http://www.symantec.com/docs/TECH137761
