NetBackup 7.5.0.4 on RHEL 6.1 keeps crashing
Created: 08 Feb 2013 | Updated: 07 May 2013 | 8 comments
This issue has been solved. See solution.
Hi All,
We just migrated our NetBackup master server to version 7.5.0.4 running on RHEL 6.1. The configuration also consists of Veritas GCO Cluster and VVR. The server runs on VMware 5.0.
Now the system keeps crashing/rebooting: sometimes once a day, sometimes it is stable for 2 days and then crashes again, especially when lots of backups are running. I have also opened a case with RedHat, but does anyone have experience with this? Is there any parameter that needs to be changed?
Regards,
Iwan
Comments
Hi,
Please provide the following logs:
/usr/openv/db/log/server.log
/var/log/messages
Depending on the errors and symptoms you are seeing, it could be a space issue or a tuning issue.
1) Space - Make sure you have at least 10% free space on the filesystem where NetBackup is installed; 7% is the bare minimum. Once free space goes below this threshold, NetBackup will stop services to avoid data corruption.
2) Tuning - Review these tech notes:
http://www.symantec.com/docs/DOC4483 -
Symantec NetBackup™ 7.0 - 7.1 Backup Planning and Performance Tuning Guide
http://www.symantec.com/docs/TECH28934
3rd PARTY RELATED ISSUE: Linux kernel tuning recommendations for NetBackup (tm)
/etc/xinetd.conf
cps = 50 10
instances = 50
per_source = 10
These values have been adjusted in other cases and could limit the number of connections: for example, to 50 incoming connections per second (with the service paused for 10 seconds when that rate is exceeded), to 50 concurrent instances of each service, and to 10 instances per source IP address. Additional information on these parameters is available at: http://linux.die.net/man/5/xinetd.conf
You can increase these connection limits, or remove them, in consultation with your system admin.
Kernel parameters (/sbin/sysctl -a):
kernel.sem = 300 32000 32 1024
kernel.msgmnb = 65536
kernel.msgmni = 256
kernel.msgmax = 65536
kernel.shmmni = 4096
kernel.shmall = 4294967296
kernel.shmmax = 68719476736
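To see how a running system compares with the kernel values listed above, something along the following lines could be used. This is only a rough sketch: the function names `normalize`, `compare_value` and `check_all` are made up for illustration, and the expected values are simply copied from this post, not taken from any official recommendation for your memory size.

```shell
#!/bin/sh
# Collapse whitespace so tab-separated sysctl output compares equal
# to the space-separated values typed in the post.
normalize() {
    printf '%s\n' "$1" | awk '{ $1 = $1; print }'
}

# Print "<key>: ok" or a message showing both values.
# $1 = key, $2 = expected value, $3 = actual value
compare_value() {
    key=$1
    expected=$(normalize "$2")
    actual=$(normalize "$3")
    if [ "$expected" = "$actual" ]; then
        echo "$key: ok"
    else
        echo "$key: differs (expected '$expected', actual '$actual')"
    fi
}

# Compare each value from the post with what the running kernel reports.
check_all() {
    while IFS='=' read -r key expected; do
        key=$(normalize "$key")
        [ -n "$key" ] || continue
        actual=$(sysctl -n "$key" 2>/dev/null) || actual="unavailable"
        compare_value "$key" "$expected" "$actual"
    done <<'EOF'
kernel.sem = 300 32000 32 1024
kernel.msgmnb = 65536
kernel.msgmni = 256
kernel.msgmax = 65536
kernel.shmmni = 4096
kernel.shmall = 4294967296
kernel.shmmax = 68719476736
EOF
}
```

Calling `check_all` after loading these functions prints one line per parameter. A "differs" line is not necessarily a problem; it only shows where the system departs from the values quoted here.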
ulimit tuning (ulimit -a):
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1056767
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 8192
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
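For the free-space point (10% recommended, 7% bare minimum), a quick check could look roughly like this. It is a minimal sketch: `free_pct` and `space_status` are hypothetical helper names, and `/usr/openv` is assumed to be the install path because that is where the log path quoted earlier lives.

```shell
#!/bin/sh
# Percentage of free space on the filesystem that holds the given path.
# df -P forces one line per filesystem; column 5 is the used percentage.
free_pct() {
    used=$(df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    echo $((100 - used))
}

# Classify a free-space percentage against the thresholds from the post:
# below 7% is critical, below 10% is a warning, otherwise ok.
space_status() {
    if [ "$1" -lt 7 ]; then
        echo "critical"
    elif [ "$1" -lt 10 ]; then
        echo "warning"
    else
        echo "ok"
    fi
}

# Assumed install location; override with NB_DIR=/some/path if different.
NB_DIR=${NB_DIR:-/usr/openv}
if [ -d "$NB_DIR" ]; then
    pct=$(free_pct "$NB_DIR")
    echo "$NB_DIR: ${pct}% free ($(space_status "$pct"))"
fi
```

If the catalog or database sits on a separate filesystem, the same check would need to be run against that path as well.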
Have you already found the primary cause of the reboots? Why did your server crash?
Panic? HW watchdog? Do you have any statistics on system activity and memory usage?
NetBackup itself has no kernel module on RHEL, so I think it is hard for NetBackup to cause a panic, a stall or the like. If the reboot is caused by a panic, you should collect a kernel dump and send it to RedHat. They may find what is killing the system.
If you have enabled a HW watchdog, please try disabling it. A watchdog sometimes resets the system even while the system is running well.
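Before a kernel dump can be collected, kdump has to be set up, which on RHEL 6 requires memory reserved through a `crashkernel=` kernel boot parameter. A small sketch to check for that is shown below; `has_crashkernel` is a hypothetical helper name, and this only checks the boot parameter, not whether the kdump service itself is running.

```shell
#!/bin/sh
# Report "yes" if a kernel command line contains a crashkernel= parameter.
# $1 = kernel command line as a single string
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) echo "yes" ;;
        *) echo "no" ;;
    esac
}

# On a live Linux system, inspect the command line the kernel booted with.
if [ -r /proc/cmdline ]; then
    echo "crashkernel reserved: $(has_crashkernel "$(cat /proc/cmdline)")"
fi
```

If this prints "no", a panic will probably leave no vmcore behind, so it is worth fixing before the next crash.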
Authorized Symantec Consultant(ASC) Data Protection in Tokyo, Japan
Two similar posts:
https://www-secure.symantec.com/connect/forums/netbackup-master-server-crashing-frequently-vcs-error-v-16-1-13067-thread4153408416-agent-calv
https://www-secure.symantec.com/connect/forums/unexpected-nebackup-restart-cluster-environment
Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Hi All, thanks a lot for the replies. I will go through them one by one. I am sorry for the very late reply; it was because of the migration activity. The system is currently a bit more stable, but it could happen again soon. In the meantime I have also logged the issue with RedHat.
I will let you know the outcome. Thank you again.
Iwan
Please have a look at this TN that was published a couple of days ago:
http://www.symantec.com/docs/TECH202840
Key performance considerations for NetBackup 7.5 master servers
Hi All,
Just an update.
After we raised the issue with RedHat, they said something was wrong with VxVM/VxDMP, so we raised it with Symantec. Symantec suggested the following (this solution may be quite site-specific):
1) Yes, patch installation is not required.
What you need to do is add the lines below to the vxattachd script.
========================
Update /usr/lib/vxvm/bin/vxattachd with the lines below (lines 323-329 of the script in Symantec's instructions):

# === Add below check to ignore LVM disks ===

lvm=`$VXDISK list | grep $dmpnode | awk '{print $5}'`
if [ "$lvm" = "LVM" ]
then
    continue
fi

Then restart vxattachd by restarting vxvm-recover:

# /etc/init.d/vxvm-recover restart
========================
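To see which disks the added check would skip before editing the script, the same filtering idea can be tried on `vxdisk list` style output. The following is only an illustration: `status_of` is a hypothetical helper, the sample device lines are made up, and unlike the `grep` in Symantec's snippet it matches the device name exactly in the first column.

```shell
#!/bin/sh
# Print the 5th column (STATUS) for the given device name.
# $1 = device / DMP node name; vxdisk-list-style text is read from stdin.
status_of() {
    awk -v node="$1" '$1 == node { print $5 }'
}

# Made-up sample in the DEVICE TYPE DISK GROUP STATUS layout.
SAMPLE='DEVICE       TYPE            DISK         GROUP        STATUS
sda          auto:LVM        -            -            LVM
sdb          auto:cdsdisk    disk01       nbudg        online'

# Walk the sample devices and show what the vxattachd check would do.
for dev in sda sdb; do
    if [ "$(printf '%s\n' "$SAMPLE" | status_of "$dev")" = "LVM" ]; then
        echo "$dev: LVM disk, would be skipped"
    else
        echo "$dev: would be processed"
    fi
done
```

On a real system the sample text would be replaced by the output of `vxdisk list`.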
No crashes have happened so far (12 days of uptime; previously the system crashed every 5 days or less).
Thank you for sharing.
BTW, can you post any clue, such as a kernel backtrace, by which we can identify this issue?
Authorized Symantec Consultant(ASC) Data Protection in Tokyo, Japan