NetBackup 7.5.0.4 on RHEL 6.1 keeps crashing

Created: 08 Feb 2013 • Updated: 07 May 2013 | 8 comments
This issue has been solved. See solution.

Hi All, 

We just migrated our NetBackup master server to version 7.5.0.4 running on RHEL 6.1. The configuration also consists of a Veritas GCO cluster and VVR. The server runs on VMware 5.0.

Now the system keeps crashing/rebooting: sometimes once a day, sometimes it is stable for 2 days and then crashes again, especially when lots of backups are running. I have also opened a case with Red Hat, but does anyone have experience with this? Is there any parameter that needs to be changed?

 

Regards, 

 

Iwan 


Nagalla's picture

Hi,

Please provide the logs from:

/usr/openv/db/log/server.log

/var/log/messages

 

 

Dyneshia's picture

Depending on the errors and symptoms you are seeing, it could be a space issue or a tuning issue.

1) Space - Make sure you have at least 10% free space where NetBackup is installed; 7% is the bare minimum. Once free space goes below this threshold, NetBackup will stop services to avoid data corruption.

2) Tuning - Review these tech notes:

 

http://www.symantec.com/docs/DOC4483  -
Symantec NetBackup™ 7.0 - 7.1 Backup Planning and Performance Tuning Guide

http://www.symantec.com/docs/TECH28934

3rd PARTY RELATED ISSUE: Linux kernel tuning recommendations for NetBackup (tm)

 

/etc/xinetd.conf
       cps             = 50 10
       instances       = 50
       per_source      = 10

These values have been adjusted in other cases and could limit the number of connections: in this example, to 50 incoming connections per second (after which the service is disabled for 10 seconds), 50 simultaneous instances of each service, and 10 instances per source IP address. Additional information on these parameters is available at: http://linux.die.net/man/5/xinetd.conf
You can increase these limits or set them to unlimited, in consultation with your system administrator.

kernel parameters - /sbin/sysctl -a

kernel.sem = 300 32000 32 1024
kernel.msgmnb = 65536
kernel.msgmni = 256
kernel.msgmax = 65536
kernel.shmmni = 4096
kernel.shmall = 4294967296
kernel.shmmax = 68719476736

ulimit -a

ulimit tuning
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1056767
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 8192
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
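
To check the free-space thresholds from point 1 above, something like the following could be used. It is only a rough sketch: the function names are made up for illustration, it assumes POSIX-format `df -P` output, and the free percentage is derived from the rounded "Capacity" column, so it is approximate.

```shell
#!/bin/sh
# Hypothetical helpers (not part of NetBackup) for the 10% / 7% rule above.

# free_pct: read `df -P` output on stdin and print the approximate
# free percentage of the first filesystem listed (100 - Capacity).
free_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}

# classify PCT: map a free percentage to a status word.
# Below 7% is treated as critical, below 10% as a warning.
classify() {
    if [ "$1" -lt 7 ]; then
        echo CRITICAL
    elif [ "$1" -lt 10 ]; then
        echo WARNING
    else
        echo OK
    fi
}

# check_dir DIR: report the status of the filesystem holding DIR.
check_dir() {
    pct=$(df -P "$1" | free_pct)
    echo "$1: about ${pct}% free: $(classify "$pct")"
}
```

For example, `check_dir /usr/openv` would report on the filesystem that holds the NetBackup installation, assuming it is installed in the default location.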

Yasuhisa Ishikawa's picture

Have you already found the primary cause of the reboots? Why did your server crash?
Panic? HW watchdog? Do you have any statistics on system activity and memory usage?

NetBackup itself has no kernel module on RHEL, so I think it is hard for NetBackup to cause a panic, a stall, or the like. If the reboot is caused by a panic, you should collect a kernel dump and send it to Red Hat. They may find the system killer.

If you have enabled a HW watchdog, please try disabling it. A watchdog sometimes resets the system even while the system is running well.
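
Before the next crash, it may be worth confirming that the system is actually able to capture a kernel dump. The sketch below is an assumption-laden illustration, not a Red Hat procedure: the function names are hypothetical, and it only checks two standard Linux indicators, the `crashkernel=` boot parameter in /proc/cmdline and the /sys/kernel/kexec_crash_loaded flag.

```shell
#!/bin/sh
# Hypothetical helpers to check whether a crash (kdump) kernel is set up.

# has_crashkernel CMDLINE: succeed if the kernel command line reserves
# memory for a crash kernel via a crashkernel= parameter.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# crash_kernel_loaded [FILE]: succeed if the kexec flag file contains 1.
# FILE defaults to the standard sysfs location.
crash_kernel_loaded() {
    file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$file" ] && [ "$(cat "$file")" = "1" ]
}

# report: print a short summary for the running system.
report() {
    if has_crashkernel "$(cat /proc/cmdline)"; then
        echo "crashkernel= is present on the kernel command line"
    else
        echo "crashkernel= is missing; no memory is reserved for kdump"
    fi
    if crash_kernel_loaded; then
        echo "a crash kernel is loaded"
    else
        echo "no crash kernel is loaded"
    fi
}
```

Calling `report` on the master server prints both results. If either check fails, a panic would most likely leave no dump to send to Red Hat.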

Authorized Symantec Consultant(ASC) Data Protection in Tokyo, Japan

Iwan Tamimi's picture

Hi all, thanks a lot for the replies. I will look into them one by one. I am sorry for the very late reply; it was because of the migration activity. Currently the system is a bit more stable, but the crash could happen again soon. At the same time I have also logged a case with Red Hat.

I will let you know the outcome. Thank you again.

 

iwan

Marianne's picture

Please have a look at this TN that was published a couple of days ago:

http://www.symantec.com/docs/TECH202840

Key performance considerations for NetBackup 7.5 master servers

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

Iwan Tamimi's picture

Hi All,

 

Just want to give an update.

After we raised this with Red Hat, they said something was wrong with VxVM/VxDMP, so we raised it with Symantec. Symantec suggested the following (this solution may be really site-specific):

 

1) Yes, patch installation is not required.

What you need to do is add the lines below to the vxattachd script.

========================

Update /usr/lib/vxvm/bin/vxattachd with the lines below (lines 323-329 in Symantec's instructions):

    # === Add below check to ignore LVM disks ===

    lvm=`$VXDISK list | grep $dmpnode | awk '{print $5}'`
    if [ "$lvm" = "LVM" ]
    then
        continue
    fi

Then restart vxattachd by restarting vxvm-recover:

    # /etc/init.d/vxvm-recover restart

========================
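
To see what the added check does without touching vxattachd, the same test can be tried on saved `vxdisk list` output. This is a sketch, not Symantec's code: `is_lvm_disk` is a hypothetical name, and it assumes the usual five-column layout (DEVICE, TYPE, DISK, GROUP, STATUS) in which LVM disks show LVM in the fifth column. Unlike the `grep` in the fix, it matches the device name exactly, so `sda` does not also match `sdaa`.

```shell
#!/bin/sh
# is_lvm_disk NODE: read `vxdisk list`-style text on stdin and succeed
# only if the row whose first column equals NODE has LVM in column 5.
is_lvm_disk() {
    awk -v node="$1" '
        $1 == node && $5 == "LVM" { found = 1 }
        END { exit !found }
    '
}

# Example use (device name is illustrative):
#   vxdisk list | is_lvm_disk sda && echo "sda is an LVM disk"
```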

 

No crash has happened so far (12 days of uptime; previously the system crashed every 5 days or less).

SOLUTION
Yasuhisa Ishikawa's picture

Thank you for sharing. 

BTW, can you post any clue, such as a kernel backtrace, by which we can identify this issue?
