Overview of NBU performance testing.

Article:TECH147296  |  Created: 2010-12-31  |  Updated: 2011-09-23  |  Article URL http://www.symantec.com/docs/TECH147296
Article Type
Technical Solution


Problem



  • Backup is slow.
  • Backup takes too long.
  • Backups do not run because the scheduled backup window closes.

Error



NBU Status codes: 23, 24, 25, 196 and others.

Windows: 10054, 10053


Environment



UNIX/Linux

Microsoft OS.


Cause



If only one host is slow, the cause is usually outside NetBackup: a slow or failing disk, old network drivers, networking infrastructure problems, an incorrectly configured or overloaded SAN, or an overloaded NAS.

If backups of all hosts in one or more policies are slow, some NBU tuning parameters can be adjusted to improve performance.


Solution



Cases regarding performance issues are non-trivial. A Technical Support Engineer (TSE) can perform a remote support session; however, it will probably not yield the results hoped for:

There is no "go fast" check box.

Without basic information about your environment (that is, nbsu output), TSEs have no idea about the client, master server, or media server: how they are configured and what, if any, existing parameters are being used for caching/buffering. Furthermore, if there is an issue with NetBackup (NBU), the case cannot be advanced to the next support level without an nbsu.
 

Schematic of a Typical NetBackup Environment
This is a typical NBU environment; however, in your environment the master and media servers may be combined.

There are multiple locations where a performance bottleneck could occur. The first step is to establish the speed at which the client can read data off the disk(s) (zone 3). If the data resides on NAS2 (fiber drive, NFS, or UNC) (zone 4), the read speed must be tested across that network as well. A bpbkar-to-null test is used to measure this speed. See one of the Backup Planning and Performance Tuning Guides, which describe how to run this and other tests.

(Links to these documents are available at the bottom of this document.)
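
For a quick sanity check before running the bpbkar-to-null test, the following sketch measures raw read throughput by reading every file under a directory and discarding the data. It is not the bpbkar test and does not use any NetBackup component; the function name measure_read_throughput and its parameters are hypothetical, and the result is only a rough indication of how fast the client can read its own data.

```python
import os
import time


def measure_read_throughput(root, chunk_size=1024 * 1024):
    """Read every regular file under root, discard the data,
    and return (total_bytes, elapsed_seconds, megabytes_per_second)."""
    total_bytes = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        # Data is counted and thrown away, never written anywhere.
                        total_bytes += len(chunk)
            except OSError:
                # Skip files that cannot be read (permissions, removed mid-walk).
                continue
    elapsed = time.perf_counter() - start
    mb_per_sec = (total_bytes / (1024 * 1024)) / elapsed if elapsed > 0 else 0.0
    return total_bytes, elapsed, mb_per_sec
```

Call it with the path that the policy backs up, for example measure_read_throughput("/data"). A second run over the same data may report an inflated figure because the operating system caches recently read files, so the first run on a large data set is the more representative one.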

If the 'null' tests on the client(s) are slow, the bottleneck is the NAS network or the local disk/file system. In that case nothing in NBU can be changed to make the backup faster; you will need to examine the disks, NAS, SAN, or NFS drives to find the performance bottleneck.

Local data can be slow when many thousands of files are in a single folder on the disk. This is a function of the file system and cannot be overcome by using the normal client software. If you have this issue then the solution is to use FlashBackup, which performs a block level backup of the entire disk. (FlashBackup cannot be used with NFS or UNC mounted data.)

Next, zone 2 can be tested by using a 'null disk storage unit'. Suspend backups and run a test policy backup of the client to a disk storage unit that uses the 'null write' touch file. Because a 'null write' DSU is, for practical purposes, an infinitely fast drive, this tests the networking in zone 2 between the client and the media server. If the write speed of the 'null DSU' backup is 80% (or more) of the speed of the 'null' tests on the client, the problem lies in zone 1 (DSU/TSU). If it is not, the problem is in zone 2.
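
The 80% rule above can be written down as a small helper. This is a minimal sketch of the decision only; the function name locate_bottleneck and its parameters are hypothetical, and the two speeds must be measured in the same unit (for example MB/s) by the tests already described.

```python
def locate_bottleneck(client_null_speed, null_dsu_speed, threshold=0.80):
    """Apply the 80% rule from the article.

    client_null_speed: speed of the 'null' read test on the client.
    null_dsu_speed:    speed of the backup to the 'null write' DSU.
    Returns "zone 1" (DSU/TSU) or "zone 2" (client-to-media-server network).
    """
    if client_null_speed <= 0:
        raise ValueError("client_null_speed must be greater than zero")
    ratio = null_dsu_speed / client_null_speed
    # At or above the threshold the network keeps up with the client,
    # so the remaining suspect is the storage unit side (zone 1).
    if ratio >= threshold:
        return "zone 1"
    return "zone 2"


# Example with made-up figures: client reads at 100 MB/s,
# the null DSU backup runs at 50 MB/s, so the network is the suspect.
print(locate_bottleneck(100.0, 50.0))
```

The helper assumes the client read test has already been judged acceptable; if the 'null' test on the client is itself slow, the problem is in zone 3 or zone 4 and this comparison does not apply.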

If the issue is found to be in zone 2, there are many areas to check: NIC drivers on the client and media server; TCP configuration on the client and media servers (TCP buffer sizes, DNS, segment size, windowing, etc.); network configuration (ports, speed, firewalls, MSS, etc.); and then NBU configuration (SIZE_DATA_BUFFERS, NUMBER_DATA_BUFFERS). Adjusting the NBU buffers can improve speed; however, it cannot make up for slow reading of data off disks, poor network performance, or poor performance of storage units.
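
Because larger or more numerous buffers consume shared memory on the media server, it can help to estimate the cost before changing SIZE_DATA_BUFFERS or NUMBER_DATA_BUFFERS. The sketch below uses the rule of thumb (buffer size x number of buffers x drives x multiplexing level); that formula is an assumption taken from general tuning guidance, not from this article, and the function and parameter names are hypothetical. Check the figures against the tuning guide for your NBU version.

```python
def estimate_shared_memory_bytes(size_data_buffers, number_data_buffers,
                                 drives, multiplexing=1):
    """Estimate media server shared memory needed for data buffers.

    Assumed rule of thumb (not stated in this article):
        size * number * drives * multiplexing
    All arguments must be positive integers; size is in bytes.
    """
    values = (size_data_buffers, number_data_buffers, drives, multiplexing)
    if any(v <= 0 for v in values):
        raise ValueError("all arguments must be positive")
    return size_data_buffers * number_data_buffers * drives * multiplexing


# Example with illustrative values: 256 KB buffers, 16 buffers,
# 2 drives, multiplexing level 4.
needed = estimate_shared_memory_bytes(262144, 16, 2, 4)
print(needed // (1024 * 1024), "MB")
```

If the estimate approaches the shared memory available on the media server, smaller values are the safer choice.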

In some instances, there are problems in zone 1. These are usually due to poor or old drivers for the tape drives, or to robotic library issues. Tape drive issues include firmware issues, bad tape drives, bad tapes, and problems with the storage network on the media server side.

The least favorable location for data is the drive labeled NAS. If the client mounts data (via NFS or UNC) and is then backed up by the media server, the data must travel over the network twice to be backed up. If the NAS is on the same NIC as the backup network, you will experience 40% or less of the speed the network is capable of, because the data stream travels on the network twice.

NBSU information can provide TSEs with a wealth of useful information very quickly:

  • Operating system (OS) version and build number
  • OS patches or hot fixes
  • OS settings for some TCP parameters
  • NIC driver version
  • NBU version
  • Detect mixed NBU versions on the hosts
  • Snapshot settings for Windows hosts
  • Policy scheduling information (are two jobs running on the same host at the same time?)

A Microsoft Product Support Report (MPSR) may also be useful for Windows-based environments.

In addition, specific NetBackup logs may contain useful information. The bpbkar and bptm/bpdm log files contain messages from NetBackup about buffers (waited for full buffer/waited on empty buffer). The admin logs can tell us whether system or NBU errors were slowing down the backup, and other logs cover resource issues. If the counts in the "waited for full buffer" and "waited on empty buffer" messages are high, SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS can be adjusted to improve performance for all clients that are backed up by that media server.
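
Totalling the buffer wait messages by hand is tedious on a busy media server, so a short script can do the counting. The sketch below assumes the messages take the form "waited for full buffer N times, delayed M times" (and the same for empty buffers); that wording, the sample lines, and the function name summarize_buffer_waits are assumptions, so adjust the pattern to match what your bptm/bpdm logs actually contain.

```python
import re

# Assumed message shape; accepts "waited for" and "waited on" wording.
BUFFER_WAIT = re.compile(
    r"wait(?:ed|ing) (?:for|on) (full|empty) buffer (\d+) times, "
    r"delayed (\d+) times"
)


def summarize_buffer_waits(lines):
    """Sum the waited/delayed counters per buffer type over log lines."""
    totals = {
        "full": {"waited": 0, "delayed": 0},
        "empty": {"waited": 0, "delayed": 0},
    }
    for line in lines:
        match = BUFFER_WAIT.search(line)
        if match:
            kind, waited, delayed = match.groups()
            totals[kind]["waited"] += int(waited)
            totals[kind]["delayed"] += int(delayed)
    return totals


# Hypothetical sample lines for illustration only.
sample = [
    "write_data: waited for full buffer 120 times, delayed 340 times",
    "fill_buffer: waited for empty buffer 3 times, delayed 5 times",
    "some unrelated log line",
]
print(summarize_buffer_waits(sample))
```

To run it against a real log, pass an open file object, for example summarize_buffer_waits(open(path_to_log)), where path_to_log points at the log file for the day of the slow backup.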


Schematic of example NBU environment.




