DOCUMENTATION: Best practices when configuring a large number of files or wildcards in policy file lists

Article:TECH72478  |  Created: 2009-01-17  |  Updated: 2013-05-12  |  Article URL http://www.symantec.com/docs/TECH72478
Article Type
Technical Solution

Product(s)

Issue



DOCUMENTATION: Best practices when configuring a large number of files or wildcards in policy file lists

Manual:
 
 Veritas NetBackup(tm) 6.5 Backup Planning and Performance Tuning Guide
 Veritas NetBackup(tm) Administrator's Guide, Volume 1 for Unix version 6.0
 Veritas NetBackup(tm) Administrator's Guide, Volume 1 for Unix version 6.5
 Symantec NetBackup(tm) Backup Planning and Performance Tuning Guide Release 7.0 through 7.1
 Symantec NetBackup(tm) Administrator's Guide, Volume 1 for Unix version 7.0
 Symantec NetBackup(tm) Administrator's Guide, Volume 1 for Unix version 7.1
 Symantec NetBackup(tm) Administrator's Guide, Volume 1 for Unix version 7.5
 
Modification Type:  Addition

Modification:    
NetBackup administrators should use caution when configuring policies with either extremely large file lists or wildcards in the file list, as this may cause thousands of backup jobs to be submitted to the NetBackup scheduler (nbpem).

 

Error



Failure to configure policies properly may result in the following behavior when policies are scheduled:
 

  • Extreme delays in backup jobs becoming queued, and in moving from queued to active.  There are several reasons for this; the two most prevalent are:
      • The nbpem process must query the database, via nbproxy, to get the last backup time for each stream.  On a busy system, this can be a time-consuming process.
      • Backup IDs are assigned using the system time (ctime); because each backup ID must be unique, only one backup ID can be assigned per second.  For example, 3,600 streams would need at least an hour of wall-clock time just to receive backup IDs.
  • The NetBackup Job Manager (nbjm) process memory utilization may increase significantly as jobs enter the nbjm cache and wait to be processed.
  • Possible core dumps if the number of jobs submitted is large enough to exhaust all available memory.
  • Excessive tape mounts and dismounts.
  • Large delays in resource allocation for all backup jobs.
  • Backups that cannot start or run to completion due to resource consumption.

Other problems that may be seen as the number of NetBackup and EMM catalog entries grows over time are:
 
  • Catalog backups will require more space on backup media and the backups will take longer.
  • Housekeeping activities such as media and image cleanup will take longer.
  • Queries for restores or reports will take substantially longer to return data.

 

Environment



The NetBackup 7.0 through 7.1 Backup Planning and Performance Tuning Guide should be reviewed for capacity planning and configuration guidance.  Sections of interest for policy configuration and capacity planning are:
 

  • Recommendations for sizing the catalog, starting on page 30.
  • It is not recommended to exceed 2,000,000 catalog entries, page 31.
  • Do not plan to run more than about 20,000 jobs per 12-hour period on a single master server, page 40.
  • Factors that limit job scheduling, starting on page 48.
  • Performance guidelines for NetBackup policies, page 63.

Solution



Additional best practices when using wildcards or extremely large file lists:
 
  • If possible, do not select "Allow Multiple Data Streams" in the policy attributes.
  • Consider using NEW_STREAM directives to limit the number of streams generated.
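
As a minimal sketch of the NEW_STREAM approach (the directory names below are placeholders only), each NEW_STREAM directive in the Backup Selections list starts a new stream, so the following list produces two streams rather than one stream per entry:
 
NEW_STREAM
/hobo/data1
/hobo/data2
NEW_STREAM
/hobo/data3
/hobo/data4
 
Grouping paths explicitly in this way keeps the stream count under the administrator's control, instead of letting each path or wildcard match become a separate stream.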

 
Review the following summary for an example of the load nbpem will put on nbproxy during processing of policies configured with Allow Multiple Data Streams and a wildcard in a path specified in the Backup Selection List.
 
The files and directories become individual entries in the catalog STREAMS files and each must be checked for the last backup information.
 
This requires nbpem to submit a getlastbackup request to nbproxy for each entry. Note the effect on the number of getlastbackup calls and the delay in jobs becoming queued.
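
As a rough way to gauge this load, the getlastbackup requests can be tallied from the nbproxy debug log once it has been saved to a plain-text file (the file name nbproxy.log below is illustrative; see the log excerpt later in this article for the line format):
 
grep -c "getLastBackup_v2" nbproxy.log          # count log lines that reference getLastBackup_v2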
 

Test Configuration:
Identical data in each path for all policies.
 
  • 60 policies with Allow Multiple Data Streams, no wildcard, and a unique path in each policy
  • 60 policies with Allow Multiple Data Streams, a wildcard, and a unique path in each policy
 

 
No wildcard in the path:
 
bppllist slp_test_51 | egrep -i "^include|^sched "
 
INCLUDE /hobo/data                                                # Each policy actually has its own directory of data; this path is replaced with a unique path for each policy
 
SCHED slp_full_odd 0 1 3600 0 0 0 0 *NULL* 0 0 0 0 0 0 -1 0
 
SCHED slp_incr_1h_odd 1 8 3600 0 0 0 0 *NULL* 0 0 0 0 0 0 -1 0
 
SCHED slp_incr_d_odd 1 8 64800 0 0 0 0 *NULL* 0 0 0 0 0 0 -1 0
 

 
Wildcard in the path:
 
bppllist slp_test_61 | egrep -i "^include|^sched "
 
INCLUDE /hobo/data/*                                            # Each policy actually has its own directory of data; this path is replaced with a unique path for each policy
 
SCHED slp_full_odd 0 1 3600 0 0 0 0 *NULL* 0 0 0 0 0 0 -1 0
 
SCHED slp_incr_1h_odd 1 8 3600 0 0 0 0 *NULL* 0 0 0 0 0 0 -1 0
 
SCHED slp_incr_h2_odd 1 8 64800 0 0 0 0 *NULL* 0 0 0 0 0 0 -1 0
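
Because the wildcard expands to one stream per matching file or directory, the number of streams a selection such as /hobo/data/* will generate can be previewed on the client with a simple count (illustrative only; the path is taken from the example above):
 
ls -d /hobo/data/* | wc -l                      # each match becomes a separate backup stream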
 

 
Summary:                                                                        getlastbackup calls
 
full, no wildcard:          all jobs queued within a few seconds                       122
incremental, no wildcard:   all jobs queued within a few seconds                       366
full, with wildcard:        jobs continued queuing for 5 minutes 3 seconds            2122
incremental, with wildcard: jobs continued queuing for 61 minutes                    26535
 

 
Not using a wildcard in the path allowed nbpem <==> nbproxy (getlastbackup) calls and subsequent processing to complete in seconds.
 
For both the full and incremental schedules, all 60 jobs queued within seconds, and the jobs went active as resources became available.
 

 
Wildcard testing:
 
  • 60 policies with Allow Multiple Data Streams, a wildcard, and a unique path in each policy

 
Execute the full backup schedule for each policy:
 
  • Once the full backup jobs completed, the catalog contained 5644 entries (image headers and .f files) for just this single client; a rough way to count such entries is sketched after this list.
  • It took approximately 2 minutes 43 seconds to process the 2221 getlastbackup calls.
  • The 2221 jobs (60 policies, 72 streams per policy) took 5 minutes 3 seconds to reach a queued state.
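
A rough way to count the image catalog entries for a single client is to count the files under that client's image directory; the path below is the default UNIX location and the client name is a placeholder:
 
find /usr/openv/netbackup/db/images/<client_name> -type f | wc -l     # files (image headers and .f files) under this client's image directory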

 
Now consider the incremental schedules; this is where the load becomes significant.
 
The STREAMS file is 216 lines long: 72 paths for each of the three schedules.
 
An incremental backup results in a getlastbackup call for each schedule and each path in the STREAMS file. Refer below to the excerpt from the nbproxy log.
 

 
Each schedule references each path that is found. In each line, the first field is the stream number, the second field is the last backup time, the third field is the policy name, the fourth field is the schedule name, and the last field is the path.
 
STREAMS file:
 
grep "^72 " ../../streams/STREAMSslp_test_179                 # note the nbproxy entries referencing this policy and stream number
 
72 1269999230 slp_test_179 slp_full_odd 0,0,3600 0 0 /hobo/data118/unload.log
 
72 1270052345 slp_test_179 slp_incr_1h_odd 1,0,3600 0 0 /hobo/data118/unload.log
 
72 0 slp_test_179 slp_incr_h2_odd 1,0,64800 0 0 /hobo/data118/unload.log
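
The field layout described above can be extracted from the STREAMS file with a short awk one-liner (shown only as an illustration, reusing the relative path from the grep example above):
 
grep "^72 " ../../streams/STREAMSslp_test_179 | awk '{print $1, $2, $4, $NF}'     # stream number, last backup time, schedule name, path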
 

 
26535 getlastbackup calls; it took 61 minutes to queue all the jobs, as each combination of schedule and stream is checked with a getlastbackup call.
 

 
nbproxy log:
 

 
07:36:46.352 [8186] <2> dblibFacadeImpl::getLastBackup_v2(): call get_last_backup clientName sprst5120b4-09.spr.spt.symantec.com policyName slp_test_122 scheduleName slp_full_odd scheduleType 0 streamNumber -1 excludePlaceHolderImages no(dblibFacadeImpl.cpp:5201)
 
...
 
Note the different schedule names and the same stream number in the next three lines.
 
08:35:18.316 [8186] <2> dblibFacadeImpl::getLastBackup_v2(): call get_last_backup clientName sprst5120b4-09.spr.spt.symantec.com policyName slp_test_179 scheduleName slp_full_odd scheduleType 0 streamNumber 72 excludePlaceHolderImages no(dblibFacadeImpl.cpp:5201)
 
08:35:18.343 [8186] <2> dblibFacadeImpl::getLastBackup_v2(): call get_last_backup clientName sprst5120b4-09.spr.spt.symantec.com policyName slp_test_179 scheduleName slp_incr_1h_odd scheduleType 1 streamNumber 72 excludePlaceHolderImages no(dblibFacadeImpl.cpp:5201)
 
08:35:18.386 [8186] <2> dblibFacadeImpl::getLastBackup_v2(): call get_last_backup clientName sprst5120b4-09.spr.spt.symantec.com policyName slp_test_179 scheduleName slp_incr_h2_odd scheduleType 1 streamNumber 72 excludePlaceHolderImages no(dblibFacadeImpl.cpp:5201)
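
Counting lines like those above by schedule name gives a quick breakdown of where the getlastbackup load comes from (again assuming the log has been saved to a plain-text file named nbproxy.log, an illustrative name):
 
grep "getLastBackup_v2" nbproxy.log | awk '{for (i = 1; i < NF; i++) if ($i == "scheduleName") print $(i+1)}' | sort | uniq -c     # lines per schedule name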
 

 

 




Legacy ID



327935

