Disk staging design considerations
Article: TECH33710 | Created: 2005-01-09 | Updated: 2005-01-09 | Article URL: http://www.symantec.com/docs/TECH33710
This document supplements the information found in the VERITAS NetBackup (tm) 5.0 System Administrator's Guide, Volume I - pages 40 - 46.
The following conventions will be used:
- Client - The source client providing data
- DSSU - A disk staging storage unit
- FDSU - The final destination storage unit (typically tape)
- Stage I - The backup from the client to the disk staging storage unit
- Stage II - The relocation of data from the disk staging storage unit to the final destination storage unit
There are several factors that can affect the performance of the initial stage I backups and the stage II relocations.
The first consideration is the location of the DSSU for the stage I backups:
It is vitally important that DSSUs reside on a dedicated volume. By design, DSSUs will consume all available free space on a volume.
For example: if a DSSU is placed on the same volume as the NetBackup catalog, no free space remains for NetBackup catalog activity once the DSSU reaches 100% capacity, and this can result in a corrupt catalog.
Another example is placing a DSSU on the same volume as a SQL database - as the DSSU consumes all the available free space, your database instances will fail.
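The placement rule above can be checked with a short script. The following is a minimal sketch only, not a NetBackup tool: it assumes that two paths are on the same volume when the operating system reports the same device identifier for both, and the function names are hypothetical. Some mount point or network share configurations may need a different check.

```python
import os
import tempfile


def on_same_volume(path_a, path_b):
    """Return True if both paths report the same device identifier.

    Assumption: an equal st_dev value means the paths share a volume.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def paths_sharing_volume(dssu_path, other_paths):
    """Return the entries of other_paths that share a volume with the DSSU."""
    return [p for p in other_paths if on_same_volume(dssu_path, p)]


if __name__ == "__main__":
    # Demonstration with temporary directories standing in for a DSSU
    # directory and a catalog directory (both hypothetical).
    with tempfile.TemporaryDirectory() as root:
        dssu_dir = os.path.join(root, "dssu")
        catalog_dir = os.path.join(root, "catalog")
        os.mkdir(dssu_dir)
        os.mkdir(catalog_dir)
        conflicts = paths_sharing_volume(dssu_dir, [catalog_dir])
        print("Paths sharing the DSSU volume:", conflicts)
```

In practice the DSSU path would be compared against the catalog directory and any database data directories on the media server. An empty result suggests the DSSU volume is dedicated.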
If the DSSU is dedicated to one client, it is optimal to install the DSSU on a local volume of that client in a media server configuration (See figure 1).
For whichever media server holds the DSSU, the logon account for the NetBackup Client Service must have privileges to delete files from the DSSU volume. A local administrator account is recommended rather than the local system account. It is also important that the DSSU data area is excluded from the client backup for this server. The DSSU volume must also be excluded from any VERITAS Snapshot Provider (VSP) or Volume Shadow Copy Service (VSS) open file backup snapshots, as the snapshot of DSSU fragments will fail.
If VERITAS Volume Manager (tm) is used to manage the DSSU volume, it is important that the volume is not allowed to grow dynamically. Due to the nature of the DSSU processes, the volume would continue to grow until all available disk space is used.
Additional DSSU media servers
If the DSSU is servicing several clients over the network, the DSSU media server should be strategically placed where it can serve the most clients with the least network impact. A DSSU servicing multiple clients should have sufficient network bandwidth available to meet your backup window requirements. In many scenarios, adding another media server and another DSSU can drastically improve backup performance, network utilisation, and stage II relocation performance (See figure 2).
Use of multiple DSSUs on the same volume
When a DSSU becomes full, bpdm must clear down space 'on the fly' to allow the backup to continue. The default behavior is to clear down the two oldest images that have successfully staged to the FDSU. If you have a mixture of small backups and large backups on the same DSSU, the cleanup process can add undesired overhead.
For example: consider one target DSSU containing a mixture of full and incremental datasets, where the incremental backups are 50MB and the full backups are 500MB (See figure 3).
If a new full backup (500MB) arrives for this DSSU, the cleanup process would have to run at least five times to clear out the ten oldest incremental backups before there is enough disk space for a new 500MB backup.
If a second DSSU is configured for large backups and the first DSSU is used for the smaller backups, the cleanup cycle will only have to run once for a single backup instance to continue (see figure 4).
In this example, if a new incremental backup arrives for DSSU1 when the volume is full, the two oldest incremental backups are removed, freeing up 100MB for the new 50MB backup. If a new full backup arrived for DSSU2 when the volume is full, the two oldest full backups would be removed, freeing up 1000MB for a new 500MB backup, simultaneously freeing up disk space for the incremental backups on DSSU1 as they reside on the same volume.
This basic example shows how we can reduce five cleanup cycles to one for a simple DSSU configuration. If you scale up the size of a single DSSU and the number of mixed backups, this overhead can drastically increase. For example, a 2 terabyte (TB) DSSU holding over 30,000 images could require hundreds of cleanup cycles just to start one new full backup.
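The cycle counts in this example can be reproduced with a small simulation. This is an illustrative sketch, not the bpdm implementation: it assumes the volume is completely full, that every image listed has already staged to the FDSU, and that each cleanup cycle removes exactly the two oldest images, as described above. The function name and the image lists are hypothetical.

```python
from collections import deque


def cleanup_cycles(staged_sizes_mb, needed_mb, images_per_cycle=2):
    """Count the cleanup cycles needed to free needed_mb on a full DSSU.

    staged_sizes_mb lists the sizes of already-staged images, oldest first.
    Each cycle removes up to images_per_cycle of the oldest images.
    """
    queue = deque(staged_sizes_mb)
    freed_mb = 0
    cycles = 0
    while freed_mb < needed_mb:
        if not queue:
            raise ValueError("not enough staged images to free the space")
        for _ in range(images_per_cycle):
            if queue:
                freed_mb += queue.popleft()
        cycles += 1
    return cycles


# Single DSSU: the ten oldest images are 50MB incrementals.
mixed_dssu = [50] * 10 + [500] * 2
print(cleanup_cycles(mixed_dssu, needed_mb=500))   # 5

# Separate DSSUs: fulls only on DSSU2, incrementals only on DSSU1.
print(cleanup_cycles([500, 500, 500], needed_mb=500))  # 1
print(cleanup_cycles([50] * 10, needed_mb=50))         # 1
```

The count depends only on the sizes of the oldest images relative to the incoming backup, which is why grouping backups of a similar size keeps it at one.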
Using the multiple DSSU method, the overhead can be reduced to one cleanup cycle per volume per backup instance. The ideal scenario is to have all backups of a similar size being presented to one particular DSSU. The type of backup is not relevant. You could mix backups from e-mail servers, file servers, and database backups on one DSSU if they are all of a similar size.
Caveat: When using multiple DSSUs on the same volume, the overall size of the volume must be sufficient to hold at least two backup images in each DSSU simultaneously and headroom for stage II relocation failures.
The required headroom is dependent on the resilience of the stage II relocation.
For example: if the stage II relocation is to tape and the tape storage unit does not have sufficient media to fulfill the relocations, stage II will fail, and the DSSU volume may not be able to free enough space for new backups. The headroom figure is entirely dependent on the availability of an administrator to correct stage II failures. As an example, if the administrator is not available over the weekend to monitor the stage II relocation of a full backup run, there must be sufficient headroom to allow these backups to complete. This is in addition to any existing backups that have not yet relocated.
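The caveat above can be expressed as a simple sizing calculation. The sketch below is one reading of the rule, not an official formula: it assumes "two backup images in each DSSU" means twice the largest image expected on each DSSU, and it treats headroom as a figure supplied by the administrator. The function name and the headroom value are hypothetical.

```python
def minimum_volume_mb(largest_image_mb_per_dssu, headroom_mb):
    """Estimate the minimum size of a volume shared by several DSSUs.

    largest_image_mb_per_dssu maps a DSSU name to the largest image size
    expected on it. Space for two such images is reserved per DSSU, and
    headroom_mb is added for stage II relocation failures.
    """
    if headroom_mb < 0:
        raise ValueError("headroom_mb must not be negative")
    two_images_each = sum(2 * size for size in largest_image_mb_per_dssu.values())
    return two_images_each + headroom_mb


# DSSU1 holds 50MB incrementals and DSSU2 holds 500MB fulls. The headroom
# of 1500MB is a made-up figure standing in for backups that might fail
# to relocate while no administrator is available.
sizes = {"DSSU1": 50, "DSSU2": 500}
print(minimum_volume_mb(sizes, headroom_mb=1500))  # 2600
```

The result is a lower bound. Images that have not yet relocated also occupy the volume, so the headroom figure should account for them.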
Final destination storage unit location
When designing the disk staging strategy, the physical connectivity to the FDSU should be considered. If the FDSU resides at an off-host location, there may not be sufficient time to perform stage II relocations prior to the next stage I cycle.
For example: a DSSU could reside on one media server as a locally attached volume. The FDSU could reside on a remote media server communicating with the DSSU media server via the network. In this example there is an obvious bottleneck over the network.
It is preferable to locate the FDSU local to the DSSU. The NetBackup Shared Storage Option can be used with a storage area network (SAN) attached tape library to share tape devices with several DSSU media servers. A remote FDSU may be required for various reasons; in this case it is important to have sufficient stage II data movement capacity to complete at least one cycle of stage II relocations prior to the next stage I cycle (See figure 5).
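Whether one cycle of stage II relocations can finish before the next stage I cycle can be estimated with simple arithmetic. The sketch below assumes a constant sustained throughput between the DSSU and the FDSU and ignores tape mount and positioning time; the function names and all figures are hypothetical.

```python
def relocation_hours(data_gb, throughput_mb_per_s):
    """Estimate the hours needed to relocate data_gb at a sustained rate."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput_mb_per_s must be positive")
    seconds = (data_gb * 1024) / throughput_mb_per_s
    return seconds / 3600


def fits_before_next_stage1(data_gb, throughput_mb_per_s, hours_available):
    """Return True if the relocation estimate fits in the time available."""
    return relocation_hours(data_gb, throughput_mb_per_s) <= hours_available


# 500GB to relocate with 12 hours before the next stage I cycle.
print(fits_before_next_stage1(500, 30, hours_available=12))  # True
print(fits_before_next_stage1(500, 10, hours_available=12))  # False
```

The throughput should be measured for the actual path to the FDSU. A remote FDSU reached over the network will usually sustain less than a locally attached one.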
Timing of stage II
When looking at the amount of data traffic the DSSU must handle, the capability of the local disk controller and network controller should be considered. If several stage I client backups run at the same time as stage II relocation processes, the disk controller can be saturated with requests. For this reason, on a busy DSSU, it is recommended to have the stage II relocation schedule window open only when the stage I backup window is closed.
High performance RAID 5 disk controllers and disk arrays should be considered in preference to single disk volumes for a busy DSSU.
Another common mistake is to treat the stage II schedule in the same way as a regular backup schedule. One of the benefits of disk staging is that the relocation process can run several times throughout the day without affecting production servers.
Not every image may be selected on a run of stage II relocations; stage I backups may still be in progress, or the stage I backups may have mixed retention periods and require several different media for stage II. For these reasons it is beneficial to have the stage II schedule cycle several times throughout the day.
For example: The relocation window could be open from 6:00 to 20:00 with a frequency of every two hours. Another consideration is that the stage II processes must communicate with the master server to manage the image headers. As the master server is typically busy during the normal backup window, it is beneficial to only perform stage II when normal backups are not running.
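The relocation window in this example can be expanded into a list of start times. This is a sketch for planning purposes only and does not reflect how the NetBackup scheduler works internally: it assumes a run starts when the window opens and every two hours after that, for as long as the start time falls before the window closes. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def relocation_start_times(window_open, window_close, every_hours):
    """List HH:MM start times from window_open up to, not including, window_close."""
    if every_hours <= 0:
        raise ValueError("every_hours must be positive")
    fmt = "%H:%M"
    current = datetime.strptime(window_open, fmt)
    end = datetime.strptime(window_close, fmt)
    step = timedelta(hours=every_hours)
    times = []
    while current < end:
        times.append(current.strftime(fmt))
        current += step
    return times


print(relocation_start_times("6:00", "20:00", every_hours=2))
# ['06:00', '08:00', '10:00', '12:00', '14:00', '16:00', '18:00']
```

Comparing this list with the stage I backup window shows which runs would overlap with client backups and master server activity.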