Overview of Multiplexing and Multiple Data Streams in NetBackup
Article: DOC2799 | Created: 2010-08-25 | Updated: 2010-09-20 | Article URL: http://www.symantec.com/docs/DOC2799
Media Multiplexing and Multiple Data Streams are two options available in NetBackup for tuning backup performance.
The multiplexing option allows writing of multiple data streams to a single tape drive simultaneously, while the Multiple Data Streams option allows the policy file selection to be split into two or more data streams. The client may send the data streams to the media server either simultaneously or sequentially, depending upon multiplexing and maximum jobs per policy (or per client) settings.
Media multiplexing attribute
The Media multiplexing attribute in the policy Schedule attributes screen specifies the maximum number of jobs from the schedule that NetBackup can multiplex onto any one drive. Multiplexing allows concurrent backup jobs from one or more clients to be sent to a single drive and multiplexes the backups onto the media.
Specify a number from 1 through 32, where 1 specifies no multiplexing. Any changes take effect the next time a schedule runs.
Note: Some policy types and some schedule types do not support media multiplexing. The option cannot be selected in those instances. Also, multiplexing applies only to tape-based backups, not to backups that are written to disk.
To configure multiplexed backups, multiplexing must be indicated in both the storage unit (Maximum Streams Per Drive setting) and the schedule (Media Multiplexing setting) configuration. Regardless of the Media multiplexing setting, the maximum number of jobs that NetBackup starts never exceeds the Maximum Streams Per Drive value for the storage unit.
NetBackup multiplexing sends concurrent backups from one or several clients to a single storage device. NetBackup interleaves the data streams as they are written to the media. Multiplexed and un-multiplexed backups can reside on the same volume. Separate volume pools or media IDs are not necessary.
Multiplexing is generally used to reduce the amount of time that is required to complete backups. The performance in the following situations is improved by using multiplexing:
Slow clients
Instances in which NetBackup uses software compression, which normally reduces client performance, are also improved.
Multiple slow networks
The parallel data streams take advantage of whatever network capacity is available.
Many short backups (for example, incremental backups)
In addition to providing parallel data streams, multiplexing reduces the time each job waits for a device to become available. Therefore, the storage device transfer rate is maximized.
No special action is required to restore a multiplexed backup. NetBackup finds the media and restores the requested backup. Restores are slower, however, because NetBackup must read the data from all of the streams that are multiplexed together.
To reduce the effect of multiplexing on restore times, set the storage unit maximum fragment size to a value smaller than the largest allowed value. Also, enable fast-tape positioning (locate block), if it applies to the tape drives in use (this setting is typically enabled by default).
When NetBackup multiplexes jobs, it continues to add jobs to a drive until the number of jobs on the drive matches either of the following:
· This schedule’s Media Multiplexing setting
If the limit is reached for a drive, NetBackup sends jobs to other drives.
· The storage unit’s Maximum streams per drive setting
NetBackup can add jobs from more than one schedule to a drive.
If multiple schedules are mixed on a given backup drive during multiplexing, each schedule's maximum setting limits the total number of streams that can be written to tape with that backup. (The lowest multiplex setting among the schedules being written together is used.)
NetBackup attempts to add multiplexed jobs to drives that already use multiplexing. If multiplexed jobs are confined to specific drives, other drives are available for non-multiplexed jobs.
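The limits described above can be sketched as a small check. This is a simplified model in Python, not NetBackup code: the function names and the idea of representing a drive as a list of the Media Multiplexing settings of its active jobs are assumptions made only for illustration.

```python
def drive_stream_limit(storage_unit_max_streams, schedule_mpx_settings):
    """Effective stream limit for one drive (hypothetical helper).

    The limit is the lowest of the storage unit's Maximum Streams Per
    Drive value and the Media Multiplexing settings of every schedule
    that is writing to the drive.
    """
    return min([storage_unit_max_streams, *schedule_mpx_settings])


def can_add_job(active_mpx, new_job_mpx, storage_unit_max_streams):
    """Return True if one more job may be multiplexed onto the drive.

    active_mpx  -- Media Multiplexing settings of jobs already on the drive
    new_job_mpx -- Media Multiplexing setting of the candidate job
    """
    limit = drive_stream_limit(
        storage_unit_max_streams, [*active_mpx, new_job_mpx]
    )
    return len(active_mpx) + 1 <= limit


# Seven streams from a schedule set to 8: an eighth stream still fits.
print(can_add_job([8] * 7, 8, storage_unit_max_streams=8))  # True
# Two streams from a schedule set to 8: a job from a schedule set to 2
# would make three streams, which exceeds that schedule's limit.
print(can_add_job([8, 8], 2, storage_unit_max_streams=8))   # False
```

When the check fails for one drive, the same test would be applied to the other drives in the storage unit.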
If the backup window closes before NetBackup can start all the jobs in a multiplexing set, NetBackup completes only the jobs that have started.
Consider the following configuration settings when using multiplexing:
Limit jobs per policy
Set Limit jobs per policy high enough to support the specified level of multiplexing.
Maximum jobs per client
The Maximum Jobs Per Client property limits the number of backup jobs that can run concurrently on any NetBackup client. Maximum Jobs Per Client appears on the Global properties dialog box.
Usually, the client setting does not affect multiplexing. However, consider a case where jobs from different schedules on the same client go to the same storage unit. In this case, the maximum number of jobs that are permitted on the client is reached before the multiplexing limit is reached for the storage unit. When the maximum number of jobs on the client is reached, it prevents NetBackup from fully using the storage unit’s multiplexing capabilities.
Choose a value that is based on the ability of the central processing unit to handle parallel jobs. Because extra buffers are required, memory is also important. If the server cannot perform other tasks or runs out of memory or processes, reduce the Maximum Streams Per Drive setting for the storage unit.
Consider the following items to estimate the potential load that multiplexing can place on the central processing unit:
- The maximum concurrent jobs that NetBackup can attempt equals the sum of the concurrent backup jobs that can run on all storage units.
- The maximum concurrent jobs that can run on a storage unit equals the value of Maximum Streams Per Drive, multiplied by the number of drives.
- The total amount of shared memory required will be (Number of Drives) X (Multiplex Value) X (Number of Data Buffers) X (Size of Data Buffers)
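The three estimates above are plain arithmetic, shown here as a minimal Python sketch. The function names are hypothetical, and the buffer count and buffer size in the example are illustrative values chosen for the calculation, not NetBackup defaults.

```python
def max_concurrent_jobs(storage_units):
    """Sum of (drives x Maximum Streams Per Drive) over all storage units.

    storage_units -- list of (number_of_drives, max_streams_per_drive)
    """
    return sum(drives * streams for drives, streams in storage_units)


def shared_memory_bytes(drives, multiplex_value, num_buffers, buffer_size):
    """(Number of Drives) x (Multiplex Value) x (Number of Data Buffers)
    x (Size of Data Buffers), in bytes."""
    return drives * multiplex_value * num_buffers * buffer_size


# One storage unit with 2 drives at 8 streams each, plus one storage
# unit with 1 drive at 4 streams.
print(max_concurrent_jobs([(2, 8), (1, 4)]))  # 20

# 2 drives, multiplex value 8, 16 buffers of 65536 bytes (assumed values).
print(shared_memory_bytes(2, 8, 16, 65536))   # 16777216 bytes (16 MiB)
```

Because every factor multiplies the others, doubling the multiplex value doubles the shared memory requirement for the same buffer settings.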
Maximum jobs this client
You can set the maximum number of jobs that are allowed on a specific client without affecting other clients.
Delay on multiplexed restores
The Delay On Multiplexed Restores property applies to multiplexed restores. The property specifies how long the server waits for additional restore requests of files and raw partitions in a set of multiplexed images on the same tape. Delay On Multiplexed Restores appears on the General Server properties dialog box.
Allow multiple data streams attribute
The Allow multiple data streams attribute specifies that NetBackup can divide automatic backups for each client into multiple jobs. The directives, scripts, or templates in the backup selection list specify whether each job can back up only a part of the backup selection list. Since the jobs are in separate data streams, they can occur concurrently.
The directives, scripts, or templates in the backup selection list determine the number of streams (backup jobs) that start for each client. The list also determines how the backup selection list is divided into separate streams.
The following settings determine the number of streams that can run concurrently:
· Number of available storage units
· Multiplexing settings
· Maximum jobs parameters
Multistreamed jobs consist of a parent job that performs stream discovery and a child job for each stream. In the Activity Monitor, the child jobs display the Job ID of the parent job. Parent jobs display a dash (-) in the Schedule column.
Note: If this attribute is enabled, and a file system is in a client’s exclude list, a NetBackup job appears in the Activity Monitor for the excluded file system. However, no files in the excluded file system are backed up by the job.
When to use multiple data streams
The following items describe the reasons to use multiple data streams:
· To reduce backup time
o Multiple data streams can reduce the backup time for large backups by splitting the backup into multiple streams. Use multiplexing, multiple drives, or a combination of the two to process the streams concurrently. Configure the backup so each device on the client is backed up by a separate data stream that runs concurrently with streams from other devices. For best performance, use only one data stream to back up each physical device on the client. Multiple concurrent streams from a single physical device can adversely affect backup times. The heads must move back and forth between the tracks that contain files for the respective streams.
· To reduce retry time for backup failures
o Because the backup streams run independently, the use of multiple data streams can shorten the retry time in the event of a backup failure. A single failure only terminates a single stream. NetBackup can restart the failed stream without restarting the others.
o For example, assume the backup for a 10-gigabyte partition is split into five streams, each containing 2 gigabytes. If the last stream fails after it writes 1.9 gigabytes (a total of 9.9 gigabytes is backed up), NetBackup retries only the last 2-gigabyte stream. If the 10-gigabyte partition is backed up without multiple data streams and a failure occurs, the entire 10-gigabyte backup must be retried.
o The Schedule backup attempts property in the Global Attributes properties applies to each stream. For example, if the Schedule backup attempts property is set to 3, NetBackup retries each stream a maximum of three times. The Activity Monitor displays each stream as a separate job. Use the job details view to determine the files that are backed up by each of these jobs.
· To reduce administration by running more backups with fewer policies
o Use multiple data streams in a configuration that contains large file servers with many file systems and volumes. Multiple data streams provide more backups with fewer policies than are otherwise required.
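The retry example above (a 10-gigabyte partition split into five streams) comes down to simple arithmetic. The following Python sketch assumes that the backup is divided into equally sized streams, which real backup selections rarely are; the function name is hypothetical.

```python
def data_to_retry_gb(total_gb, streams, failed_streams=1):
    """Amount of data that must be backed up again after a failure.

    Assumes equally sized streams. A failed stream is retried from the
    start, so the data it wrote before failing does not reduce the
    amount that is sent again.
    """
    per_stream_gb = total_gb / streams
    return per_stream_gb * failed_streams


print(data_to_retry_gb(10, 5))  # 2.0  (five streams, one fails)
print(data_to_retry_gb(10, 1))  # 10.0 (no multiple data streams)
```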
Considerations when backing up using Multiplexing and/or Multiple Data Streams
Job priority vs. differing multiplex levels:
If a set of jobs with a high level of multiplexing is started and is running to tapes, new schedules with lower levels of multiplexing will be forced to wait for all schedules with higher levels of multiplexing to complete, regardless of job priority.
For example, suppose there are 2 drives in the storage unit, which is set with a "Maximum Streams per drive" of 8.
Policy "Slowclients" with "Job Priority" of 10 starts using the "Full" schedule with 50 different clients and a Media Multiplexing setting of 8. 16 of the data streams will start writing to tape, with the remaining jobs being queued and awaiting resources.
Meanwhile, Policy "Fastclients" with "Job Priority" of 99 starts its "Full" schedule with 5 clients and Media Multiplexing set to 2.
In a non-multiplexed environment, as soon as a job finishes running, NetBackup starts the queued jobs in order of their Job Priority. Among queued jobs of equal priority, the job that has been queued the longest starts first. In a multiplexed environment, the first criterion is the Media Multiplexing setting, followed by Job Priority, and finally time in queue.
In this scenario, as soon as one of the "Slowclients" streams completes, NetBackup will find the highest priority job with a Media Multiplexing setting of 8 (or higher) and start that job. The "Fastclients" backup will need to wait until enough of the higher multiplexing jobs are completed to bring the total number of data streams (including the new "Fastclients" stream) down to 2 (or less). In practice, this means that all but 4 of the "Slowclients" jobs will be completed before "Fastclients" can be started, even though the priority on "Fastclients" is much higher.
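The selection order in this scenario can be modeled roughly as follows. This Python sketch is an interpretation of the behavior described above, not NetBackup's scheduler: it treats the Media Multiplexing setting as an eligibility filter for a single drive, then orders the eligible jobs by priority and time in queue. The QueuedJob class, its fields, and pick_next_job are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueuedJob:
    policy: str
    priority: int   # higher Job Priority is preferred
    mpx: int        # the schedule's Media Multiplexing setting
    queued_at: int  # lower value means the job was queued earlier


def pick_next_job(queue: List[QueuedJob], active_streams: int,
                  max_streams_per_drive: int) -> Optional[QueuedJob]:
    """Choose the queued job to add to a drive, or None if none fits."""
    streams_after_add = active_streams + 1
    if streams_after_add > max_streams_per_drive:
        return None
    # A job fits only if its schedule allows at least this many streams.
    eligible = [job for job in queue if job.mpx >= streams_after_add]
    if not eligible:
        return None
    # Highest priority first; among equals, the longest-queued job.
    return min(eligible, key=lambda job: (-job.priority, job.queued_at))


queue = [
    QueuedJob("Slowclients", priority=10, mpx=8, queued_at=1),
    QueuedJob("Fastclients", priority=99, mpx=2, queued_at=2),
]
# One of eight streams has just finished, so seven are still active.
print(pick_next_job(queue, 7, 8).policy)  # Slowclients
# Only one stream is left on the drive.
print(pick_next_job(queue, 1, 8).policy)  # Fastclients
```

With seven streams active, the "Fastclients" job is filtered out before priority is even considered, which is why its higher priority does not help it start sooner.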
When the "Multiple Copies" option is used, each stream of data being written to tape needs to be identical, so the lowest "Maximum streams per drive" of the Storage Units in question will be used.
As noted above, multiplexing only applies to tape-based Storage Units. As a result, if multiple copies are configured and one or more copies is written to disk, any tape copies will be limited to a single, non-multiplexed stream.
Multiplexing backups can increase the total throughput by consolidating streams from multiple clients to a single tape, allowing the tape drive to reach and maintain its full write speed. Restores of multiplexed data will experience slower performance than non-multiplexed restores. This is because all of the data that is written in a given multiplex set must be read in order to restore a single stream of data within the set.
For example, if there are 3 data streams being written at the same speed, for a given restore, there will be roughly three times the amount of data that must be read from the tape in order to restore the needed information. This will also result in values for the "Current kilobytes read" field in the job details showing much higher than may be expected. Again using the example of 3 data streams, each with 10GB of data, a full restore of any one of the streams actually requires reading 30GB of data from the tape.
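The read overhead in this example can be estimated with a short calculation. The Python sketch below assumes that all streams in the multiplex set start together and are interleaved at the same rate, and that the restore reads from the start of the set to the last block of the wanted stream; real tapes and restore behavior will differ. The function name is hypothetical.

```python
def restore_read_gb(stream_sizes_gb, wanted_index):
    """Estimate the data read from tape to restore one stream.

    With equal-rate interleaving, by the time the wanted stream has
    written all of its data, every other stream has written the same
    amount, or less if it finished earlier.
    """
    wanted_gb = stream_sizes_gb[wanted_index]
    return sum(min(size, wanted_gb) for size in stream_sizes_gb)


print(restore_read_gb([10, 10, 10], 0))  # 30
print(restore_read_gb([10, 5, 20], 0))   # 25
```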
Using both multiplexing and multiple data streams:
As noted above, it is possible to combine multiplexing and multiple data streams in the same job. Whether this is desired or not depends on the specific environment in question. An example of a situation that would benefit from multiplexing and multiple data streams for the same backup would be for clients with a fast network connection, but multiple relatively slow local drives.
Testing backup and restore performance:
As with any tuning parameters, test different combinations of settings to determine what performs best in your environment, both for maximizing backup speed and for minimizing the negative impact on restore speed that is inherent in multiplexed backups.