Don’t put the job before the media set.
If you ever find yourself having to delete files off disk to get a job to run, you need to read this article.
When installing Backup Exec, many first-time users go straight for the backup job. Their first goal is to create one big job that backs everything up. This works for about 1-3 weeks, depending on disk size, and then starts to fail due to lack of disk space.

Backup Exec is a sophisticated suite of technologies that allows you to reach into almost every aspect of your environment. It is a highly configurable application that requires some initial setup if you want performance in return. The first thing that should be configured is a media set. No! Make that two media sets.

The problem is, new users don’t have a concept of what a media set is. Media sets are unique to Backup Exec. A media set helps you keep as much data as you can stuff onto your tapes or disk while, at the same time, preventing you from overwriting your data too soon. It is your way of telling Backup Exec how long you want to protect your data.
So what should I call my media set?
Once again, the answer is two things: 1. where the data is going, and 2. how long to keep it there. Now in truth, the ‘where’ is more of a Backup-to-Disk setting, but it makes sense for many media sets. The most common ‘where’ is tape vs. disk. The ‘how long’ is the heart of the media set. It defines how long to keep your data. So your first two media sets might be ‘Keep data on disk for 4 weeks’ and ‘Keep data on tape for 7 years’.
What about my Overwrite protection period?
Now that you have a meaningful name for your media set, you can intuitively decide what to set the overwrite protection period to. That would be 4 weeks and 7 years for the example above. These numbers were picked only to show the importance of giving meaningful names to your media sets.

But how do you as a user decide on your overwrite protection period? Many of us are bound by law to protect our data for certain periods of time. Financial records are one such area of obligation. In situations like this, you just follow orders and buy tapes as you continue to need more. Most of us, however, are at the mercy of the bean counters. Your manager may ask you to keep everything forever and ask you to do it with no new storage. Just smile and show him your chart showing 30%-50% year-over-year data growth. (Think byte count and recurring job history.)

So how do you determine which settings are right for you? The fun answer is MATH! It’s simple really. How much space you have divided by how much data you have equals how many times you can store it. (Space/Size=Copies) Now throw in some incrementals and differentials and you have a quadratic equation. OK, not really. It’s just the same formula with the size broken into many parts.
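As a rough sketch of that arithmetic, the snippet below divides the available space by the size of one backup cycle. The function name, the idea of a weekly cycle made of one full backup plus daily incrementals, and all the figures are illustrative assumptions, not Backup Exec settings.

```python
def copies_that_fit(space_gb, full_gb, incremental_gb=0.0, incrementals_per_cycle=0):
    """Return how many complete backup cycles fit in the available space.

    A cycle is one full backup plus its incrementals (or differentials).
    All sizes are in GB. Every name here is hypothetical.
    """
    if full_gb <= 0:
        raise ValueError("full_gb must be positive")
    # Same formula as a single full backup, with the size broken into parts.
    cycle_gb = full_gb + incremental_gb * incrementals_per_cycle
    return int(space_gb // cycle_gb)


# Assumed figures: 2000 GB of disk, a 400 GB full, six 20 GB incrementals a week.
weekly_cycles = copies_that_fit(2000, 400, incremental_gb=20, incrementals_per_cycle=6)
print(weekly_cycles)  # 3
```

With these made-up figures only three weekly cycles fit, so a 4-week overwrite protection period would run the disk out of space before the first media became overwritable.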
So I have to run some jobs before I know what setting to use for my jobs?
Yes. Your backup environment will not be configured in one day. If you want to configure it and forget it, you will need to tune it. This means revisiting your numbers and watching closely as the first overwrite protection periods start expiring to see if you are predicting behaviours correctly. You may be able to run some numbers based off actual data size, but if your tape drive does compression, you will still have to run some actual jobs to determine your compression ratio.
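A minimal sketch of the numbers worth revisiting after those first jobs: the compression ratio you actually achieved and where your data size is heading. It assumes you copy the byte counts out of your job history by hand; the function names and sample figures are hypothetical.

```python
def compression_ratio(source_bytes, stored_bytes):
    """Ratio of data read from the source to data actually written to media."""
    if stored_bytes <= 0:
        raise ValueError("stored_bytes must be positive")
    return source_bytes / stored_bytes


def projected_size_gb(current_gb, yearly_growth, years):
    """Compound a yearly growth rate (0.30 means 30%) over a number of years."""
    return current_gb * (1 + yearly_growth) ** years


# Assumed figures: a job read 300 GB and wrote 200 GB to tape.
ratio = compression_ratio(300e9, 200e9)

# Assumed figures: 400 GB today, growing 50% a year for two years.
future_gb = projected_size_gb(400, 0.50, 2)

print(f"compression ratio {ratio:.2f}:1, projected size {future_gb:.0f} GB")
```

Feeding the projected size back into the space-divided-by-size formula shows how soon today's overwrite protection period will stop fitting.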
But what should the ‘append period’ be?
Well, the concept of appending only applies to sequential access media like tapes. Random access media, like your hard disk drive, can always use any free space, no matter how small it is or where it resides on disk. Appending to disk only leads to file fragmentation. Sadly, today in 2011, file fragmentation, and thus high disk I/O, is still a leading cause of backup job failures. It is not uncommon to see 99% file fragmentation in environments where users have been appending to backup-to-disk files for over a year. Appending to disk should only be used if the B2D folder is set to pre-allocate.
So how can I prevent fragmentation of my backup to disk files?
A good method is the ‘Allocate max size’ (pre-allocate) setting on the Backup to Disk folder. This will make the first run a little slower, so you might not set the data centre bandwidth record on your first run, but months down the road your jobs will still be humming along at full capacity. In personal testing, I saw a 3X increase in performance by pre-allocating my files (650 MB/min vs. 2100 MB/min going to a striped RAID with 2 SATA drives).

Considerations here should include the size of the B2D files you are pre-allocating as well as the average size of the data to be stored in them. You don’t want to pre-allocate a file that is too large and then not use it. Examples of this are VMware and Exchange GRT jobs. They typically store just a small amount of data in the pre-allocated B2D file and then use an accompanying IMG folder to hold the database or vmdk. You also don’t want to use a file size too large for your system to handle efficiently. For 32-bit operating systems, 4 GB should be considered the maximum setting.

If you do not want to pre-allocate, or already have a disk full of fragmented backup files that you need to save, I suggest using contig.exe from the Microsoft Sysinternals Suite to maintain your backup to disk folder. Like any other maintenance, this should have its own window outside the backup window to prevent high disk I/O that can lead to file system corruption.
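The trade-off in choosing a pre-allocation size can be sketched with a little arithmetic. The model below is a deliberate simplification: it assumes each job starts on a fresh pre-allocated file and never appends, and the function name and figures are hypothetical rather than anything Backup Exec reports.

```python
import math


def preallocation_slack(data_gb, file_size_gb):
    """Return (files_needed, unused_gb) for one job.

    data_gb is what the job writes into B2D files; file_size_gb is the
    pre-allocated size of each file. Assumes no appending between jobs.
    """
    if file_size_gb <= 0:
        raise ValueError("file_size_gb must be positive")
    # Even a tiny job occupies one whole pre-allocated file.
    files_needed = max(1, math.ceil(data_gb / file_size_gb))
    unused_gb = files_needed * file_size_gb - data_gb
    return files_needed, unused_gb


# Assumed figures: a GRT-style job that writes only 0.5 GB into a 4 GB file.
print(preallocation_slack(0.5, 4))  # (1, 3.5)

# Assumed figures: a 10 GB file-level job with the same 4 GB file size.
print(preallocation_slack(10, 4))   # (3, 2)
```

Under these assumptions the small job leaves most of its file empty, which is the case for using a smaller pre-allocation size on folders that mostly hold that kind of job.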