The canary in the cave: How multiple jobs help in troubleshooting and trend analysis.
The canary in the cave
I often call Backup Exec the canary in the cave because it can be used to provide insight into the health of your environment. If properly configured, it lets you see trends in data growth within individual technologies such as Exchange or user shares. You can benchmark one system against the next and, over time, respond if one starts to decline in performance. To do this, though, you have to give up the pipe dream of one big job and actually run multiple jobs. In fact, one job per server is ideal.
This sounds like a lot of work?
Thankfully, the kind and gentle developers at Symantec have provided us with policies. This term may sound unique to Backup Exec, but in truth it is self-explanatory. What is the backup policy for your organization? Have you been instructed to do a full backup on the weekend with an incremental each day? Is this the same for all your servers? Are they all going to the same drive? Most people who are using one job do so because all their servers are being backed up according to one “Company Policy.” So let’s create a policy.
Look before you leap!
Before you start the process of creating a policy, you need to know what your job settings will be. Start by looking at your original job to see how it is currently running. When you bring up your job properties, the first thing you notice is the four main categories written in blue on the left: Source, Destination, Settings, and Frequency. Go through the latter three and make note of your current settings, mainly which drive and media set the job uses and when it runs. Now exit out of the job properties and create a new policy.
What should I call my new policy?
If you are creating your first policy, there is no real meaningful name to give it. Just call it a short version of your company name or leave the word policy. When you need to start making more policies, define a naming convention based on your particular need to differentiate your backups. Note: This name, along with the template name and the selection list name, will be a part of every job created, so keep it short. Once you have given it a name, click new template.
The first template has to be a backup template.
Notice this template has only the bottom three sections discussed earlier: Destination, Settings, and Frequency. Go through and match all the job settings from your one main job. Leave the backup method as full and give the template a name of ‘Full’. Once again, short is best for readability later on. OK out of the template.
What about my incremental jobs?
Create another new template and once again make it a backup template. This time call it Incremental and make the backup method incremental. Don’t forget to set the backup method for SQL and Exchange as well. Give it a daily schedule and OK out. Now you have a new policy with two templates. According to best practices, you would back up to disk and then duplicate to tape; this duplicate job could be a third template. Once you have your policy defined, OK out of it and cancel when it wants to assign selection lists.
But don’t I need my selection list?
Yes, let’s go look at your selection list now. On your original job, take note of your selections. One server at a time, analyze what needs to be backed up. Close out and create a new selection list.
What do I call my new selection list?
For the name, use the name of the server, and for the description, list what was selected using short abbreviations, for example SS for System State, C for the C drive, SCC for Shadow Copy Components, SQL, and InfoStore for Exchange. It might end up looking like this: Name: Exchange01 Description: C, D, SS, SCC, InfoStore. Now in general, I advocate one server per selection list, but exceptions would include servers with SQL databases or Exchange information stores, where the full C, SS, and SCC would be part of one selection list for weekly backups and the InfoStore would be in a selection list by itself for daily or even hourly backups. OK out once you have everything you want selected.
Now we create jobs.
Once you have your first selection list saved, right-click your policy and select create jobs using policy. You will be prompted with all your selection lists; click the checkbox for the one you just created. Backup Exec will now create two new jobs, one for each template in the policy.
This is where you start to see the magic.
Once again, go through your old job, select a second server, and create a new selection list for it. Like before, right-click your policy and select create jobs using policy. You just created two more jobs with only a selection list. Each new selection list adds one job per template in the policy, so the job list grows as a multiple of the template count.
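To see how quickly the job list multiplies, here is a minimal Python sketch. It does not talk to Backup Exec at all; it is just the arithmetic of pairing every selection list with every template in the policy. The policy, template, and server names are hypothetical, and the way the three names are joined into a job name is an assumption for illustration only.

```python
from itertools import product

# Hypothetical names, for illustration only.
policy = "Acme"
templates = ["Full", "Incremental"]
selection_lists = ["Exchange01", "FileServer01", "SQL01"]


def expected_jobs(policy, templates, selection_lists):
    """Return one job name per (selection list, template) pair.

    The "policy-template-selection list" format is an assumption,
    not the exact name Backup Exec generates.
    """
    return [
        f"{policy}-{template}-{selection}"
        for selection, template in product(selection_lists, templates)
    ]


jobs = expected_jobs(policy, templates, selection_lists)
print(len(jobs))  # 3 selection lists x 2 templates = 6
for name in jobs:
    print(name)
```

Add a third template, such as a duplicate-to-tape job, and the same three selection lists would give nine jobs, which is why short names pay off.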
Watch out for that tree!
As you make the transition from one job to many jobs, there are a couple of caveats. You will need to ensure you are appending and overwriting properly to tape, as the jobs will be running one after another. You also need to think about the time window you set for your templates. Your one big job has one start time and one end time. Your many jobs will all queue up to start at the same time, but will be limited by the number of available devices. For example, say you have 10 servers, and thus 10 jobs, but only one tape drive. All ten jobs will queue up to start at the same time, but only one gets to go first. You need to think about what time your last job will start when you set the time window on each template.
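A rough way to estimate when the last job will actually start is to simulate the queue. The sketch below is a simplification under stated assumptions: jobs run in the order listed, each job takes the first device that frees up, and the 45-minute duration is a made-up figure. Substitute the elapsed times from your own job history.

```python
import heapq


def start_offsets(durations_min, devices):
    """Return each job's start time, in minutes after the scheduled start.

    Jobs are taken in the order given, and each one waits for the
    first device to become free.
    """
    free_at = [0] * devices  # minute at which each device becomes free
    heapq.heapify(free_at)
    starts = []
    for duration in durations_min:
        start = heapq.heappop(free_at)  # earliest available device
        starts.append(start)
        heapq.heappush(free_at, start + duration)
    return starts


durations = [45] * 10  # ten jobs, assumed to take 45 minutes each
starts = start_offsets(durations, devices=1)
print(starts[-1])  # 405
```

With one tape drive, the tenth job does not start until 405 minutes (6 hours 45 minutes) after the scheduled time, so the time window on the template has to be at least that long. With two devices the same call returns 180 for the last job.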
Ok great, now I have 20 jobs, what did all this work buy me?
The biggest difference you will notice is the color of the job history pane. If you have one big job and it keeps failing because the file server is full, all you see is red X’s indicating job failures. True, you may have plenty of valid restorable data, but it doesn’t look that way. If you had 9 servers, each in its own job, all back up successfully and that same file server failed, you would see one red X buried in what I call a “sea of green”. These little green checkmarks may seem like such a little thing, but only when you are not used to seeing them. They give me what I like to call a “warm fuzzy”. Managers especially like the warm fuzzy when they run their reports. Green checkmarks indicate that your backup environment is properly configured and operating correctly. When you have to troubleshoot multiple servers to figure out an issue, you want to rule out as many as possible right off the bat. The first server to rule out when a backup fails is always the backup server. If you have 9 successful jobs and one failed job, you know the backup server itself is working and you know right where to start your troubleshooting. I call this the canary in the cave.
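The triage logic can be written down in a few lines. This is only a sketch of the reasoning, not something Backup Exec produces: the server names are hypothetical, the results are typed in by hand, and treating "every job failed" as a pointer to the backup server is my reading of the rule-out step rather than a guarantee.

```python
def triage(results):
    """Suggest where to start troubleshooting.

    results maps a server name to True if its last job succeeded.
    """
    failed = sorted(name for name, ok in results.items() if not ok)
    if not failed:
        return "all green"
    if len(failed) == len(results):
        # Nothing succeeded, so the backup server cannot be ruled out.
        return "every job failed: check the backup server first"
    # Some jobs succeeded, so the backup server itself is working.
    return "start with: " + ", ".join(failed)


# Nine hypothetical servers that backed up fine, plus one full file server.
results = {f"Server{n:02d}": True for n in range(1, 10)}
results["FileServer01"] = False

print(triage(results))  # start with: FileServer01
```

With one big job there is only a single entry in that dictionary, so a failure never narrows anything down.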
But wait, there’s more.
Looking at your job history pane, you can see a unique byte count for each individual job. After just a few runs, you can start to see data growth by picking a job for a particular server and selecting view reoccurring job instances. This information can be used for capacity planning, future budget needs, or analyzing trends on a per application or server basis.
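As a sketch of what you can do with those byte counts once you have copied them out of the job history, the snippet below estimates average growth per run and how many runs remain before a target fills up. The sizes and the 300 GB capacity are made-up figures, and straight-line growth is an assumption that real data will only roughly follow.

```python
import math


def growth_per_run(sizes):
    """Average change in size between consecutive runs of the same job."""
    if len(sizes) < 2:
        raise ValueError("need at least two runs to see a trend")
    return (sizes[-1] - sizes[0]) / (len(sizes) - 1)


def runs_until_full(sizes, capacity):
    """Runs left before the job outgrows capacity, or None if not growing."""
    growth = growth_per_run(sizes)
    if growth <= 0:
        return None
    return math.ceil((capacity - sizes[-1]) / growth)


# Hypothetical weekly full backup sizes for one server, in GB.
full_backup_gb = [200, 204, 209, 212]

print(growth_per_run(full_backup_gb))        # 4.0
print(runs_until_full(full_backup_gb, 300))  # 22
```

At 4 GB of growth per weekly full, this hypothetical server has about 22 weeks before it outgrows a 300 GB target, which is the kind of number that helps with a budget request.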
You mentioned benchmarking?
Another piece of information is the job rate. Just like in the larger job, this is an average across all the resources on a server. So the C drive might go at 1 GB/min and the System State, which always goes slower, might run at, say, 600 MB/min, giving an average (not counting time) of 800 MB/min for the server. When each server is in its own job, it gets its own job rate. You can compare your servers against each other as well as monitor them over time to watch for decreases in performance. Backup performance can help indicate issues on the network, full disks, low memory, or high processor utilization.
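The "not counting time" caveat is worth a quick worked example. The sketch below uses the two rates from the paragraph above, taking 1 GB/min as 1000 MB/min, and contrasts the simple average with the overall throughput you get when each resource is weighted by how long it takes. The data sizes are assumptions made up for illustration, and this is not a statement of how Backup Exec itself calculates the figure it displays.

```python
# Rates from the example above, in MB/min (1 GB/min taken as 1000 MB/min).
rates_mb_min = {"C": 1000, "System State": 600}

# Assumed amount of data per resource, in MB (made-up figures).
sizes_mb = {"C": 20000, "System State": 600}

# Simple average of the two rates, ignoring how long each one runs.
simple_average = sum(rates_mb_min.values()) / len(rates_mb_min)

# Overall throughput: total data divided by total minutes spent.
total_minutes = sum(sizes_mb[name] / rates_mb_min[name] for name in rates_mb_min)
overall_rate = sum(sizes_mb.values()) / total_minutes

print(simple_average)          # 800.0
print(round(overall_rate, 1))  # 981.0
```

Because the slow System State is small, it barely drags down the overall figure. Whichever number you track, track the same one for every server and every run so the comparison stays fair.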