
NBU 5220 Performance

Created: 31 May 2013 | 50 comments

We recently deployed a 5220 appliance into our environment as the savior in our battle against a backup window we could no longer meet.  When we finally got it online and into our NBU environment, the initial performance was great.  The area where we stood to benefit most was VMware backups: with the datastores mounted directly to the appliance, it has direct access to the snapshots for fast, efficient backups.  Before this we were performing client-side backups, so the impact on the hosts every night was significant as we tried to back up 600+ VMs.  The plan was to move all Dev and Test off-host backups to the middle of the day, since the performance impact was minimal and the end result was an increased window to complete things.  As we started with noon-time backups, deduplication rates were high and so were the speeds.

 

However, this performance gain was short-lived: as we began increasing the load, we suddenly saw performance drop to a point of concern.  Backups were no longer speedy, running at 3,500 to 10,000 KB/sec.  Some might pop up to 24,000 KB/sec, but in a sample of 15 jobs as I write this, only one is showing 24,000.

Now, I do have a few ideas. 

1. We have a 72TB appliance and therefore 2 disk trays.  During backups, one disk tray is going crazy; all the lights are flashing and you can really see that it is working.  The second tray, however, is doing nothing.  You might see a blink here or there, but it is almost nothing compared to the first tray.  Is this to be expected?  When we looked at the disk configuration, it shows concat; is this normal?

2. Too much data at once and we are simply burying the appliance.  In reality, what sort of performance should I be able to expect from the appliance?

3. Related to number 2: since we only have the one appliance right now while we wait to get the remote appliance in place, we are duplicating off to tape.  These jobs run at the same time as backups, so while the appliance is writing a lot of data, it is also reading it back out to tape.

4. We are overloading the datastores, so read speed from source to destination is bad.  We have fewer hosts; therefore, if we limit the jobs per host, we limit the number of machines backing up at once (obviously).  That made backups take way too long, so we removed the limit per host and just set a limit per datastore.  As we are new to it all, I am not sure what impacts what, but again, I am trying to list any and all ideas from the start.

5. The appliance does not support multi-pathing, therefore, we only have a single path to the disk.

 

Beyond that I am not sure, but this doesn't help with showcasing the appliance to management at the moment.  However, given the initial performance, I am confident we can get back there.


Mark_Solutions's picture

Lots of things to cover there but here are a few ideas which may help.

Firstly, in relation to one shelf getting used and the other not: I am unsure why this would be the case, unless you added the second shelf at a later date and most data was already on the first shelf (being de-dupe, new backups may not add much to the capacity used and just carry on using the first shelf).  The other thought is that the first shelf will also most likely hold the de-dupe database, which is doing most of the work; if you are getting good de-dupe there shouldn't be much data written, but the database should be getting a hammering!

Second - do check that the appliance itself is OK. Two things to check here: 1. The disks - a lost disk will substantially slow the thing down. 2. The RAID battery - a massive slowdown if this starts to go flat. Use:

 /opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -getbbustatus -a0

to check its state.
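
If you want to watch for this proactively, here is a small sketch that scans that MegaCli output for alarming fields. The field names are taken from the BBU output quoted in this thread, and the helper name is made up for illustration:

```shell
# Sketch: scan MegaCli64 BBU status output (on stdin) for fields that
# indicate battery trouble. grep exits non-zero when nothing alarming
# is found, so the function doubles as a yes/no check.
bbu_warnings() {
  grep -E '^ *(Battery Pack Missing|Battery Replacement required|Remaining Capacity Low|Learn Cycle Timeout|I2c Errors Detected) *: *Yes'
}
# usage:
#   /opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -getbbustatus -a0 \
#     | bbu_warnings && echo "BBU needs attention"
```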

If you are running a lot of jobs with a lot of data then it may be that queue processing, rebasing and garbage collection is starting to slow things down.

Queue processing happens all the time (twice a day), but it never seems to clear enough down, so it is well worth running it more often manually to keep it lean (/usr/openv/pdde/pdcr/bin/crcontrol --processqueue) and it may help speed things back up. It is worth checking regularly how big your queue is - I feel that it should run far more often than every 12 hours, and that midnight is not usually a great time to run it, as backups are usually running then. It would be best to run it 4 or 5 times a day during the daytime, or at least when backups are not running.
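
Running queue processing 4 or 5 times a day can be automated. A possible crontab fragment - the hours here are only an example, pick times when backups are not running in your environment:

```shell
# Example root crontab entries (hypothetical schedule -- adjust to your
# own backup window). Fires queue processing four times during the day.
0 8,11,14,17 * * * /usr/openv/pdde/pdcr/bin/crcontrol --processqueue
```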

Rebasing and garbage collection do not run so often (monthly), so when these kick in they can also slow it down for a day or two; you may want to run these manually more regularly too.

Running duplications at the same time as backups will have an adverse effect too - when 7.6 comes out we will be able to schedule the SLPs, but for now you may want to just restrict the I/O on the disk pool to prevent too many operations running on the pool at the same time.

Reducing the fragment size of the de-dupe disk storage unit also seems to help - I try to keep them to around 5000MB for best performance (this mainly helps with duplications / replications).

The current thinking is an optimum limit of one VM per datastore at a time for best performance when backing up - so anything more than that will have an impact on performance.

Again, when 7.6 is here you will get the option of using Accelerator for VMware backups, which will change everything!

Hope some of this helps

Authorised Symantec Consultant

Don't forget to "Mark as Solution" if someone's advice has solved your issue - and please bring back the Thumbs Up!!

smakovits's picture

Mark,

Funny you should mention the RAID battery as I got this email today:

+-----------------------------------------------------------------------------+
|                             Adapter Information                             |
|+---------------------------------------------------------------------------+|
||  |                    |Adapter| BBU  |BBU Learn | BBU  |      |           ||
||ID|   Adapter model    |Status |Status|  Cycle   |charge|State |Acknowledge||
||  |                    |       |      |  active  |      |      |           ||
||--+--------------------+-------+------+----------+------+------+-----------||
||  |Integrated Intel(R) |       |      |          |      |      |           ||
||1 |RAID Controller     |OK     |Not OK|-         |-     |Failed|No         ||
||  |SROMBSASMP2         |       |      |          |      |      |           ||
|+---------------------------------------------------------------------------+|
+-----------------------------------------------------------------------------+


However, I ran your command and things look to be OK.

/opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -getbbustatus -a0

BBU status for Adapter: 0

BatteryType: iBBU
Voltage: 4064 mV
Current: 0 mA
Temperature: 38 C

BBU Firmware Status:

  Charging Status              : None
  Voltage                      : OK
  Temperature                  : OK
  Learn Cycle Requested        : No
  Learn Cycle Active           : No
  Learn Cycle Status           : OK
  Learn Cycle Timeout          : No
  I2c Errors Detected          : No
  Battery Pack Missing         : No
  Battery Replacement required : No
  Remaining Capacity Low       : No
  Periodic Learn Required      : No
  Transparent Learn            : No

Battery state:

GasGuageStatus:
  Fully Discharged        : No
  Fully Charged           : Yes
  Discharging             : Yes
  Initialized             : Yes
  Remaining Time Alarm    : No
  Remaining Capacity Alarm: No
  Discharge Terminated    : No
  Over Temperature        : No
  Charging Terminated     : No
  Over Charged            : No

Relative State of Charge: 96 %
Charger System State: 49168
Charger System Ctrl: 0
Charging current: 0 mA
Absolute state of charge: 96 %
Max Error: 2 %

Exit Code: 0x00

I have a case open with support and I sent them the logs to dispatch a part if needed.

Another area I would like to discuss is queue processing, rebasing and garbage collection.

Any chance you can elaborate on each just briefly for what each does and the manual commands for each?

I did run /usr/openv/pdde/pdcr/bin/crcontrol --processqueue and it returned OK, not sure if that means it is done or what.

Lastly, something I discovered while getting the logs for support is that I have a gazillion temp VMDK files in the tmp folder.  A quick search revealed that this might be an issue.  https://www-secure.symantec.com/connect/forums/522...

Which you spoke on, so I am curious if you know anything more specific about the EEB for VMware.  My current appliance version is 2.5.2.

At some point it might be worth investigating the fragment size you mention as well.

On the final point of one VM per datastore, I will try this setting now, just to have it in place.  I had it at 4... but in the long run I think the battery issue needs hashing out first, and then the queues, fragment size and orphaned VMDKs in the tmp folder.


Mark_Solutions's picture

Thanks for getting back to me ... if the battery is playing up it will cause you real issues ... it will need the new one to be fitted, allowed to charge, and then a reboot to make the RAID cache fully operational - this will greatly improve speed.

The VMware EEB is ET2982308 - worth asking support if they have a 2.5.2 version for the appliance while you have a case open with them (unless the files are there from when the system was on 2.5.1 - are they new or old?)

So on to queue processing ... the queue size is shown in KB, and the size of the queue can be seen by running:

/usr/openv/pdde/pdcr/bin/crcontrol --queueinfo

This shows the size of the queue; the larger it is, the less efficient the system can be, as it means it has a lot of transactions to work through.

To see if it is currently processing the queue run the following (which shows if it is running and if one is queued):

/usr/openv/pdde/pdcr/bin/crcontrol --processqueueinfo

If the queue is large (GBs) and it is not being processed, then run the command I gave yesterday twice - this will fire one off and queue another. It can take 5 or 6 runs to bring the queue right down.

To see what your de-dupe data sizes are, you can run either of the following - the second giving a little more detail:
/usr/openv/pdde/pdcr/bin/crcontrol --dsstat
/usr/openv/pdde/pdcr/bin/crcontrol --dsstat  1

If you want to manually run garbage collection (this is like an image cleanup job for de-dupe but actually runs only occasionally [monthly]), you can use the following - but do this via an IPMI session, as you have to leave the command until it completes, so it stops you doing anything else:
/usr/openv/pdde/pdcr/bin/crcollect -v -m +1,+2

Rebasing is like a defrag for de-dupe - it tidies things up and, like a defrag, makes things run quicker

To check state run /usr/openv/pdde/pdcr/bin/crcollect –rebasestate

A useful command is to do the following after you have kicked off queue processing:

tail -f /disk/log/spoold/storage.log

You can then watch the log as it processes each set of transactions - it should process each batch in a few seconds (maybe up to 20 seconds); anything more than that means the system is slow.
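
To put a number on how long each batch takes, here is a quick awk sketch over those "entries processed" log lines (the helper name is made up; it assumes the lines being compared fall on the same day, as in the sample output further down this thread):

```shell
# Sketch: print the gap in seconds between consecutive "entries
# processed" lines from storaged.log. Per the rule of thumb above,
# gaps well over ~20 seconds suggest the system is struggling.
batch_gaps() {
  awk '/entries processed/ {
    split($3, t, ":")                      # $3 is the HH:MM:SS field
    now = t[1]*3600 + t[2]*60 + t[3]
    if (prev != "") print now - prev       # seconds since previous batch
    prev = now
  }'
}
# usage: tail -1000 /disk/log/spoold/storaged.log | batch_gaps
```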

Hope all this helps

 


smakovits's picture

Mark,

Awesome information!  Just a few things as I work through this all.

# /usr/openv/pdde/pdcr/bin/crcontrol --queueinfo
total queue size : 14634764929
 

# /usr/openv/pdde/pdcr/bin/crcontrol --processqueueinfo
Busy   : no
Pending: no

# /usr/openv/pdde/pdcr/bin/crcontrol --processqueue (twice)

# tail -f /disk/log/spoold/storage.log
tail: cannot open `/disk/log/spoold/storage.log' for reading: No such file or directory
tail: no files remaining
 

# /usr/openv/pdde/pdcr/bin/crcontrol --queueinfo
total queue size : 14641550937
 

So, essentially, I am not certain what this means exactly.  Is it working, or did I do something wrong?
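
For scale: Mark noted above that --queueinfo reports the size in KB. A tiny conversion sketch (the helper name is made up; if your release reports bytes instead, add one more division by 1024):

```shell
# Sketch: turn the raw "total queue size" figure into a readable GB
# value, assuming the figure is in KB as noted earlier in the thread.
queue_kb_to_gb() {
  awk -v kb="$1" 'BEGIN { printf "%.1f GB\n", kb / 1024 / 1024 }'
}
# e.g. queue_kb_to_gb 1048576   prints "1.0 GB"
```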

As for the VMDKs in the tmp folder, some are older (end of April), while others are as recent as the end of May, so I will contact support about the EEB.

While on the topic of tmp: do those files need to be gracefully cleared out, or is there a way to just delete it all?  Or is the data relevant?

File names are:

errfile_xxxxxx

infile_xxxxxx

outfile_xxxxxx

and then some xxx_filelist files

and 897 VMDK files

In total the tmp folder has 22,022 files.

And very lastly for now,

/usr/openv/pdde/pdcr/bin/crcontrol --dsstat  1

just hangs and I get nothing back, even after waiting several minutes.  I thought it was the extra space you had in the command, so I removed it, and still the results are the same.


smakovits's picture

One thing I did manage to find is the log file.  While in the system I thought to go look for it and found we were missing a "d":

# tail -f /disk/log/spoold/storage.log

# tail -f /disk/log/spoold/storaged.log

June 04 12:54:04 INFO [1082194240]: Transaction log 2086858-2094268: 53500000 of 245142860 entries processed.
June 04 12:54:08 INFO [1082194240]: Transaction log 2086858-2094268: 53600000 of 245142860 entries processed.
June 04 12:54:13 INFO [1082194240]: Transaction log 2086858-2094268: 53700000 of 245142860 entries processed.
June 04 12:54:17 INFO [1082194240]: Transaction log 2086858-2094268: 53800000 of 245142860 entries processed.
June 04 12:54:22 INFO [1082194240]: Transaction log 2086858-2094268: 53900000 of 245142860 entries processed.
June 04 12:54:27 INFO [1082194240]: Transaction log 2086858-2094268: 54000000 of 245142860 entries processed.
June 04 12:54:34 INFO [1082194240]: Transaction log 2086858-2094268: 54100000 of 245142860 entries processed.

So I am not sure if this means it is working, but that is the output scrolling by.


smakovits's picture

Also, found what was happening.

/usr/openv/pdde/pdcr/bin/crcollect –rebasestate  --- (failed)

# /usr/openv/pdde/pdcr/bin/crcontrol --rebasestate
Image rebasing: ON
Rebasing busy: Yes
 

Mark_Solutions's picture

OK - glad you spotted my deliberate mistake with the missing d in storaged.log!!

The tail -f shows that it is processing fairly quickly which is good.

The queue size is huge!! And it is not showing as busy or pending, so it really needs kicking off manually and regularly to try and get it trimmed down. I see you kicked it off manually twice, which is good, so presumably it will be a bit smaller by now - keep on top of it to see how small you can get it, and if there is not at least one run pending then fire it off again manually.

The rebase state is interesting, as it shows it is ON and processing and yet the command failed - not sure about that one. It may be worth turning it off and back on again (never heard that IT expression before!!) - just use crcollect -rebaseoff, then leave it a few minutes and use crcollect -rebaseon.

Then check what the -rebasestate says

While asking support about the VMware EEB, ask them about the telemetry EEB too - I thought it was included in 2.5.2, but it is worth checking if you have a lot of stuff in /tmp, although NetBackup itself does use that directory as its general cache location during normal processing.

As for deleting the stuff, I usually cd into /tmp and then just do rm *.vmdk or similar; it is just best if no jobs are running at the time, as the files may actually be in use.
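
A slightly more cautious variant of that cleanup, as a sketch (the helper name and the one-day cutoff are arbitrary choices, not an official procedure): list only .vmdk files that have not been touched for over a day, preview the list, and only then pipe it to rm.

```shell
# Sketch: list stale .vmdk temp files in a directory, skipping anything
# modified in roughly the last day in case a running job still has it open.
stale_vmdks() {
  find "$1" -maxdepth 1 -name '*.vmdk' -mtime +1 -print
}
# preview:  stale_vmdks /tmp
# delete:   stale_vmdks /tmp | xargs -r rm -f
```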

Finally, I think, is the crcontrol --dsstat 1 .... that is the right command, but the issue may be the huge transaction queue you have - it could take a LONG time to return a result! Maybe wait until you have trimmed the queue down and see how it goes then.

Worth mentioning the rebase failed state to support anyway

Hope all this helps


Andrew Madsen's picture

Minor correction Mark. That should be crcontrol (/usr/openv/pdde/pdcr/bin/crcontrol) not crcollect for the rebase switch.

The above comments are not to be construed as an official stance of the company I work for; hell half the time they are not even an official stance for me.

Mark_Solutions's picture

Thanks Andrew - don't think my fingers are working right just lately - LOL!


smakovits's picture

OK, for starters, I finally got the battery replaced yesterday and am seeing some odd behavior; however, it could be normal.  Should the RAID battery charge and discharge during use/backups?  If yes, then my concern is gone, but if no, then I guess it is back to support.  I did mention this to them too, so they might respond as I am writing this, but as far as this thread is concerned, I thought it was worth a mention.  Here are 3 snippets from the emails sent by the 5220 after the battery was replaced.  As you can see, it starts to charge up, but then later in the day loses the charge while backups are running.

+--------------------------------------------------------------------------------------------+
|                                      RAID Infomation                                       |
|+------------------------------------------------------------------------------------------+|
||ID|Name|Status |Capacity| Type |Disks|Write Policy|Enclosure|HotSpare | State |Acknowledge||
||  |    |       |        |      |     |            |   ID    |Available|       |           ||
||--+----+-------+--------+------+-----+------------+---------+---------+-------+-----------||
||  |    |       |        |      |0 1 2|            |         |         |       |           ||
||3 |VD-0|Optimal|4.541TB |RAID-6|3 4 5|WriteThrough|0        |yes      |Warning|No         ||
||  |    |       |        |      |6    |            |         |         |       |           ||
|+------------------------------------------------------------------------------------------+|
|                              Adapter Information                                           |
|+-----------------------------------------------------------------------------+             |
||  |                     |Adapter| BBU  |BBU Learn | BBU  |       |           |             |
||ID|    Adapter model    |Status |Status|  Cycle   |charge| State |Acknowledge|             |
||  |                     |       |      |  active  |      |       |           |             |
||--+---------------------+-------+------+----------+------+-------+-----------|             |
||  |Integrated Intel(R)  |       |      |          |      |       |           |             |
||1 |RAID Controller      |OK     |OK    |No        |41 %  |Warning|No         |             |
||  |SROMBSASMP2          |       |      |          |      |       |           |             |
|+-----------------------------------------------------------------------------+             |
+--------------------------------------------------------------------------------------------+

 

Time Monitoring Ran: Wed Jun 5 19:21:05 2013 EDT

+--------------------------------------------------------------------------------------------+
|                                      RAID Infomation                                       |
|+------------------------------------------------------------------------------------------+|
||ID|Name|Status |Capacity| Type |Disks|Write Policy|Enclosure|HotSpare | State |Acknowledge||
||  |    |       |        |      |     |            |   ID    |Available|       |           ||
||--+----+-------+--------+------+-----+------------+---------+---------+-------+-----------||
||  |    |       |        |      |0 1 2|            |         |         |       |           ||
||3 |VD-0|Optimal|4.541TB |RAID-6|3 4 5|WriteThrough|0        |yes      |Warning|No         ||
||  |    |       |        |      |6    |            |         |         |       |           ||
|+------------------------------------------------------------------------------------------+|
|                              Adapter Information                                           |
|+-----------------------------------------------------------------------------+             |
||  |                     |Adapter| BBU  |BBU Learn | BBU  |       |           |             |
||ID|    Adapter model    |Status |Status|  Cycle   |charge| State |Acknowledge|             |
||  |                     |       |      |  active  |      |       |           |             |
||--+---------------------+-------+------+----------+------+-------+-----------|             |
||  |Integrated Intel(R)  |       |      |          |      |       |           |             |
||1 |RAID Controller      |OK     |OK    |No        |2 %   |Warning|No         |             |
||  |SROMBSASMP2          |       |      |          |      |       |           |             |
|+-----------------------------------------------------------------------------+             |
+--------------------------------------------------------------------------------------------+

 

Time Monitoring Ran: Thu Jun 6 02:21:04 2013 EDT

+--------------------------------------------------------------------------------------------+
|                                      RAID Infomation                                       |
|+------------------------------------------------------------------------------------------+|
||ID|Name|Status |Capacity| Type |Disks|Write Policy|Enclosure|HotSpare | State |Acknowledge||
||  |    |       |        |      |     |            |   ID    |Available|       |           ||
||--+----+-------+--------+------+-----+------------+---------+---------+-------+-----------||
||  |    |       |        |      |0 1 2|            |         |         |       |           ||
||3 |VD-0|Optimal|4.541TB |RAID-6|3 4 5|WriteThrough|0        |yes      |Warning|No         ||
||  |    |       |        |      |6    |            |         |         |       |           ||
|+------------------------------------------------------------------------------------------+|
|                              Adapter Information                                           |
|+-----------------------------------------------------------------------------+             |
||  |                     |Adapter| BBU  |BBU Learn | BBU  |       |           |             |
||ID|    Adapter model    |Status |Status|  Cycle   |charge| State |Acknowledge|             |
||  |                     |       |      |  active  |      |       |           |             |
||--+---------------------+-------+------+----------+------+-------+-----------|             |
||  |Integrated Intel(R)  |       |      |          |      |       |           |             |
||1 |RAID Controller      |OK     |OK    |No        |8 %   |Warning|No         |             |
||  |SROMBSASMP2          |       |      |          |      |       |           |             |
|+-----------------------------------------------------------------------------+             |
+--------------------------------------------------------------------------------------------+

Ed Carter's picture

Having experienced 4 failed batteries: yes, the discharge is normal and is all part of the learn cycle. What Symantec never seems to warn about is the time it takes for this process to complete (while it's running, for obvious reasons, the writeback policy stays disabled and it runs like crap). I've had it take 15 hours to complete before and it really impacted our backup window! Basically, as soon as the policy shows WriteBack, decent performance should soon return.

We too experienced poor performance when we first got our 5220s (although we only have a single disk tray, so I cannot comment on your issue there). Our issues were caused by a number of factors. See my previous post, which has some great suggestions from the guys on this forum.

SLP tuning - if you are running AIR or tape duplications this helps. I suspended all SLP processing for the first period of my window to get the bulk of the backups through, then start it via the task scheduler. I also use the LIFECYCLE_PARAMETERS file in db\config to batch things up. This really helped and keeps the noise in the Activity Monitor to a minimum.

Limit I/O on the disk pool. This was the biggest help for performance; I have tuned it to 35 streams.

Use query-based client selection and VMware resource limits in the master server properties. We limit to 2 concurrent backups per datastore. The query selection balances load across the virtual environment, and despite annoying features like being unable to rerun failed jobs from the monitor, this has been worth doing in our environment.

External factors. We had an I/O issue on the LUN that held our VC database. When backups were hitting the virtual environment, all the VC logging etc. created a lot of noise on the DB, which caused performance issues all round.

Of course, you may just have issues with your appliance(s), but I hope this helps. I went through a whole heap of pain but am happy to say all is finally running well. It is definitely a good idea to do some manual queue processing as mentioned above.

Just waiting for a battery to fail at 6pm on a Friday night now...

Cheers

Ed

smakovits's picture

Thanks for the reply, Ed.  Certainly good to know the batteries are ultra-reliable...

 

Currently, in order to track down the performance issues, I tuned VM jobs to 1 per datastore, but of the 5 jobs I am looking at now, I see 8, 9, 10, 12 and 11 MB/sec... sweet, I know!

Can I ask what your performance is?  I know mine won't be the same, but one thing nobody can tell me is what I can expect, which annoys me.

Of interest would be the DB issue you had too.  Are you just saying the LUN the SQL DB was on, or was it something else?  Just curious, as it would be something to note in the back of my head once I am past the battery and queue processing issues.

smakovits's picture

Now, in order to avoid clutter with the logs above, I wanted to split the rest of what I have out into a new response.

The VMDK EEB: got it from support; waiting to apply it when no jobs are running.

Telemetry EEB, "From all notes I have seen the issue with telemetry data was supposed to be addressed in 2.5.2. There is no telemetry EEB for 2.5.2 at this time."

My queue is still insane and not going anywhere:

# /usr/openv/pdde/pdcr/bin/crcontrol --queueinfo
total queue size : 6050796993
creation date of oldest tlog : Thu Jun  6 00:54:08 2013
 

It is not even processing, and I kicked it off twice again as suggested.

/usr/openv/pdde/pdcr/bin/crcontrol --processqueueinfo
Busy   : no
Pending: no
 

I took the suggestion of turning the rebase process off and back on.  I actually left it off for several minutes:

# /usr/openv/pdde/pdcr/bin/crcontrol --rebasestate
Image rebasing: ON
Rebasing busy: Yes
 

On the battery front, this is the latest:

# /opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -getbbustatus -a0

BBU status for Adapter: 0

BatteryType: iBBU
Voltage: 4069 mV
Current: 0 mA
Temperature: 39 C

BBU Firmware Status:

  Charging Status              : None
  Voltage                      : OK
  Temperature                  : OK
  Learn Cycle Requested        : No
  Learn Cycle Active           : No
  Learn Cycle Status           : OK
  Learn Cycle Timeout          : No
  I2c Errors Detected          : No
  Battery Pack Missing         : No
  Battery Replacement required : No
  Remaining Capacity Low       : No
  Periodic Learn Required      : No
  Transparent Learn            : No

Battery state:

GasGuageStatus:
  Fully Discharged        : No
  Fully Charged           : Yes
  Discharging             : Yes
  Initialized             : Yes
  Remaining Time Alarm    : No
  Remaining Capacity Alarm: No
  Discharge Terminated    : No
  Over Temperature        : No
  Charging Terminated     : No
  Over Charged            : No

Relative State of Charge: 99 %
Charger System State: 49168
Charger System Ctrl: 0
Charging current: 0 mA
Absolute state of charge: 98 %
Max Error: 2 %

Exit Code: 0x00
 

 

I think that is it for now.  Support did respond and requested another DataCollect.


Ed Carter's picture

Yeah, that's not great. Generally, if I get 20,000 KB/sec or above, I see that as acceptable for our window. I'd say on average we get about 40,000 KB/sec over SAN transport, but we do get a lot of variance - the range is from 20,000 all the way up to 130,000 KB/sec. If the I/O streams are at the max, say 20,000 - 50,000; if less is going on, we hit some of the higher figures.
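
To relate those Activity Monitor figures to a backup window, here is a rough conversion sketch (the helper name is made up; KB/sec as NetBackup reports it, binary GB):

```shell
# Sketch: convert a KB/sec throughput figure to GB per hour, to reason
# about how much data a stream can move inside a backup window.
kbps_to_gb_per_hr() {
  awk -v kbps="$1" 'BEGIN { printf "%.0f\n", kbps * 3600 / 1048576 }'
}
# e.g. a 40,000 KB/sec average works out to roughly 137 GB/hr per stream.
```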

We finally got rid of tape after moving our 2nd 5220 to our second data centre, and we use bi-directional AIR to effectively offsite our backups. I've found tape duplication certainly has a negative impact, due to it having to rehydrate the data and pull the various blocks from all over the disk (this is where rebasing comes into play, to minimise this as best it can). As AIR sends only the unique blocks, the read hit is much less.

In terms of the VC database, we found it was sharing a LUN with a couple of other SQL servers and just wasn't getting the IOPS it needed. We are running on an EMC CX4-480, which doesn't have the best reporting for IOPS, so we had to dig around to pull the stats (soon going to VNX, which should be much better). When the backups were in flight, we found a marked decrease in vCenter performance, which had a knock-on effect on backup performance as well as reliability. We've now given the server a dedicated LUN and it has much improved our situation.

Best bet is to check the data path all the way from source to target. Get your SAN stats (not just from VMware but from the actual array) to see if it's touching the sides, then work back to your SAN switches (port errors, faulty fibre etc.). I'm of course assuming you're using SAN transport and not nbd. If you're using nbd, I think it uses the ESX host management network for the backup, so maybe check that out too.

My problem was that we are a Windows house, and going to the 5220 on Linux was a bit of a step into the unknown; I felt pretty helpless when trying to troubleshoot potential non-NetBackup issues on the appliance (for example disk utilisation, memory, etc.). My advice is to do all you can to rule out external factors in your environment. Symantec support can be extremely hard work, but speak to your account manager. They are really keen to prove how great their appliances are, and I'd be very surprised if they wouldn't send an appliance engineer to your site for the day to assist you if you express your dissatisfaction (they did with us, and were prepared to subsequently send a "technical expert" until we got things running well).

Hope this helps.

 

Ed

smakovits's picture

Just found this little guy... my WebUI was failing to display data:

http://www.symantec.com/business/support/index?pag...

Brook Humphrey's picture

First off, it's good to keep up with the Late Breaking News page:

http://www.symantec.com/docs/TECH145136

There are a couple of publicly released EEBs in there that would be good to install: the one you mention, and also the one for disk I/O alerts.

Now, on to your issue.

1. Rebasing will help speed up the duplication to tape once it completes. However, if you are duplicating off to tape right after the backup, it is likely that rebasing is not keeping up.

2. I did not see above how much space you are using in your dedupe pool - is it less than 32TB, or are you using up all 72TB already? This will definitely affect the speed of your system.

3. Do you have the dedupe pool using the 4TB of internal disk on the appliance as well, or is it only using the external storage? The internal disks run at 3Gb/s, whereas the external ones run at 6Gb/s. If you use the internal disks in your dedupe pool, it will downgrade the whole dedupe pool to 3Gb/s for the disk subsystem.

A good option and the recommended path to use if you are duping off to tape is to use the internal disks for an advanced disk and then use the external storage shelves for dedupe only. Set up your SLP to do the backup to the advanced disk and then do your dupe to tape from there and also your dupe to the dedupe pool from there. Have it expire the image after duplicaiton.

This will give good backup and duplicaiton to tape performance as well as keeping the performance up for the dedupe pool itself. 

4. I don't see any details above of what the rest of your infrastructure is like, but the things pointed out in point 3 should cover quite a bit. They should remove most of your current bottlenecks and help your systems perform faster. Beyond this, you would likely have to start looking at other aspects of your backups to find out where issues may lie:

a. What kind of data are you backing up?

b. What kind of dedupe rates are you getting?

c. What kind of performance are you actually getting out of your network? Have you done any testing on this to see?

Anyway if you have any further questions please let me know. 

 

Thanks 

 

Brook Humphrey
Managed Backup Service

Principal Backup Administration Specialist

smakovits's picture

Brook, some awesome data there. To be 100% honest, I did not do any of the configuration, as it was done by the UNIX admin who is the backup admin. I administer the Windows side of things, so I don't mess with a lot of the tapes, devices, etc. However, I do have access to do it as needed, as long as I know what we are doing, its impact, etc. So, now that that is cleared up, let me respond to your post.

Well, before I do, I should report that something is definitely screwed up somewhere, because I just checked my queue info and got the following.

# /usr/openv/pdde/pdcr/bin/crcontrol --queueinfo
total queue size : 15433220155

OK, now on to your thoughts.

1. This is most likely the case; the dupe jobs are running forever and all the time. Until we get the second appliance in place and replicating, there is an obvious need to dupe to tape...
 

2. /usr/openv/pdde/pdcr/bin/crcontrol --dsstat 1 will not return a value, and I assume this is where this information is, so I guess I will ask if there is any other way to get the information? Regardless, I do not believe we are full; closer to 32 than 64, I believe. (42TB used, just looked)

3. I don't believe the 4TB of internal disk is in the dedupe pool.

 

Partition       Total      Used       Available  %Used
AdvancedDisk    0 GB       0 GB       0 GB       0
Deduplication   64 TB      42.300 TB  21.699 TB  67
Unallocated     11.482 TB  -          -          -

Is it difficult to set up the SLP you describe? Because I would be all for it if you feel it would help with the speed of backups. Now, is this solution only for while duping to tape? If so, that is fine, just asking for informational purposes. But I would certainly still be for exploring this and getting it put in place, or at least testing it.

 

4. a. One server that is slow all the time is a file server with thousands of small files. Now, I know this impacts client-side backups in a big way; does it do the same for VMware backups as well?

b. Is there an easy way to get you the overall number? I know looking in the admin console I see 30% - 100%, lots of 70s, 80s and 90s.

c. What sort of testing do you mean? Source-to-destination from the appliance to disk sort of stuff? During backups I can see rates of 6000KB to 100,000KB for VMware backups.

c2. For client-side backups, I have a machine I am testing with that is currently able to do 90+MB/sec backup to tape as a physical server. If I install SEP, this degrades to 60-70MB, so I have a case open with SEP support on that. Does SEP and the firewall have any impact on a VMware backup? I assume not, but can't be certain.

 

So, I think for starters, if we could execute 3, that might be a good test.

Brook Humphrey's picture

2. Just use crcontrol --dsstat (without the trailing 1) instead, as it will return what you need.

3. In the web admin GUI where you got that info (or did you get it from the CLISH?) you can resize and move partitions. In the web GUI, the left-hand side will also show which storage is being used and how much, with options to move and resize. In this case you should be able to simply move the volume to the external storage.

Yes, this way of setting up SLPs is specifically for when you are duping to tape, and will not be needed once you get replications going.

4. a. Actually, the appliances seem to handle quite a large amount of VM backups well. I have them in use at multiple sites and they are able to keep up with the load. We also make use of autodiscover quite a bit.

b. That's a little difficult. If you have OpsCenter Analytics then I have some custom reports that will get this data for you, but it's enough to say that the lower the dedupe rate, the more it will impact your performance.

c. Symantec support has a tool called AppCritical that runs a battery of network testing and performance checks. It returns more than just performance numbers and gives a good overall idea of how your network is working.

c2. Things like SEP will greatly impact the performance of the client while it is doing fingerprinting. To work around this, exclude the NetBackup client install folder so that it is not constantly scanning the client's working files and processes.

As for VMware backups: if SEP does impact these, I have not noticed, and my VM backups seem to run snappily enough that I have never thought to check.


smakovits's picture

OK, I will look into 3 on Monday with the other admin and start there.

 

While poking around the web UI I came across this; does it have any impact on the stuff we are doing?

Enable the SAN Client FT media server [Fibre Transport for backups to this appliance]

Enable the Fibre Transport to a Deduplication appliance [for duplication or for backups]

smakovits's picture

Lastly, looking at this, is it accurate?

Device                             Total      Unallocated  Status
Appliance Operating System (sda)   930.38 GB  -            In Use
Base Unit Storage (unit_1)         4.5429 TB  0 GB         In Use
Expansion Unit Storage (unit_2)    35.470 TB  0 GB         In Use
Expansion Unit Storage (unit_3)    35.470 TB  11.482 TB    In Use

I cannot use all 72TB for dedupe, right? Only about 64TB?

 

Then, in the resize,

Note : The estimated time to resize the partition is 2 to 5 minutes.
The NetBackup processes are stopped before this operation begins, and are restarted automatically after this operation has completed. The NetBackup domain does not run any jobs during this time, and jobs that are currently in progress fail.
 
Partition: AdvancedDisk  
Current Size: 0 GB   
Used Size: 0 GB  
Unallocated Size: 11.482 TB  
New Size: *  
NetBackup Storage Unit Name: *  
NetBackup Diskpool Name: *

Looking at this, the 4TB internal disk is not available for use.

smakovits's picture

Here was a report:

************ Data Store statistics ************
Data storage      Raw    Size   Used   Avail  Use%
                  64.0T  61.4T  36.3T  25.0T  59%

Number of containers             : 206447
Average container size           : 221773843 bytes (211.50MB)
Space allocated for containers   : 45784544578190 bytes (41.64TB)
Space used within containers     : 45559891833423 bytes (41.44TB)
Space available within containers: 224652744767 bytes (209.22GB)
Space needs compaction           : 6713346381044 bytes (6.11TB)
Reserved space                   : 2910345928704 bytes (2.65TB)
Reserved space percentage        : 4.1%
Records marked for compaction    : 136169741
Active records                   : 632878871
Total records                    : 769048612
 

It would appear my appliance is not doing a very good job compacting...
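For what it's worth, that impression can be quantified straight from the report: a quick awk check over the exact byte counts above shows what fraction of the allocated container space is waiting on compaction.

```shell
# Compute what fraction of allocated container space is waiting on
# compaction, using the byte counts from the --dsstat report above.
allocated=45784544578190         # "Space allocated for containers"
needs_compaction=6713346381044   # "Space needs compaction"
awk -v a="$allocated" -v c="$needs_compaction" \
    'BEGIN { printf "%.1f%% of allocated space needs compaction\n", c / a * 100 }'
# prints: 14.7% of allocated space needs compaction
```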

Brook Humphrey's picture

Yes, if your ESX is using a SAN backend then you may get better backup performance by enabling the Fibre Transport features on the appliance.

So for the internal disk, 0GB is free, so you are definitely using it. As for the 64TB limit, you can go over this somewhat, as some of it will be used for maintenance the appliance performs, but for the most part yes. In this case, if you need to resize you can, and then move your dedupe storage to the external shelves.

Well, you may need to resize some, but to actually move it you will need to use the move option, not resize. Resize will just remove the space from the end, I believe; in this case, from the second external shelf.

 

Thanks 

 


Brook Humphrey's picture

Thanks. You're not anywhere near capacity and should be able to resize and move your dedupe volume easily.


smakovits's picture

OK, a couple of things.

The first is certainly this issue with my queue. Today it has grown to 2490313103.

So the question here: is this directly impacted by the fact that I am duping to tape? If yes, I obviously need to stop that, or maybe find a fix such as the one noted above.

Second is the discussion of the external disk. How can I look at what is using it? Looking at the AdvancedDisk option it is set to zero, so there is no AdvancedDisk; I would almost believe something is mis-configured. Once that is fixed, I believe it comes back to item one: configuring the jobs to utilize that disk to off-load work within the appliance so I can get some performance back.

 

Lastly, regarding resizing and moving the dedupe volume, what is involved and what am I gaining?

smakovits's picture

Brook,

 

Here is a better breakdown of the storage. I assume one issue would be the fact that the 4TB Base Unit (unit_1) is used for dedupe? I assume this is part of the issue you mentioned with bandwidth, 3Gb/s vs 6Gb/s. So I believe we obviously want to change this first and gain back that disk for our other plans.

 

Base Unit Storage (unit_1)
-------------------------------
Catalog      :      1 GB
Deduplication: 4.5419 TB

Expansion Unit Storage (unit_2)
-------------------------------
Deduplication: 35.470 TB

Expansion Unit Storage (unit_3)
-------------------------------
Deduplication: 23.987 TB

----------------------------------------------------------------
Partition     | Total       | Available   | Used        | %Used
----------------------------------------------------------------
AdvancedDisk  |        0 GB |        0 GB |        0 GB |     0
Deduplication |       64 TB |   18.564 TB |   45.435 TB |    71
Unallocated   |   11.482 TB |        -    |        -    |    -
 

Brook Humphrey's picture

The queue is not such an issue, really. The main thing is that you will want to install the EEB above so that the web GUI will be able to properly collect data and report back to you.

As far as the rest, it's not really an issue of being misconfigured. You didn't set it up wrong, but for your application there are things you can do to improve your performance, so it just needs to be reconfigured to better take advantage of your resources.

1. Try to move the deduplication volume to just the external shelves.

2. Once the move completes then allocate space for the advanced disk.

3. Then set up your SLPs to use the AdvancedDisk for backup, then duplicate to tape and to the deduplication storage. Set the backup to AdvancedDisk to expire after duplication.

 

If you have any other questions let me know

 

Thanks

 


smakovits's picture

OK. Well, I guess I will start with the move. I used the web GUI, clicked Deduplication, and told it to move. I then had the option of source and destination, so I selected the base unit storage and then the expansion unit (unit_3). I specified the TB number that is on unit_1 and said OK. I was then prompted with a message that said this can take 3 to 20 hours, and I said OK again. After that, I was back at the web GUI, so I guess my question is: what do I do now? How do I know the status of the move? Or do I just keep checking the web GUI until it shows my 4.5TB unallocated?

 

Secondly, I did apply the EEB last week for the GUI issue and things appear OK there now as well.

 

smakovits's picture

"Yes, if your ESX is using a SAN backend then you may get better backup performance by enabling the Fibre Transport features on the appliance."

Brook,

 

Regarding this comment, does it still apply if we are duping off to tape? I was talking with one of the other admins here, and he thought enabling these features would break the ability to write to tape. Therefore, we are waiting until we have the second appliance in place instead. Thanks

smakovits's picture

I guess this tells me...

 

Alerts from NetBackup Appliance

  Operation:  Move Deduplication unit_1 unit_3 4.5419 TB
  Status:     Failed

- NetBackup Appliance Alerts

Brook Humphrey's picture

Yes, I was kind of wondering about that. You may need to resize it, although it looked like you had plenty of space and should not have to.

With that failure to move, it would be good to get a support ticket open, as we now have some guys from Storage Foundation working in appliance support and they can get to the bottom of things pretty quickly for you.

 

Thanks


smakovits's picture

While I wait on support, the only update I have is my sweet queue size.

10696526325
 

Mark, keep it small right...?

Mark_Solutions's picture

Yep! - Keep it small and keep running it when you can - the other benefit is that after every 2 runs it does some tidy-up work, so it keeps your capacity optimised too.

Authorised Symantec Consultant

Don't forget to "Mark as Solution" if someone's advice has solved your issue - and please bring back the Thumbs Up!!

smakovits's picture

Brook, question.

 

My move finally completed and I got the dedupe pool off the internal disk. Now I am wondering about the configuration of the AdvancedDisk.

 

There is almost 7TB free on unit_3 and 4.5TB on unit_1 (internal disk); in total, that is 11.5TB of storage.

 

The question I have is about the best way to configure the AdvancedDisk. Do I want to go 11TB (the largest integer) and have it on the internal and external disk? Or do I want to go 4TB on the internal disk? Or 6TB on the external disk?

My thought was to put it onto the external disk only, to maximize all speeds. As noted, this keeps all appliance disk transfers at a theoretical 6Gb/s, right?

Brook Humphrey's picture

Well, if you have the space you can put it only on the external, and yes, you will get the faster disk speeds.

I'm not sure, as I have not heard of performance issues with the AdvancedDisk even on the slower drives, but yes, it would still perform faster on the external drives.

If you do span it over both, then it will run at 3Gb/s, but it will not affect the performance of the dedupe pool.

If you have any further questions let me know.

Thanks


smakovits's picture

OK, I will see if the 6TB of spare external disk is enough.

 

Now, I know you said queue size does not matter, but I am thinking I am at a point where it might:

 

39944977415
 

that's roughly 38TB, so essentially I am not able to process what comes in fast enough.

 

 

Brook Humphrey's picture

Ah, the queue size refers to the number of tlogs to process. It's not directly related to your storage size.

When you move the dedupe to the external drives, your queue processing also moves to the external drives, so that will speed this up.

 

If you have any other questions let me know

Thanks


smakovits's picture

OK, I did get the dedupe pool moved off of the internal disk to the external. I also have a 6TB AdvancedDisk pool on the external disk.

 

So, now the question is: what is the best way to make it so the appliance is not pulling double duty all day, running dupe jobs for 90+ hours?

To give you a little more background on this, we are using vault jobs, so when things complete, the vault kicks in and runs the dupe. I was curious if this is good, bad or indifferent. We are a shop that did nothing but tape until just now, so dupes are how we did things: one tape here and a dupe tape off-site.

Obviously this changes slightly with the appliance, and more so with 2, but that is not here now, so I must deal with what I have.

Is it better to use the vault job to run the dupe, or instead use SLPs? I have 2 thoughts on the SLPs. The first is the way you noted: go to AdvancedDisk, then use an SLP to write to dedupe and tape and expire the image, but obviously 300-some VMs will not fit into 6TB, so I could be limited.

The second was to do things as we do now, but instead of using the vault job to do the duplication to tape, use an SLP. While I am still learning about SLPs, it is my understanding that every 300 seconds it checks for completed jobs, and as systems finish their backup, it will dupe them to tape. So I can have numerous SLP jobs for all my VMs that are backed up with a single policy, as opposed to the way I believe the vault job works: waiting for the entire policy to complete before starting, meaning it has a lot of data to dupe before backups start again. My thought behind this, if I understand it right, is that dupes run simultaneously with the backup jobs, as opposed to waiting until they are all done.

Is my thinking correct here?

 

Brook Humphrey's picture

Well, here is the thing. It depends on your multiplexing, the number of drives you have, etc. But unless you need a catalog of the offsite images, vault has absolutely no advantage. SLPs are automatic, and yes, they would allow you to run multiple duplications at once instead of one at a time. Depending on your settings, they start shortly after the backup completes.

If you still need to catalog your offsite stuff, then you can simply set up your SLPs to do all the duplications and then set up vault to eject the tapes. You get the best of both then.

 

Thanks and let me know if you need anything else.


smakovits's picture

Thanks Brook. Still messing with things to get them to an acceptable level. Currently I moved all dev and test machines to a Data Domain appliance to reduce the work the appliance does processing data while it is trying to rehydrate and dupe off during backups and deduping...

Currently things are just running poorly and I may need to engage Symantec, but I also attribute a lot of the issue to the dupes to tape adding load.

The issue with multiplexing to tape is just that it will kill us during our DR testing, because instead of needing 1 tape at a time, we will need several for a single server, which could certainly lead to delays. However, since these are mostly VMs right now, I guess I wouldn't have to rely on tapes too much at DR, so testing some multiplexing might be a good thing.

Stay tuned as I continue to work through some of the pains.

Mark_Solutions's picture

I have just implemented an MSDP solution and followed the tuning given in this tech note for part of it:

http://www.symantec.com/docs/TECH189585

Most of the settings are already there in 7.5, but I have set the number of data buffers for disk and tape to 128 (tape buffer size at 256KB and disk at 1MB, with a fragment size of 5000MB for the disk STU), so the only real change I had to make was changing PrefetchThreadNum in contentrouter.cfg from 1 to 4.

These settings have produced really good rates. It has only been running a week so far, so it may slow down, I guess, but the difference in duplication speed that the contentrouter.cfg setting made was immediately obvious (it does need a restart to register this setting change).
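For anyone following along, that contentrouter.cfg change is a one-line edit. A sketch, rehearsed against a scratch copy first (the live file sits under the MSDP storage path, which varies per install, so check your own appliance before editing, and restart NetBackup afterwards):

```shell
# Rehearse the PrefetchThreadNum change against a scratch copy first;
# point cfg at the real contentrouter.cfg (under the MSDP storage path)
# only once the sed expression is verified, then restart NetBackup.
cfg=/tmp/contentrouter.cfg.scratch
printf 'PrefetchThreadNum=1\n' > "$cfg"   # stand-in for the live file
sed -i 's/^PrefetchThreadNum=.*/PrefetchThreadNum=4/' "$cfg"
grep '^PrefetchThreadNum=' "$cfg"
# prints: PrefetchThreadNum=4
```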

Hope this helps further

 


smakovits's picture

For sure Mark, I will have a look. What sort of rates are you seeing? And just so I am clear, you are duping to tape off the appliance, right?

Mark_Solutions's picture

I thought they were LTO4 drives but I am guessing they are LTO5 - these are Windows media servers rather than appliances, but the same MSDP under the hood. I was getting about 20-30 MB/s to each of 2 tape drives initially; after the changes I have seen sustained rates of over 200MB/s (hence I think they must be LTO5!!)

It's not always that fast and varies depending on what else the servers are doing, but when tested it just had duplications to do and the difference was very clear.


smakovits's picture

Put a value of 262144 into the <install path>\NetBackup\db\config\SIZE_DATA_BUFFERS file.

 

Mine shows 1048576, which is larger, so would I want to change this value as noted?
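For reference, the values in these files are byte counts, so the two compare directly; a one-liner makes the sizes obvious:

```shell
# SIZE_DATA_BUFFERS values are bytes; convert to KB for comparison.
for bytes in 262144 1048576; do
    awk -v b="$bytes" 'BEGIN { printf "%d bytes = %d KB\n", b, b / 1024 }'
done
# prints: 262144 bytes = 256 KB
#         1048576 bytes = 1024 KB
```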

smakovits's picture

(tape size at 256kb and disk at 1MB - fragment size of 5000MB for disk stu)

 

Also, how do you specify these settings? I just have the one: "Put a value of 128 into the <install path>\NetBackup\db\config\NUMBER_DATA_BUFFERS file."

smakovits's picture

Mark,

I am also working with support on this issue as a whole, and today they provided me with the following. I thought to ask what your thoughts were on all of it, as it contains a few extra changes beyond the ones you have noted and the most common ones in lots of articles.

echo 262144 > /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS

echo 256 > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS

echo 1048576 > /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS_DISK

echo 512 > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS_DISK

echo 262144 > /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS_FT

echo 16 > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS_FT

echo 128 > /usr/openv/netbackup/db/config/CD_NUMBER_DATA_BUFFERS

echo 524288 > /usr/openv/netbackup/db/config/CD_SIZE_DATA_BUFFERS

touch /usr/openv/netbackup/db/config/CD_WHOLE_IMAGE_COPY

echo 180 > /usr/openv/netbackup/db/config/CD_UPDATE_INTERVAL

echo 1500 > /usr/openv/netbackup/db/config/OST_CD_BUSY_RETRY_LIMIT

echo 1048576 > /usr/openv/netbackup/NET_BUFFER_SZ

echo 1048576 > /usr/openv/netbackup/NET_BUFFER_SZ_REST

echo NUM_DB_BROWSE_CONNECTIONS=40 > /usr/openv/var/global/emm.conf

echo NUM_DB_CONNECTIONS=41 >> /usr/openv/var/global/emm.conf

echo NUM_ORB_THREADS=51 >> /usr/openv/var/global/emm.conf

 

The master server config, manual changes to server.conf

-gn 40 -c 1280M -ch 3G -cl 1280M

Mark_Solutions's picture

The size and number of data buffers for tape and disk are probably OK - but to use a new tape buffer size you sometimes need to bplabel your tapes - take a look at the bptm logs to see what sizes they are actually using.

If the tape size has always been 262144 then you will be OK, but if the tapes were written with a different size you may need to label them before reusing them. When NetBackup loads a tape it reads the header first, and not only identifies the media ID but also assesses the block size that tape uses - so it could still be using an old block size. The bptm log will confirm the size being used.

The Fibre Transport buffers I guess are OK if you use FT

The other settings look OK, though some I have never used.

For the server config I usually try to use a ch value of 4G.

Let us know how it goes (talk about one change at a time! LOL!)


smakovits's picture

So, 4G instead of 3G?  What is this setting anyway?  I only used it as a suggestion from support.

 

I know my dupe-to-tape issue is not resolved. I am running a job: 7TB, 30% done, 34 hours in. Well, if I use your number, 200MB/s, and go safe at 150MB/s, I calculate this at about 13 hours... obviously not good.

 

So I wonder, is my dupe performance directly impacted by active backups? My thought is yes, but I just want to be clear. I assume everything is impacted when the system is trying to write and dedupe lots of data while at the same time trying to read and rehydrate data out to tape.

At the same time, my write performance is garbage at the moment too...3-7 MB/s so something is really wrong.

As suggested by some, I think I am ready to throw in the towel and contact our sales staff to get Symantec on site. While the information here on Connect is awesome and I have learned a ton, they should obviously know way more than me and be able to test and do things much more efficiently.

Mark_Solutions's picture

Lots of tuning possible but every environment is different and what helps for one site may hinder another!

The ch value is the maximum cache value - everything explained here : http://www.symantec.com/docs/HOWTO34665

Running backups and duplications at the same time is never going to be great - which I assume is why in 7.6 SLPs get a schedule window, and why others use a scheduler to suspend and enable them so that they don't run while backups are running.

Queue processing, garbage collection and rebasing can all also have an effect, and this is another juggling act - again, people are adding extra queue-processing runs to the crontab to keep it running lean.
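A minimal sketch of that crontab approach, as a config fragment only: the crcontrol path matches the one used earlier in this thread, --processqueue triggers an MSDP queue-processing run, and the noon schedule is just an example - pick a window clear of your backups.

```shell
# Example crontab entry (root on the appliance): kick off an extra MSDP
# queue-processing run at noon, outside the backup window. The schedule
# here is only an illustration; adjust it to your own quiet period.
# Install with: crontab -e
0 12 * * * /usr/openv/pdde/pdcr/bin/crcontrol --processqueue
```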

The readme for 2.5.3 is interesting... Better battery backup monitoring, catalog busy error fixes - when you read all of the content through, it does make you feel like you have to have it!

I guess now that 2.5.3 is here, Support / Symantec will say try that first?


smakovits's picture

I will take a look at the link and try to educate myself a bit.

 

As for 2.5.3, we did install it onto our secondary appliance since it is finally plugged in, and they added some nice features such as editing the buffer settings from the web GUI. My question would be: do those overwrite the custom files I created, or does it use the same files and modify the values, or is it neither, and my manually created files will still override these settings?

Mark_Solutions's picture

Any changes you make should just edit your existing BUFFER files.

What you do need to watch is that if you set a tape data buffer size or number, it will also use that for disk unless you add the disk ones specifically - a little strange, and it may be resolved in 2.5.3.
