Slow Duplication to Tape in Netbackup 6.5.2

Created: 05 Feb 2009 • Updated: 06 Jul 2010 | 20 comments

I can't get more than 30 MB/s when duplicating from disk to tape.  I've tried duplicating from different DSU's on SAN, NAS, and local storage.  None will yield greater than 30 MB/s.

 

However, I can back up data on the same SAN, NAS, and local storage through the same media server to the same tape drive at 70 - 80 MB/s.  That tells me that the source disk and the tape drive aren't my bottleneck.

 

I don't understand why a backup can be fast but a duplication with the same source and target can be so slow. 

 

I've had a case open with support for months and they haven't been much help.  I'm hoping someone here can help me.

 

Thanks for your time.

sdo's picture

Methinks "NUMBER_DATA_BUFFERS".

 

Search this forum.

Spend a good hour or two reading.

Don't play.

Plan your change.

Plan your test.

Be sure of your expectations.

Plan your roll-back.

De-activate ALL policies before you make your change.

Make one change (at a time) when your system is quiet.

Does your change require an application restart?

Do NOT perform any dups or backups yet!

Perform a medium sized restore (several gig).

Does it still work? No? Roll-back and plan again.

Run one duplication.

Make the new duped image the primary.

Run a restore from the duped image.

Does it still work?  No?  Roll-back and plan again.

Activate your test policy.

Perform a backup with your test policy.

Does it still work? No? Roll-back and plan again.

Perform a restore from that backup.

Does it still work? No? Roll-back and plan again.

 

Any improvements?

Willing to take a risk/chance with your next backup session?  No? Roll-back and re-read all your gathered notes/docs again, and plan again.

 

HTH.
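
The checklist above amounts to a gated sequence: each step must pass before the next one runs, and the first failure triggers a roll-back. A minimal sketch of that control flow follows. The step names and the check functions are hypothetical placeholders (each one would be a manual verification or a script of your own); nothing here calls NetBackup.

```python
from typing import Callable, List, Tuple

# A step is a label plus a check that returns True when the step succeeded.
Step = Tuple[str, Callable[[], bool]]


def run_gated_plan(steps: List[Step], roll_back: Callable[[], None]) -> List[str]:
    """Run steps in order; on the first failure, roll back and stop.

    Returns the names of the steps that passed.
    """
    passed: List[str] = []
    for name, check in steps:
        if not check():
            roll_back()  # "Does it still work? No? Roll-back and plan again."
            break
        passed.append(name)
    return passed


if __name__ == "__main__":
    # Hypothetical checks: replace each lambda with a real verification.
    plan = [
        ("medium-sized restore", lambda: True),
        ("one duplication", lambda: True),
        ("restore from the duplicated image", lambda: True),
        ("backup with the test policy", lambda: True),
        ("restore from the test backup", lambda: True),
    ]
    print(run_gated_plan(plan, roll_back=lambda: print("rolling back")))
```

The point of the structure is that a later step never runs on top of an unverified earlier one, which is what makes a single change easy to attribute and easy to undo.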

Ryan Greeley's picture

Thanks for the reply sdw303.  Are you saying that an incorrect setting in "NUMBER_DATA_BUFFERS" could cause fast backups but slow duplications?  I'm not sure I understand - both operations read from disk and write to tape.

 

Ryan

sdo's picture

Ooops - I hate being caught out.  ;)

 

So, are you saying that the backup data comes from the same NAS head, same SAN ports, through the same switches, off the same logical volumes, off the same LUNs, off the same parity groups, off the same spindles - as the duplication data?  i.e. Does the data that is backed up reside on the same logical disk (set) as the image to be duplicated?

 

How is your disk storage unit configured?

sdo's picture

DSUs and DSSUs is an area that I'm weak on.

 

I'll try to describe my simple take on the differences (could someone else please correct me - or provide a better explanation..thanks) between the processes that are actually taking place...

 

Your backup:

disk --> bpbkar --> SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS --> bptm --> tape

 

Your duplication:

disk --> bpdm --> SIZE_DATA_BUFFERS_DISK and NUMBER_DATA_BUFFERS_DISK --> SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS --> bptm --> tape

 

 

Though simple, is this correct? Can anyone else comment?

 

So, I can't help but think that the interaction between bpdm and bptm for the duplication job is not optimally tuned, i.e. for the inter-memory transfer of (probably) large blocks/fragments (SIZE_DATA_BUFFERS_DISK) from a disk based image to the (likely) breakup into smaller (SIZE_DATA_BUFFERS) blocks for writing to tape - but it's not just a size thing - the number of buffers for disk and tape will have an effect too.

Ryan Greeley's picture

sdw303 wrote:

Ooops - I hate being caught out.  ;)

 

So, are you saying that the backup data comes from the same NAS head, same SAN ports, through the same switches, off the same logical volumes, off the same LUNs, off the same parity groups, off the same spindles - as the duplication data?  i.e. Does the data that is backed up reside on the same logical disk (set) as the image to be duplicated?

 

How is your disk storage unit configured?

 

Thanks for the reply.  The answer to all of your questions is Yes.  I stood up a dedicated server to test this issue and ran a backup to a DSU on the D: drive.  I then backed up the folder containing the backup image to tape and got 75 MB/s.  Next I duplicated the backup image to tape and only got 25 MB/s.
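
To put those two rates in perspective, the sketch below converts a throughput figure into wall-clock time. The 1024 GB image size is a made-up figure for illustration only; the 75 MB/s and 25 MB/s rates are the ones measured above.

```python
def hours_to_write(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_gb gigabytes at a sustained rate in MB/s."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb * 1024 / rate_mb_per_s / 3600


if __name__ == "__main__":
    size_gb = 1024  # hypothetical amount of data to duplicate
    for label, rate in (("backup to tape", 75), ("duplication to tape", 25)):
        print(f"{label}: {hours_to_write(size_gb, rate):.1f} h at {rate} MB/s")
```

Whatever the image size, a duplication at one third of the backup rate takes three times as long to write the same data.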

 

 

Ryan Greeley's picture

sdw303 wrote:

 

Your backup:

disk --> bpbkar --> SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS --> bptm --> tape

 

Your duplication:

disk --> bpdm --> SIZE_DATA_BUFFERS_DISK and NUMBER_DATA_BUFFERS_DISK --> SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS --> bptm --> tape

This is an interesting point - I'll do some tweaking of SIZE_DATA_BUFFERS_DISK and NUMBER_DATA_BUFFERS_DISK and see if it makes any difference.

 

Ryan
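
For anyone following along: NetBackup reads each of these settings from a small text file of the same name that holds a single integer, normally in the netbackup/db/config directory of the media server install. The exact directory is an assumption here and should be checked against the tuning guide for your release, so the sketch below takes it as a parameter and the demo writes to a temporary directory instead. The helper name is hypothetical. The check that SIZE values are a multiple of 1024 reflects the documented requirement as I understand it.

```python
from pathlib import Path
import tempfile

# Setting names discussed in this thread.
ALLOWED = {
    "SIZE_DATA_BUFFERS",
    "NUMBER_DATA_BUFFERS",
    "SIZE_DATA_BUFFERS_DISK",
    "NUMBER_DATA_BUFFERS_DISK",
}


def write_buffer_setting(config_dir: Path, name: str, value: int) -> Path:
    """Write one buffer setting as a single-integer text file in config_dir."""
    if name not in ALLOWED:
        raise ValueError(f"unknown setting: {name}")
    if value <= 0:
        raise ValueError("value must be a positive integer")
    if name.startswith("SIZE_") and value % 1024 != 0:
        raise ValueError("buffer size must be a multiple of 1024 bytes")
    path = config_dir / name
    path.write_text(f"{value}\n")
    return path


if __name__ == "__main__":
    # Demo against a throwaway directory, not a real NetBackup install.
    with tempfile.TemporaryDirectory() as tmp:
        p = write_buffer_setting(Path(tmp), "SIZE_DATA_BUFFERS_DISK", 262144)
        print(p.name, "=", p.read_text().strip())
```

Keeping a copy of the previous file contents before each change gives you the roll-back that sdw303's checklist asks for.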

sdo's picture

Ryan, before you start tuning - can I suggest that you do some research - like I said, I'm weak on DSU/DSSU and I can't be sure that I've understood the concepts correctly.  I'm hoping that I'm close, but I don't know for sure.

 

Please post back with whatever you think might be of interest.  I know that I, for one, am keen to understand the differences in more detail between a disk based backup and a disk based duplication.

S.H's picture

Duplication buffer size is

---read side: same size as used for the backup,

---write side: SIZE_DATA_BUFFERS (default : 64K (tape), 256K(disk) ) <-- another Media server  

 

 

---------------------------------------------------------------------------------------

local backup:

[disk] --> bpbkar --> SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS --> bptm --> [tape]

 

**bpbkar read size = SIZE_DATA_BUFFERS

**bptm write size = SIZE_DATA_BUFFERS

 

remote backup and local encryption backup:

[disk] --> bpbkar --(socket)--> bptm --> SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS --> bptm --> [tape]

 

**bpbkar read size = 512k Byte

**bptm write size = SIZE_DATA_BUFFERS

 

duplication:

[disk] --> bpdm --> SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS --> bptm --> [tape]

 

**bptm read size = same size as used for the backup

**bptm write size = same size as used for the backup (Except remote media server)

---------------------------------------------------------------------------------------

 

Duplication doesn't use bpbkar, and it doesn't pass through two sets of data buffers.

 

I think this is correct, but I'm not certain.

Message Edited by S.H on 02-06-2009 03:33 PM
Message Edited by S.H on 02-06-2009 03:36 PM
Message Edited by S.H on 02-06-2009 03:50 PM
sdo's picture

I suppose a traditional client backup might be better put as:

 

remote backup and local encryption backup:

[disk] --> bpbkar --> BUFFER_SIZE --> (socket) --> NET_BUFFER_SZ --> bptm --> SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS --> bptm --> [tape]

 

**bpbkar read size = 512k Byte

**bptm write size = SIZE_DATA_BUFFERS

 

 

And your "duplication" might be better entitled "local DSU to tape duplication":

local DSU to tape duplication:

[disk] --> bpdm --> SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS --> bptm --> [tape]

 

 

 

 

You don't seem to think NUMBER_DATA_BUFFERS_DISK and SIZE_DATA_BUFFERS_DISK have anything to do with bpdm reading the disk image.  Does anyone else know specifically what these parameters are used for?

S.H's picture

Hi sdw303,

 

Buffer size and number are determined by the type of storage unit (STU).

 

Ryan's case,

------------------------------------------------------------------------------------

1) local backup to Disk STU:

[disk] --> bpbkar --> SIZE_DATA_BUFFERS_DISK and NUMBER_DATA_BUFFERS_DISK --> bpdm --> [disk]

 

2) local DSU to tape duplication:

[disk] --> bpdm --> [same buffer size as used for the backup / NUMBER_DATA_BUFFERS ] --> bptm --> [tape]

 

And,

 

x)

1) local backup to Tape STU:

[disk] --> bpbkar --> SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS --> bptm --> [tape]

------------------------------------------------------------------------------------

 

Message Edited by S.H on 02-06-2009 07:28 PM
Message Edited by S.H on 02-06-2009 07:34 PM
road_warrior's picture

I'm getting a very similar, but worse, problem using NBU 6.5.3.  All backups are directed to basic disk storage units on the Windows 2003 master/media server.  Backups to the SAS-attached disk run ~45-60MB/sec.  No backups go initially to tape, period.  We did run preliminary test backups to tape to verify the h/w config and got speeds around 30-35MB/sec on LTO4 drives.  Not great, but ok.  I expired all test backup images prior to running Vault.  Vault is configured to run duplication from the direct-attached disk STUs to SAN-attached LTO-4 drives and runs at a time when no other jobs can start.

 

Duplication speed is between 5.6MB/sec and 9.3MB/sec.  I've changed the SIZE_DATA_BUFFERS to 262144 (256K) and NUM_DATA_BUFFERS to 16, 32, and 64, stopping/starting all services before each test, but have only seen minor improvement with each change.  Backups to either disk or tape are much faster than dupes run on the same local server, which is very puzzling.
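
For reference, the shared memory each of those combinations asks for can be worked out with the usual rule of thumb: buffer size times number of buffers, times the number of drives, times the multiplexing level. That formula is an assumption taken from general NetBackup tuning guidance rather than from anything stated in this thread, and the single drive with no multiplexing below is also an assumption. A minimal sketch:

```python
def shared_memory_bytes(size_data_buffers: int, number_data_buffers: int,
                        drives: int = 1, mpx: int = 1) -> int:
    """Shared memory implied by the buffer settings (rule-of-thumb formula)."""
    return size_data_buffers * number_data_buffers * drives * mpx


if __name__ == "__main__":
    size = 262144  # 256K, the value tried above
    for number in (16, 32, 64):
        mib = shared_memory_bytes(size, number) / (1024 * 1024)
        print(f"{number} buffers of {size} bytes: {mib:.0f} MiB per drive")
```

With these values the totals are small, so if duplication speed barely moves between 16 and 64 buffers, the bottleneck is probably not the amount of buffer memory.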

 

Anyone else duping from disk to tape with 6.5.x and getting good performance? 

road_warrior's picture

We've been troubleshooting this for 2 weeks and still have gotten no improvement in duplication performance.  Did any of you resolve the issues you were having with Vault duplication performance?

 

thanks,

james 

Ryan Greeley's picture

Hi Road_Warrior,

 

No, I haven't resolved my issue yet.  I'm still working with Support but they're not doing much besides asking for the same logfiles over and over again.

 

I hope to have more time to devote to this issue next week.

cbull's picture

I came across this article about speeding up duplication if using an EMC EDL with an embedded media server. This is what we use and I am about to test if it works.

http://thebackupblog.typepad.com/thebackupblog/200...

shans's picture

Are you duplicating your images through Vault?

Stefaan_M's picture

I'm currently facing the following:

1. backups go primarily to a DSSU (SUN X4500 with ZFS diskpool)
2. duplication with the staging schedule goes well: 120MB/s (disk -> LTO4)
3. duplication with Vault is awfully slow: 5MB/s, although the source and destination are exactly the same as with the staging

Any ideas what can cause this?

Stefaan.

PS : parameters set: SIZE_DATA_BUFFERS & NUMBER_DATA_BUFFERS

Arella's picture

Hey guys, I am getting a maximum of 120MB/s on all jobs after all is said and done. Here is why.

I stayed away from fiber drives. Even though they are cutting edge, I did not see how they could speed up backups beyond SCSI, since SCSI has a theoretical maximum speed greater than FC at this point. Of course, if you have 4Gbps drives, it's a different matter. But I will wait until some rich company tests and confirms.

I eliminated all older drives like LTO2s. Even though they are handy, I do not believe they have any business sitting on the SCSI chain.

I replaced all SCSI cables with the shortest possible runs and tweaked and tested my way for months. And one day, I saw those sweet 120MB/s numbers in my iostat. I was thinking that was only a quarter of what the drives are capable of. So my next plan is to get a faster server approved and move NetBackup to newer Sun hardware. I am hoping that will push the throughput a little more.

But, take your time tweaking the buffers. And most of all, know what your hardware is capable of. Any good System Admin will figure it out over time.

gc_bus's picture

Hmmm pretty much the same here. I've run tests to Puredisk SU and it goes like the clappers; as do direct multiplexed backups to tape. However, using Netbackup SLP's with backup to disk and dupe to tape, the tape duplications are much slower. I think in this case a lot of it is to do with only being able to single-thread the tape in this configuration. This makes our LTO1 fibre drives "shoe-shine" which slows them down considerably.

Trevor_Jackson's picture

Hi guys,

The answer is indeed in how you are backing up the data.

We have removed all multiplexing and set them to 1 on our backups with large amounts of small files as they run quicker. 

When you backup to DSSU's using multiplexing, it can open multiple streams of data and fully uses the buffers.

A trusted Symantec engineer once told me that duplication to tape does not use multiplexing, but this turned out to be incorrect.

To test this, you will need two backup policies, one configured to use multiplexing and one not configured to use multiplexing. The latter will duplicate to tape much quicker, as the backup server has fewer calculations to perform. Imagine your disk images as a large tar-ball. It will need to first extract the tar-ball, then re-send it to tape using Vault.

The direct-to-tape backup uses the hardware buffers of the tape drives, which is why it is quicker. Also, the metadata created when multiplexing is used is greater, which is another chunk of processing for the media server.

So here is my suggestion, which I have got to work in real life: remove multiplexing from client --> DSSU backups. Add multiplexing to direct-to-tape backups, and for large amounts of data with lots of small files, leave it at around 6 or the restores take a while longer, again due to the metadata processing. Symantec may not wish to admit this, but the explanation I received and the results I obtained are enough for me.

Now our Vault job times have been reduced by 2/3 and everything is sweet!!!

Trevor

Trevor_Jackson's picture

 So this is what I believe happens:

Client to DSSU
1) Data is backed up in smaller chunks and then metadata is collected to build an image file on DSSU
2) the image files and metadata are created in special Symantec tar-ball format.

DSSU to tape using Vault
1) The vault profile checks the DSSU for images which meet the criteria of the profile.
2) Then it reads each metadata file, obtaining a list of image files
3) It then creates its own metadata to create a list of sizable chunks to split the data into buffers (using SIZE_DATA_BUFFERS & NUMBER_DATA_BUFFERS)
4) Then the backup processes pass the information to the buffers; if they are full, it waits (there is no shoe-shining on LTO4 drives). The buffer amounts have to be less than or equal to the amount of buffer space each drive has.
5) If multiplexing is used then there are too many little chunks; the buffers of the drives fill up and do not empty fast enough for the media server buffers to send the next batch, which is why the delay looks like shoe-shining.
6) Now if you configure the buffers correctly for the tape drives, in most cases the disk backups will slow down, as they have more and better buffers. So my advice here is to leave everything as is for your first tests.

sdw303's advice on the testing side of things is completely right: the slightest mistake and you have to remove all settings and start again. So play gently, people.

If anyone needs more info, let me know.

Trevor