
Netbackup Appliance Slow Rehydration to tape and bad backup performance

Created: 28 Aug 2014 • Updated: 29 Aug 2014 | 12 comments

Help! We run NetBackup 5230 appliances on 2.6.0.2. One of our appliances holds ms-standard Windows all-local-drives backups and VMware snapshot backups. Backups run at good speeds, but as soon as we start our SLPs to copy the data from MSDP to tape, the backups slow down to barely moving and the copy to tape is slow as well. As soon as we kill off the SLPs, backup speed goes back up again. We have a case open; backline engineers have taken a look and told us this is expected, because the rehydration needed for tape-outs slows things down. We set max concurrent write drives to 4 and moved our SLP schedule to daytime so it does not affect nighttime backups, but with these settings we will never get the month-end tape-outs done.

We thought of a couple other options

1. Back up to AdvancedDisk, then SLP to MSDP and/or tape. The issue here is that we would need to buy more trays and, more importantly, I think we would lose some NetBackup features. Last I checked, NetBackup Accelerator doesn't work with AdvancedDisk (I need to double-check).

2. Inline tape copies. During month-end, use inline copies to back up to MSDP and tape at the same time. The issue is that the job will run at the speed of the slower of the two destinations, but that might be okay if the speed is good.

 

Has anyone been through this, or does anyone have suggestions? Thanks.


SymTerry's picture

Rehydration to tape was my first thought too.

You are correct about Accelerator and AdvancedDisk. The Accelerator feature is not compatible with the following backup options:

  • Storage units of type Basic Disk, Advanced Disk, or Tape.
  • Client side encryption or compression.
  • Backup Selections that specify a UNC path are supported in NB 7.5.0.1 and higher.
  • Mapped drives are compatible, but are not enabled for NTFS change journal.

A storage unit used by the policy must support optimized synthetic backups: NetBackup PureDisk (MSDP, PureDisk, or appliance), Cloud storage plug-ins, and OpenStorage (OST) vendor-qualified devices (see the NetBackup Hardware Compatibility List in the related articles for details).

This is kind of a toss-up between accelerated backups and speedy duplications to tape. You will have to decide which you favor more. IMHO an AdvancedDisk pool is the right answer: backups get done, then they can tape-out at their leisure. However, you might not have enough disk to do that. In those environments they just use the dedupe pools to satisfy restore requests and tape offsite for retention requirements.

As a pitch for the 5330: it has a lot more disk and is much faster, so rehydration should be better when an SLP duplicates to tape.

bmaro's picture

Thanks SymTerry. One other important thing I forgot to mention is that we are also going to be required to use AIR to replicate images to an offsite location (from this appliance to an appliance at another location). Last I checked, that is not supported with AdvancedDisk. Is that so, and if it is, any other suggestions? And can you clarify your last line about the 5330, please? Thanks.

Riaan.Badenhorst's picture

Hi,

 

Yes, you need dedupe disk for AIR, so advanced disk can't be used.

 

 

Regards,

Riaan Badenhorst

You need an OpenVision to see the truth about Backups. Restores are a plus. But that's just Semantics ;)

ITs easy :)

Mark_Solutions's picture

Do you just see the issue with one appliance?

On the 5220 appliances it used to be a sign that the RAID battery had  failed but as you have a 5230 it doesn't have one!

A failed disk can also slow them right down so worth doing a full hardware check.

The other main things that make them go slow are:

1.  when the processqueue has got really large - trimming it down by keeping it running regularly can really help

2. When it is more than 80% full .. best to always try not to let it get more than 80% full if possible to maximise performance.

The only other thing I have seen cause it is when people do VMware backups with multiple paths and it causes a memory leak and orphaned files that fill up the system drive .. but as you are on 2.6.0.2 that shouldn't really be happening either..

Maybe some of the above will help...

Authorised Symantec Consultant

Don't forget to "Mark as Solution" if someones advice has solved your issue - and please bring back the Thumbs Up!!.

bmaro's picture

Thanks Riaan and Mark.....

Our queue size looks like this:

total queue size : 456871

creation date of oldest tlog : Tue Sep  2 00:37:12 2014
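
For anyone who wants to track these two values over time, here is a minimal Python sketch that parses text in the shape posted above and reports how old the oldest tlog is. The function names are made up for illustration, the sample text is simply the two lines from this post, and the date parsing assumes an English locale.

```python
import re
from datetime import datetime

# Sample text copied from the post above; in practice this would be
# captured from whatever command produced the queue information.
QUEUE_INFO = """\
total queue size : 456871
creation date of oldest tlog : Tue Sep  2 00:37:12 2014
"""


def parse_queue_info(text):
    """Return (queue_size, oldest_tlog_datetime) from queue-info style text."""
    size = int(re.search(r"total queue size\s*:\s*(\d+)", text).group(1))
    stamp = re.search(r"creation date of oldest tlog\s*:\s*(.+)", text).group(1)
    # Collapse the padding space in front of single-digit days before parsing.
    oldest = datetime.strptime(" ".join(stamp.split()), "%a %b %d %H:%M:%S %Y")
    return size, oldest


def tlog_age_hours(oldest, now):
    """Age of the oldest tlog in hours, relative to the supplied time."""
    return (now - oldest).total_seconds() / 3600.0


size, oldest = parse_queue_info(QUEUE_INFO)
print(size, oldest.isoformat())
```

Logging the size and age on a schedule would show whether the queue keeps growing while duplications to tape are running.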

 

We checked hardware/disk and all looks good. Symantec was on our appliance for several days looking around and basically pointing to duplication to tape, but that leaves us having to choose between writing to AdvancedDisk so we can get the month-end tape-outs done, which means giving up features like AIR and NetBackup Accelerator, or keeping those features and not being able to complete the tape-outs. Both kinda stink.

So far this is the only appliance we are seeing this on. This appliance does our VMware snapshots (it is also the backup host) along with regular ms-standard backups. As soon as duplication to tape starts, performance goes right down the drain; as soon as the duplication is killed, performance goes back to normal. We're not seeing it on any of our other appliances, such as the ones where we run DB2/SQL and log backups. The appliance having issues is 45% used (45 TB). The other appliances, which are not having issues, are about 80% used. The memory leak sounds interesting: we have received some off-hours on-call calls saying /disk is 280% full, which is odd, and then when we log in and check, /disk is showing 64% full. Thanks for all of the help.

 

Mark_Solutions's picture

What data buffer sizes do you use on this appliance?

What fragment size on its dedupe disk storage unit?

Are these different to your other appliances?

Have you included the header disk in your de-dupe volume? If so move it off so it is all on the shelf - that will help! (what i mean is you have 4TB of disk available on the main appliance which can get blended into the de-dupe pool to make it as big as possible .. but if you use that it will slow your de-dupe pool down)

Hope this helps


bmaro's picture

Thanks Mark, the header disk is not part of the dedupe volume on any of our appliances. All the data buffer sizes and fragment sizes used to be the same across appliances, but we have since changed them trying to fix this problem (we were having the problem when they were all the same). Below is what they are currently.

 

Appliance having issue,

Puredisk settings:

max frag size 51,200

max concurrent jobs 30

NUMBER_DATA_BUFFERS 128

NUMBER_DATA_BUFFERS_DISK 512

NUMBER_DATA_BUFFERS_RESTORE 512

 

SIZE_DATA_BUFFERS 1048576

tape unit being written to

max concurrent write drives: 1 (this was 4, then 2; we are trying to find a sweet spot, but with such a low setting it is taking forever)

Reduce frag size to 25,600

 

Thanks.

Mark_Solutions's picture

WOW .. SIZE_DATA_BUFFERS at 1048576 .. never tried that .. I always use it for disk but never go above 262144 for tape... and the numbers are pretty high too....

What is your size data buffers for disk?

I would try 1048576 for disk and 262144 for tape, both with 64 number buffers, and with a 5000MB fragment size for the dedupe disk storage unit .... run a backup with those settings and see how it duplicates to tape (try it with a test policy and a test volume pool so that you can use a new, previously unused tape to ensure it takes the change in data buffer size).

See how that goes..


bmaro's picture

Thanks Mark,

I will give that a try.  Below is what they currently are from the CLISH menu.

 

btsdesym0.Settings> NetBackup DataBuffers Number Show
name:btsdesym0:
NUMBER_DATA_BUFFERS                               : 128
NUMBER_DATA_BUFFERS_DISK                          : 512
NUMBER_DATA_BUFFERS_FT                            : 16(Default)
NUMBER_DATA_BUFFERS_RESTORE                       : 512
btsdesym0.Settings> NetBackup DataBuffers Size Show
name:btsdesym0:
SIZE_DATA_BUFFERS                                 : 1048576 B
SIZE_DATA_BUFFERS_DISK                            : 262144 B(Default)
SIZE_DATA_BUFFERS_FT                              : 262144 B(Default)
SIZE_DATA_BUFFERS_MULTCOPY                        : 262144 B(Default)
SIZE_DATA_BUFFERS_NDMP                            : 262144 B(Default)
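
To compare these values between appliances without eyeballing them, the output above can be parsed into a dictionary. This is only a sketch in Python: it assumes the output keeps the `NAME : value [B][(Default)]` shape shown in this post, and the names `parse_buffers` and `SHOW_OUTPUT` are invented for the example.

```python
import re

# A few lines copied from the CLISH output above, including a prompt line
# and the "name:" line, which the parser should ignore.
SHOW_OUTPUT = """\
btsdesym0.Settings> NetBackup DataBuffers Number Show
name:btsdesym0:
NUMBER_DATA_BUFFERS                               : 128
NUMBER_DATA_BUFFERS_DISK                          : 512
NUMBER_DATA_BUFFERS_FT                            : 16(Default)
SIZE_DATA_BUFFERS                                 : 1048576 B
SIZE_DATA_BUFFERS_DISK                            : 262144 B(Default)
"""

# Upper-case setting name, a colon, an integer, an optional "B" unit,
# and an optional "(Default)" marker.
LINE = re.compile(r"^\s*([A-Z_]+)\s*:\s*(\d+)\s*B?\s*(\(Default\))?\s*$")


def parse_buffers(text):
    """Map each setting name to (value, is_default)."""
    settings = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            settings[match.group(1)] = (int(match.group(2)), match.group(3) is not None)
    return settings


settings = parse_buffers(SHOW_OUTPUT)
for name, (value, is_default) in sorted(settings.items()):
    print(name, value, "default" if is_default else "customised")
```

Running the same parser against the output from a healthy appliance and diffing the two dictionaries makes any mismatch obvious.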
 

 

Mark_Solutions's picture

OK - looks all wrong to me .. try the below and you can later try increasing the number but not the size .... you will need to use new tapes or relabel empty ones to get them to actually use the changed data buffer size.

NUMBER_DATA_BUFFERS                               : 64
NUMBER_DATA_BUFFERS_DISK                     : 64

SIZE_DATA_BUFFERS                                 : 262144
SIZE_DATA_BUFFERS_DISK                       : 1048576

#EDIT# Looks like someone confused the tape and disk sizes .. but numbers do look very high also
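
As a rough way to see why the numbers look high, the commonly quoted tuning rule of thumb is that each active stream uses about NUMBER x SIZE bytes of shared memory for its buffers. The Python sketch below applies that rule to the values in this thread; treat it as an approximation under that assumption rather than a statement of how the appliance actually allocates memory, and note that `buffer_mib` is just an illustrative helper.

```python
MIB = 1024 * 1024


def buffer_mib(number, size_bytes, streams=1):
    """Approximate buffer memory in MiB: number x size x concurrent streams."""
    return number * size_bytes * streams / MIB


# Values currently set on the appliance (from the CLISH output earlier).
current_tape = buffer_mib(128, 1048576)
current_disk = buffer_mib(512, 262144)

# Values suggested in this post.
suggested_tape = buffer_mib(64, 262144)
suggested_disk = buffer_mib(64, 1048576)

# Current tape settings with 4 concurrent write drives, one stream each.
current_tape_4_drives = buffer_mib(128, 1048576, streams=4)

print(current_tape, current_disk, suggested_tape, suggested_disk, current_tape_4_drives)
```

Under that rule the current tape settings work out to about 128 MiB per stream against 16 MiB with the suggested values, and the difference multiplies with every concurrent write drive.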


bmaro's picture

Great thanks Mark appreciate it.  Symantec backline were the ones that set those, they had told us that they are now recommending higher numbers but I'm going to try your settings and will let you know how it goes.  Thanks again.

skiern's picture

Hi bmaro,

 

We have the exact same issue with our NetBackup 5230 appliances on 2.6.0.3. Did changing those settings help?

Our buffers are set with the following values:

 

NUMBER_DATA_BUFFERS                               : 256
NUMBER_DATA_BUFFERS_DISK                     : 512

SIZE_DATA_BUFFERS                                 : 262144
SIZE_DATA_BUFFERS_DISK                       : 1048576