
BRBACKUP to Tape - write performance

Created: 28 Feb 2013 | 10 comments

I am seeking to improve write performance to LTO4 tape. As it stands today the backups are working fine with a reasonable write speed, but as our databases grow it is taking longer and longer for the backups to finish.

 

In the example below, I'll attempt to explain the design.

 

We have a ~300 GB database which takes around 1 hour to back up to fibre-attached LTO4 Ultrium tape drives. The backup is run by backint (brbackup) from the media server.
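
Rough arithmetic on that, assuming my figures are right and that the 16 streams share 4 drives (MPX 4):

  300 GB in ~1 hr  =  300 * 1024 MB / 3600 s  ~  85 MB/s aggregate
  85 MB/s / 4 drives  ~  21 MB/s per LTO4 drive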

Below are the current parameters we use:

DRIVES = 16 (in initSID.utl file which means 16 streams)

MPX = 4  (in the NBU Policy which means 4 streams per tape/drive)

NET_BUFFER_SZ = 262144

NUMBER_DATA_BUFFERS = 12

SIZE_DATA_BUFFERS = 65536

Fragment size on the LTO4 drive = 1048575

 

I have tried to tweak some of the above parameters, e.g. NUMBER_DATA_BUFFERS to 16 and SIZE_DATA_BUFFERS to 1048575. I have also tried to increase the number of streams per drive, and tried to throw more drives at the backup by decreasing the MPX, but nothing seems to make any difference in the overall backup time. I have seen 3 out of 16 streams go much faster when I tweak the parameters, but it doesn't matter because the total time is determined by the slowest stream.

I am looking for ideas from the experts here; I ultimately have to apply this solution to a 3 TB (and growing) database which takes around 8 hours to back up to tape. Please let me know if you need any more information.

Appreciate your responses in advance.


Comments (10)

jim dalton:

You've not mentioned how the drives and the data are connected... directly, via switch, on another master/media server? Nor how the data exists on disk, i.e. what lies behind the storage. Nor the performance of the server while this is happening. More detail please!

Jim

urnbux:

The disks are PowerPath BCVs on a Symmetrix array, connected directly via Brocade, and so are the tape drives on the same media servers.

 

 

# lslpp -L |grep -i emc
  EMC.Symmetrix.aix.rte      5.3.0.5    C     F    EMC Symmetrix AIX Support
  EMC.Symmetrix.fcp.rte      5.3.0.5    C     F    EMC Symmetrix FCP Support
  EMCpower.base              5.3.1.1    C     F    PowerPath Base Driver and
  EMCpower.encryption        5.3.1.1    C     F    PowerPath Encryption with RSA
  EMCpower.migration_enabler
  EMCpower.mpx               5.3.1.1    C     F    PowerPath Multi_Pathing
  SYMCLI.SYMRECOVER.rte     7.2.0.11    C     F    EMC Solutions Enabler
  devices.common.IBM.modemcfg.data
 
For system performance, I am not very good at analyzing it, but memory seems to be the bottleneck; please find the attached. And please let me know if I left anything unanswered. Much appreciated!
 
    
 
 
 
Attachment: perf.docx (154.19 KB)
revaroo:

What performance are you getting from the slowest stream? What is that stream backing up?

You are achieving 83 MB/s backup speed. That seems pretty good to me.

 

Mark_Solutions:

For LTO4 and LTO5 I always use 262144 for SIZE_DATA_BUFFERS and 64 for NUMBER_DATA_BUFFERS (32 if you don't have enough RAM to cope).
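
On the media server these are just the standard touch files, so the change is something along these lines (double-check the paths on your system; bptm only reads them when a new backup starts):

# echo 262144 > /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS
# echo 64 > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS

That works out to roughly 64 x 256 KB = 16 MB of shared memory per busy tape drive, which is why I say drop back to 32 buffers if RAM is tight.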

You may find you need to relabel your tapes to get them to use the new buffer size; otherwise they will carry on using the old size, as it is written to the tape header.

Hope this helps


Marianne:

Please post bptm log as file attachment.

I am interested in 'waited for full/empty buffers....'
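
If you want to pull those lines out yourself first, something like this on the media server should do it (assuming the default bptm log location; pick the log file for the day of the backup):

# grep "waited for" /usr/openv/netbackup/logs/bptm/log.*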

Which database type? If Oracle, there are various Oracle RMAN performance-tuning docs available on the web.


urnbux:

Marianne:

 

Uploaded the bptm log.

 

It's Oracle data files, but not quite RMAN; we use brbackup to take an online backup from the source and place it on the BCVs on the media server, then it goes from BCV to tape.

Attachment: log.docx (137.99 KB)
Marianne:

*blush* Granny should not post late at night when she's tired....

Of course, your subject says 'brbackup'... *blush*

BCV LUN layout could be a problem. Or maybe too many devices are connected to a single HBA?

This is what I got from bptm:

 

01:18:28.174 [7471328] <2> write_backup_completion_stats: waited for full buffer 86520 times, delayed 215521 times
01:18:28.208 [7471328] <4> write_backup_completion_stats: successfully wrote 4 of 4 multiplexed backups, total Kbytes 75175563 at 19246.861 Kbytes/sec
 
01:18:46.350 [4063428] <2> write_backup_completion_stats: waited for full buffer 89433 times, delayed 222827 times
01:18:46.379 [4063428] <4> write_backup_completion_stats: successfully wrote 4 of 4 multiplexed backups, total Kbytes 74457361 at 19095.994 Kbytes/sec
 
01:23:31.399 [4522050] <2> write_backup_completion_stats: waited for full buffer 85733 times, delayed 232719 times
01:23:31.427 [4522050] <4> write_backup_completion_stats: successfully wrote 4 of 4 multiplexed backups, total Kbytes 75052442 at 18071.575 Kbytes/sec
 
01:25:21.832 [10158162] <2> write_backup_completion_stats: waited for full buffer 86726 times, delayed 239529 times
01:25:21.859 [10158162] <4> write_backup_completion_stats: successfully wrote 4 of 4 multiplexed backups, total Kbytes 74300831 at 17494.226 Kbytes/sec
 
No buffer tuning is going to help with this since you have already configured NBU for best performance:

using 262144 value from /usr/openv/netbackup/NET_BUFFER_SZ
setting receive network buffer to 262144 bytes
using 64 value from /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS

I suggest you have a look at LUN layout and connectivity...
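
If you want the aggregate figure those four bptm processes add up to, here is a rough sketch over the same log (the field positions assume the lines look exactly like the ones quoted above):

# grep "successfully wrote" /usr/openv/netbackup/logs/bptm/log.* | awk '{sum += $(NF-1)} END {printf "%.0f Kbytes/sec across %d drives\n", sum, NR}'

From the four lines above that comes to roughly 74,000 Kbytes/sec, i.e. only about 72 MB/sec in total for all four drives.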

 


urnbux:

Marianne, first of all, I appreciate your help; I envy your dedication to your passion. Unbelievable!!

 

I think you are absolutely spot on. We have many BCV devices connected through each HBA, on the order of 670 BCVs over 6 HBAs. I didn't get the exact count per HBA, but assuming ~100 per HBA, that could be too much. I will talk to our storage person and see what can be done. Again, thanks for taking a deep look at that garbled bptm log.
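
In case it's useful, this is roughly how I plan to get the counts on the AIX media server: lsdev to count the hdiskpower devices and powermt for the per-HBA path summary (output varies by PowerPath version, so treat it as a sketch):

# lsdev -Cc disk | grep -c hdiskpower
# powermt display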

urnbux:

But to revaroo's point above, can I get any more than 83 MB/s even if I solve the HBA-vs-device puzzle? I also thought that was pretty good, but I'm greedy for more if at all possible.

Marianne:

You are averaging less than 20 MB/sec per tape drive. The large number of 'waits' tells you that your tape drives can do a lot better. (waited for full buffer 86520 times, delayed 215521 times)

You should be seeing closer to 120 MB/sec per drive, and about double that with 2:1 compression.
The problem is that you cannot 'feed' your drives fast enough...
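
Back-of-envelope, assuming four drives at LTO4 native speed:

  4 drives x 120 MB/s  =  480 MB/s potential
  300 GB / 480 MB/s    =  ~11 minutes

versus the ~1 hour the backup takes today, so the drives themselves are nowhere near the limit.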
