
SAN client throughput is very low: about 25 MB/s

Created: 22 Jan 2013 | 14 comments

Hi

We have a couple of large SAN clients. Because we have a limited number of drives, their backups do not complete within 24 hours, before the next schedule kicks off. Today we noticed (it has been like this for a long time :) ) that the throughput of most sessions is only about 25-30 MB/s.

Environment:

Master/media server : Solaris10  [ NB 7.5.0.1 ]

SAN Client  - AIX 6.1  [ NB 7.0 ]

Tape storage: IBM 3584  - LTO4 drives

I verified a couple of things, especially the HBA and switch port speeds:

Client HBA    - 8 Gbps
Switch port   - 8 Gbps  ( client HBA connected )
Media HBA     - 4 Gbps
Switch port   - 4 Gbps  ( media HBA connected )

According to IBM's notes, LTO-4 should give us 120 MB/s, but we are getting 25 :) Can someone help me with where to start troubleshooting the slowness?
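To put those two rates in perspective, here is a rough sketch of how much data fits into a 24-hour window at each of them. It assumes decimal units (1 MB = 10**6 bytes) and a single stream sustained for the whole window, which a real job will not quite manage; the function name is just for illustration.

```python
# Rough backup-window arithmetic. Decimal units are assumed throughout
# (1 MB = 10**6 bytes, 1 TB = 10**6 MB).
SECONDS_PER_DAY = 24 * 60 * 60


def data_per_window_tb(rate_mb_per_s, window_s=SECONDS_PER_DAY):
    """Terabytes one stream can move in the window at a sustained rate."""
    return rate_mb_per_s * window_s / 1_000_000


observed = data_per_window_tb(25)   # rate seen on the SAN client jobs
rated = data_per_window_tb(120)     # LTO-4 rate quoted in the post

print(f"25 MB/s  -> {observed:.2f} TB per 24 h")
print(f"120 MB/s -> {rated:.2f} TB per 24 h")
```

So a single stream at the observed rate moves a little over 2 TB per day, against roughly 10 TB at the rate the drive is expected to sustain.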
 

 

 

 

 


sazz.'s picture

As a starting point, you can enable bptm logging on the media server and bpbkar logging on the client to see the delays. If the media server is waiting for full buffers, it is still waiting for data from the client, and you might need to check the client or the network. Usually it is the client that takes time to read the data.

Have you already enabled multiplexing and multistreaming?

Please check the articles below:

http://www.symantec.com/docs/TECH147296

http://www.symantec.com/docs/DOC4483

http://www.symantec.com/docs/TECH1724

 

Andrew Madsen's picture

Start here:

http://www.symantec.com/business/support/index?page=content&id=HOWTO56052

Pay attention to:

http://www.symantec.com/business/support/index?page=content&id=TECH54778

The above comments are not to be construed as an official stance of the company I work for; hell half the time they are not even an official stance for me.

Raju.la's picture

Hi Wolfsbane/sazz,

Question: I suspect we are on very old firmware on the LTO4 drives. I tried to find the current version on the 3584 web console, but no luck. How can I get that? Also, if anyone has the compatibility matrix, could you please share it?

NB master/media: 7.5.0.1
Tape: IBM 3584 LTO4

Raju.la's picture

Forgot to mention: thank you both for sharing very good info about SAN client performance.

Yasuhisa Ishikawa's picture

TS3584 is a tape library - not a tape drive.
You should browse IBM support page or ask your local IBM support or distributor.

BTW, such low performance usually comes from problems on the client side. Try measuring performance with "bpbkar -nocont". Be sure to use the same data for testing.

http://www.symantec.com/docs/HOWTO56131
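As an independent cross-check of the client's raw read speed, something like the following sketch could be used. It is not a substitute for the bpbkar test described in the article above; it simply reads every regular file under a directory and throws the data away. The function name and the 256 KB chunk size are assumptions made for illustration.

```python
import os
import time
from typing import Tuple


def measure_read_rate(root: str, chunk_size: int = 256 * 1024) -> Tuple[int, float]:
    """Read every regular file under root, discard the data,
    and return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, devices, sockets and other non-regular files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                continue  # unreadable file: ignore it and carry on
    return total, time.perf_counter() - start
```

Calling `measure_read_rate("/some/filesystem")` and dividing the byte count by the elapsed seconds gives an approximate read rate. Repeated runs over the same data can be inflated by the file system cache, so the first run over a large data set is the more honest number.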

Authorized Symantec Consultant(ASC) Data Protection in Tokyo, Japan

Raju.la's picture

Testing the client performance right now; will let you know the results.

Raju.la's picture

Client and NB manual backup test results:

Client test ( Null backup )
-------------------------------------
Test fs size: 227GB
Backup time on client: 2 h 3 min [ ~32 MB/s ]

09:39:46.711 [14549042] <4> bpbkar main: real locales
11:43:05.718 [14549042] <4> bpbkar main: INF - Client completed sending data for backup
11:43:05.718 [14549042] <2> bpbkar main: INF - Total Size:578961408 + 227GB
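The rate can be re-derived from the first and last timestamps in the excerpt above. This is a small sketch that assumes both timestamps fall on the same day and that the data set is the stated 227 GB; whether "GB" means 10**9 or 2**30 bytes is not clear from the post, so both are shown.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two HH:MM:SS.mmm stamps on the same day."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


secs = elapsed_seconds("09:39:46.711", "11:43:05.718")  # stamps from the bpbkar log
size_gb = 227                                            # test file system size

rate_decimal = size_gb * 1000 / secs   # MB/s if 1 GB = 1000 MB
rate_binary = size_gb * 1024 / secs    # MiB/s if 1 GB = 1024 MiB

print(f"elapsed: {secs:.0f} s")
print(f"rate:    {rate_decimal:.1f} MB/s (decimal), {rate_binary:.1f} MiB/s (binary)")
```

Either way the result lands at roughly 31 MB/s, in line with the figure quoted for the null backup.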

NB Manual backup
--------------------------
Test fs size: 227GB
Backup time on client: still running [ but I noted it is at 31 MB/s now ]

Client bpbkar:

I don't see any wait or delay entries during this manual backup test.

SAN Client LOGS:

root@sanclient:/usr/openv/netbackup/logs/bpbkar$ cat log.020113 | grep wait
12:19:58.669 [14549090] <4> bpbkar initialize: INF - SHM_WAIT_DELAY set to <1000> microseconds, max_times_wait_before_kill to <5000>

root@sanclient:/usr/openv/netbackup/logs/bpbkar$ cat log.020113 | grep buffer
12:19:58.659 [14549090] <2> logparams: bpbkar -r 5356800 -ru root -dt 0 -to 0 -bpstart_time 0 -clnt sanclient -class xx_AIX_SAPPRD -sched Full -st FULL -bpstart_to 300 -bpend_to 300 -read_to 9600 -ckpt_time 900 -blks_per_buffer 511 -stream_count 1 -stream_number 1 -jobgrpid 2041864 -use_otm -use_ofb -b sanclient_1359749792 -kl 7 -fso -shmfat

Media server logs:

bptm log:

11:42:45.072 [5618] <2> io_init: bpbrm_shm_id = 822083698, buffer address = 0xffffffff78700000
11:42:45.073 [5618] <2> io_init: using 262144 data buffer size
11:42:45.117 [5618] <2> io_init: using 32 data buffers
11:42:45.117 [5618] <2> io_init: USING 262144 data buffer size for FT
11:42:45.658 [5618] <2> mpx_setup_shm: buffer control shared memory address is ffffffff77e00000, size is 3104, shmid is 654311539
11:44:46.494 [5618] <2> write_data: completed writing backup header, start writing data when first buffer is available, copy 1
11:44:46.496 [5618] <2> write_data: received first buffer (262144 bytes), begin writing data
11:50:11.697 [5618] <2> write_data: completed writing backup header, start writing data when first buffer is available, copy 1

Current settings on Media servers:

NET_BUFFER_SZ 262144
NUMBER_DATA_BUFFERS_FT 32
NUMBER_DATA_BUFFERS_DISK 32
NUMBER_DATA_BUFFERS 32
SIZE_DATA_BUFFERS 262144
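For reference, the shared memory these settings imply can be worked out as below. The formula (number of buffers x buffer size x drives x multiplexing) is the rule of thumb usually quoted for NetBackup buffer tuning and is used here as an assumption; the 64-buffer case is purely hypothetical, not a recommendation.

```python
SIZE_DATA_BUFFERS = 262144    # bytes, from the settings above
NUMBER_DATA_BUFFERS = 32      # from the settings above


def shared_memory_bytes(num_buffers, buffer_size, drives=1, mpx=1):
    """Assumed rule of thumb: buffers x size x drives x multiplexing."""
    return num_buffers * buffer_size * drives * mpx


current = shared_memory_bytes(NUMBER_DATA_BUFFERS, SIZE_DATA_BUFFERS)
doubled = shared_memory_bytes(64, SIZE_DATA_BUFFERS)  # hypothetical setting

print(f"current: {current} bytes = {current / 2**20:.0f} MiB per stream")
print(f"64 bufs: {doubled} bytes = {doubled / 2**20:.0f} MiB per stream")
```

With the current values that is 8 MiB per stream, so raising the buffer count is cheap in memory terms; whether it helps depends on where the stream is actually waiting.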

As I don't see any waits or delays, can I increase the FT buffer settings to get slightly better performance?
I would really appreciate it if someone could help me here.

Marianne's picture

Client test ( Null backup )
-------------------------------------
Test fs size: 227GB
Backup time on client: 2 h 3 min [ ~32 MB/s ]

 

This is the speed at which data is read from disk on the client. 
No NBU tuning can change this.

"wait" and "delay" entries are only recorded in bptm log at the end of the job.
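Once the job has finished, those summary entries can be pulled out of the bptm log with a short script such as this sketch. The "waited for full buffer N times, delayed M times" wording in the pattern is an assumption about the log format, and the second sample line is made up for illustration; adjust the pattern to whatever your bptm log actually prints.

```python
import re
from typing import Iterable, List, Tuple

# Assumed wording of the end-of-job summary line; verify against a real log.
WAIT_RE = re.compile(r"waited for (full|empty) buffer (\d+) times, delayed (\d+) times")


def buffer_waits(lines: Iterable[str]) -> List[Tuple[str, int, int]]:
    """Return (buffer_kind, wait_count, delay_count) for each matching line."""
    results = []
    for line in lines:
        match = WAIT_RE.search(line)
        if match:
            results.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return results


sample = [
    # Line taken from the bptm excerpt posted earlier in the thread:
    "11:42:45.117 [5618] <2> io_init: using 32 data buffers",
    # Made-up line, for illustration only:
    "00:00:00.000 [0] <2> write_data: waited for full buffer 120 times, delayed 4500 times",
]

print(buffer_waits(sample))
```

A large "full buffer" count on a backup would suggest bptm was idle waiting for the client to supply data, which fits the read rate measured above.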

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

Andrew Madsen's picture

Silly question but are you using the same set of HBAs to go to your storage as well as your tape library?

Raju.la's picture

No, not at all. The disk HBAs, tape library HBAs and FT HBAs are all separate.

Andrew Madsen's picture

So the next question is "What are you backing up?" I understand it is 227GB, but is it a bunch of little files?

Andrew Madsen's picture

Try hard setting the switch ports to 4Gbps that go to both the HBA and the tape library. I have seen many times that the switch would renegotiate the data rate time and again thus disrupting the data transfer.

How many other clients are using the tape library at the same time? How many ports do you have involved on the tape library? Are you using LTO4 tapes or did someone slip in some LTO3 tapes to use them up? The SAP database file are they large or small? Is it Oracle or DB2 for the Database? If it is Oracle what is the filesperset value? I cannot remember the equivalent for DB2.

Raju.la's picture

Sorry for the delay.

"Try hard setting the switch ports to 4Gbps"

Hmm, I can't do this because it is an NPIV port, so a few other disk HBAs are associated with it and it might cause issues for them.

"How many other clients are using the tape library at the same time? "

It varies. Schedules run around the clock, but we have moved all of them to disk except these SAN clients, which are the only ones on tape now.

" How many ports do you have involved on the tape library?  "

Two from the media server [ one path from each fabric, 7 drives; total of 15 drives ]

"Are you using LTO4 tapes or did someone slip in some LTO3 tapes to use them up? "

No LTO3 at all.

"The SAP database file are they large or small? "

Large files, all SAP.