
EMC NDMP backup issue

Created: 07 Mar 2013 • Updated: 07 Mar 2013

hi,

I came across an EMC NDMP backup issue; this is the first time I have configured EMC NDMP backup.

NBU version on master: 7.5.0.3 running on Linux (CentOS release 5.7 (Final))

NBU version on media server: 7.5.0.3 running on Linux (CentOS release 5.7 (Final))

We have multiple IPs on both the master and the media server, as shown below. The IP of the EMC data mover we use is 10.119.35.13Q.

[root@lx0034nbumast ~]# ifconfig -a |grep 10 |grep addr
inet addr:10.119.9.CC Bcast:10.119.9.255 Mask:255.255.255.0
inet addr:10.119.10.AB Bcast:10.119.10.255 Mask:255.255.255.0

[root@lx0034nbumed01 lx0034nbumast_phd_bkups]# ifconfig -a |grep 10 |grep addr
          inet addr:10.119.9.DD  Bcast:10.119.9.255  Mask:255.255.255.0         
          inet addr:10.119.10.XY  Bcast:10.119.10.255  Mask:255.255.255.0
         

We want the EMC storage to talk to our backup servers via the 10.119.10.XX IPs, so the network team has already opened port 10000 bidirectionally.

The NDMP configuration verification now succeeds on both the master and media servers:
[root@lx0034nbumed01 ~]# /usr/openv/volmgr/bin/tpautoconf -verify  10.119.35.13Q
Connecting to host "10.119.35.139" as user "ndmp"...
Waiting for connect notification message...
Opening session--attempting with NDMP protocol version 4...
Opening session--successful with NDMP protocol version 4
  host supports TEXT authentication
  host supports MD5 authentication
Getting MD5 challenge from host...
Logging in using MD5 method...
Host info is:
  host name "server_2"
  os type "DartOS"
  os version "EMC File Server.T.7.1.55.3"
  host id "abc1997"
Login was successful
Host supports LOCAL backup/restore
Host supports 3-way backup/restore

 

But the NDMP backup always fails, and I don't know what I am missing.

Could you help me?

I want the EMC storage to talk to our backup servers via the 10.119.10.XX IPs, but the IPs bound to the FQDNs of the backup servers are 10.119.9.XX.

So how do I make sure that the EMC storage really communicates with the backup servers via 10.119.10.XX and not 10.119.9.XX?

Did I miss any step on the EMC storage?

==================================================

2013-3-5 23:33:43 - Info nbjm (pid=26743) starting backup job (jobid=1222996) for client 10.119.35.139, policy lascx4_phd_a_stg01, schedule Full
2013-3-5 23:33:43 - Info nbjm (pid=26743) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=1222996, request id:{2C994AA0-8630-11E2-AED1-FBB1D04F8B49})
2013-3-5 23:33:43 - requesting resource lx0034nbumed01_phd_dd670_rdsu01
2013-3-5 23:33:43 - requesting resource lx0034nbumast.NBU_CLIENT.MAXJOBS.10.119.35.139
2013-3-5 23:33:43 - requesting resource lx0034nbumast.NBU_POLICY.MAXJOBS.lascx4_phd_a_stg01
2013-3-5 23:33:44 - Info bpbrm (pid=8162) 10.119.35.139 is the host to backup data from
2013-3-5 23:33:44 - Info bpbrm (pid=8162) reading file list from client
2013-3-5 23:33:44 - Info bpbrm (pid=8162) starting ndmpagent on client
2013-3-5 23:33:44 - Info ndmpagent (pid=8164) Backup started
2013-3-5 23:33:44 - Info bpbrm (pid=8162) bptm pid: 8165
2013-3-5 23:33:44 - Info bptm (pid=8165) start
2013-3-5 23:33:44 - granted resource  lx0034nbumast.NBU_CLIENT.MAXJOBS.10.119.35.139
2013-3-5 23:33:44 - granted resource  lx0034nbumast.NBU_POLICY.MAXJOBS.lascx4_phd_a_stg01
2013-3-5 23:33:44 - granted resource  MediaID=@aaaam;Path=/dd670-uswc02/backup/non_pci/repl_wcdc/lx0034nbumast_phd_bkups;MediaServer=lx0034nbumed01
2013-3-5 23:33:44 - granted resource  lx0034nbumed01_phd_dd670_rdsu01
2013-3-5 23:33:44 - estimated 0 kbytes needed
2013-3-5 23:33:44 - Info nbjm (pid=26743) started backup (backupid=10.119.35.139_1362555224) job for client 10.119.35.139, policy lascx4_phd_a_stg01, schedule Full on storage unit lx0034nbumed01_phd_dd670_rdsu01
2013-3-5 23:33:44 - started process bpbrm (pid=8162)
2013-3-5 23:33:44 - connecting
2013-3-5 23:33:44 - connected; connect time: 0:00:00
2013-3-5 23:33:45 - Info bptm (pid=8165) using 30 data buffers
2013-3-5 23:33:45 - Info bptm (pid=8165) using 262144 data buffer size
2013-3-5 23:33:46 - Info bptm (pid=8165) start backup
2013-3-5 23:33:46 - begin writing
2013-3-6 0:33:47 - Error bpbrm (pid=8162) socket read failed: errno = 62 - Timer expired
2013-3-6 1:33:47 - Error bpbrm (pid=8162) socket read failed: errno = 62 - Timer expired
2013-3-6 1:33:47 - Error bptm (pid=8165) media manager exiting because bpbrm is no longer active
termination requested by administrator  (150)
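
The timing in the job log above is worth a look. A minimal Python sketch, assuming the lines keep the `date time - message` layout shown above, measures how long the job sat between "begin writing" and the first error. The function names `parse_job_log` and `seconds_until_first_error` are hypothetical helpers for illustration, not part of NetBackup.

```python
from datetime import datetime

def parse_job_log(text):
    """Return (timestamp, message) pairs for lines shaped like
    '2013-3-5 23:33:46 - begin writing'; other lines are skipped.
    Hypothetical helper, assumes the layout seen in the log above."""
    entries = []
    for line in text.splitlines():
        stamp, sep, message = line.partition(" - ")
        if not sep:
            continue
        try:
            when = datetime.strptime(stamp.strip(), "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue
        entries.append((when, message.strip()))
    return entries

def seconds_until_first_error(text):
    """Seconds between 'begin writing' and the first later 'Error' line,
    or None if either is missing."""
    entries = parse_job_log(text)
    start = next((t for t, m in entries if m == "begin writing"), None)
    if start is None:
        return None
    error = next((t for t, m in entries
                  if m.startswith("Error") and t >= start), None)
    if error is None:
        return None
    return (error - start).total_seconds()

# Two lines copied from the job log above.
SAMPLE = """\
2013-3-5 23:33:46 - begin writing
2013-3-6 0:33:47 - Error bpbrm (pid=8162) socket read failed: errno = 62 - Timer expired
"""

if __name__ == "__main__":
    print(seconds_until_first_error(SAMPLE))
```

For the two sample lines this gives 3601 seconds. The job waited about an hour with no data before bpbrm reported the socket read timeout, which looks more like data never arriving than like a rejected login. That reading is an interpretation of the log, not something the log states.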
 

 

 

 


Comments

Nagalla
07 Mar 2013


hi,

to enable backups over the 10.119.10.XX IP:

1) you should set up the proper route

2) name resolution should point to the right IP.

Why would you like to use 10.119.10.XX to communicate over NDMP? Is it the backup LAN?

What are the FQDNs associated with the 10.119.10.XX IPs? (Do they not refer to the backup names of the servers?)

07 Mar 2013


hi, Nagalla

thanks for your reply

1) you should set up the proper route

   Host side: I added one route, as below:

   Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
  10.119.35.139   10.119.10.1     255.255.255.255 UGH       0 0          0 eth4

   EMC side:
[nasadmin@lascxcss01 ~]$ server_route server_2 -list
  .............
host 10.119.10.99 10.119.35.139 255.255.255.255 int_nfs
host 10.119.10.98 10.119.35.139 255.255.255.255 int_nfs
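
To confirm that the host-side route actually takes effect, one option is to ask the kernel which local address it would pick for traffic to the data mover. The Python sketch below assumes it is run on the master or media server. The function name `source_ip_for` is hypothetical, and port 10000 is used only because it is the NDMP port mentioned in this thread. Connecting a UDP socket sends no packets, so this is a pure route lookup.

```python
import socket

def source_ip_for(destination, port=10000):
    """Return the local IP the kernel would use to reach `destination`.

    Hypothetical helper: connect() on a UDP socket transmits nothing;
    it only triggers a route lookup and fixes the socket's source address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((destination, port))
        return sock.getsockname()[0]

if __name__ == "__main__":
    # Loopback is used here so the example runs anywhere; on the backup
    # server, pass the data mover IP instead.
    print(source_ip_for("127.0.0.1"))
```

If the route through eth4 is in effect, calling `source_ip_for` with the data mover IP on the backup server should return the server's 10.119.10.x address rather than its 10.119.9.x one. This only checks the host side and says nothing about how the data mover routes its replies.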

 

2) name resolution should point to the right IP.

Why would you like to use 10.119.10.XX to communicate over NDMP? Is it the backup LAN?

Because the default IP 10.119.9.XX (which is bound to the FQDN) is the management IP.

3) What are the FQDNs associated with the 10.119.10.XX IPs? (Do they not refer to the backup names of the servers?)

No, they are not.

Nagalla
07 Mar 2013


If 10.119.9.XX is the management IP, what is the name associated with the IP 10.119.10.XX? That is the name you should use for the backup configuration.

Okay..

Check whether you are able to ping the IP 10.119.10.XX from the EMC box.

If yes, use hosts entries on the EMC box to map the FQDNs to 10.119.10.XX.

For example, the hosts file on the EMC box should contain the following:

10.119.10.XX    FQDN of master

10.119.10.XX    FQDN of media

07 Mar 2013


Yes, I can ping 10.119.10.XX from the EMC box.

I added the two entries below to /etc/hosts on the EMC control station.

10.119.10.XX lx0034nbumed01.active.tan lx0034nbumed01
10.119.10.YY lx0034nbumast.active.tan lx0034nbumast

But I am not sure whether it works, since /etc/hosts resides on the EMC control station, while we want to talk to the IP of the data mover, not the control station.

10 Mar 2013


Since the verification "/usr/openv/volmgr/bin/tpautoconf -verify  10.119.35.139" succeeds on both the master and media servers, I think the failure may not be caused by the network.

But I am still not sure what the possible reason is.
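
The tpautoconf verification shows that a login to the data mover succeeds, but not which local address the connection left from. A rough way to test the path that matters is a TCP connection that is explicitly bound to the 10.119.10.x address. The sketch below is only an illustration. `can_connect` is a hypothetical helper built on Python's standard `socket.create_connection`, and it checks the TCP handshake only, not NDMP itself or any data connection opened in the other direction.

```python
import socket

def can_connect(dest_ip, dest_port, source_ip=None, timeout=5.0):
    """Try a TCP connection to dest_ip:dest_port, optionally bound to
    source_ip, and report whether the handshake completed.
    Hypothetical helper for illustration only."""
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection((dest_ip, dest_port),
                                      timeout=timeout,
                                      source_address=source):
            return True
    except OSError:
        # Covers refused connections, timeouts, and a source address
        # that is not configured on this host.
        return False

def _demo():
    # Self-contained demo against a throwaway listener on loopback.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    print(can_connect("127.0.0.1", port, source_ip="127.0.0.1"))
    listener.close()

if __name__ == "__main__":
    _demo()
```

On the media server, the equivalent call would pass the data mover IP, port 10000, and the server's own 10.119.10.x address as `source_ip`. A `False` result there would point back at routing or firewall rules for that source address.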

11 Mar 2013


  • Can you verify that the NAS path begins with e.g. /vol/.../folders_path, and modify it if it does not?

 

2013-3-5 23:33:44 - granted resource  MediaID=@aaaam;Path=/dd670-uswc02/backup/non_pci/repl_wcdc/lx0034nbumast_phd_bkups;MediaServer=lx0034nbumed01
2013-3-5 23:33:44 - granted resource  lx0034nbumed01_phd_dd670_rdsu01
2013-3-5 23:33:44 - estimated 0 kbytes needed