Replication jobs hang for hours in ‘running’ state

Article:TECH148701  |  Created: 2011-01-21  |  Updated: 2012-08-01  |  Article URL http://www.symantec.com/docs/TECH148701
Article Type
Technical Solution

Product(s)

Issue



Backups run fine, but replication jobs hang for hours in the 'running' state.


Error



The bpdm log contains the following entries:

<32> bp_sts_open_target_server: sts_open_target_server failed: error 2060014
<32> copy_backup: open target server failed: error 2060014
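To confirm that a media server is hitting this error, the bpdm logs can be searched for the message above. The following is a minimal sketch, assuming the legacy bpdm log directory is /usr/openv/netbackup/logs/bpdm on a UNIX media server; the function name find_sts_open_failures is hypothetical and not part of NetBackup.

```shell
# Search bpdm logs for the "open target server" failure quoted above.
# Assumption: logs live in /usr/openv/netbackup/logs/bpdm unless a
# different directory is passed as the first argument.
find_sts_open_failures() {
    logdir=${1:-/usr/openv/netbackup/logs/bpdm}
    # /dev/null is added so grep always prints the file name, even when
    # only one log file matches the glob.
    grep -n 'sts_open_target_server failed: error 2060014' \
        "$logdir"/* /dev/null 2>/dev/null
}

# Usage (run on the media server):
#   find_sts_open_failures
#   find_sts_open_failures /path/to/copied/logs
```

Each match is printed as file:line-number:text, which makes it easy to see on which days the replication jobs failed to open the target server.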
 


Cause



 

If you continue to use PDDO replication, you cannot use Optimized Duplication. If you want to use Optimized Duplication instead, you must first remove the existing PDDO replication information from the PureDisk database.

Run the following command on each SPA to check whether it has multiple [R] PDDO replication data selections:
/opt/pddb/bin/psql -U pddb ca -c "SELECT * from dataselection where name like '%PDDO%'"

The query lists all PDDO data selections:
id | originaltemplateid | agentid | dstypeid | groupid | name | description | dataselection | fullvolume | volumeid | inherited | version | lastcaptureddate | expdirty | disabled | deleted | isfullsystemds | protected | isevroot | creationdate | moddate | sizeonsource | sizeonsourcelastversion
----+--------------------+---------+----------+---------+------------------+-------------+---------------+------------+----------+-----------+---------+------------------+----------+----------+---------+----------------+-----------+----------+--------------+------------+--------------+-------------------------
2 | | 4 | 9 | 141 | PDDO | | C:\tmp|*|I | 0 | 0 | 0 | 0 | | 1 | 0 | 0 | 0 | 0 | 0 | 1261413834 | 1261413834 | 0 | 0
3 | | 4 | 9 | 143 | [R] PDDO (stp10) | | | 0 | 0 | 0 | 0 | | 1 | 0 | 0 | 0 | 0 | 0 | 1261428556 | 1261428556 | 0 | 0
4 | | 4 | 9 | 183 | [R] PDDO (stp10) | | | 0 | 0 | 0 | 0 | | 1 | 0 | 0 | 0 | 0 | 0 | 1261818010 | 1261818010 | 0 | 0
5 | | 4 | 9 | 185 | [R] PDDO (stp10) | | | 0 | 0 | 0 | 0 | | 1 | 0 | 0 | 0 | 0 | 0 | 1261818015 | 1261818015 | 0 | 0
6 | | 4 | 9 | 195 | [R] PDDO (stp10) | | | 0 | 0 | 0 | 0 | | 1 | 0 | 0 | 0 | 0 | 0 | 1263027621 | 1263027621 | 0 | 0
7 | | 4 | 9 | 197 | [R] PDDO (stp10) | | | 0 | 0 | 0 | 0 | | 1 | 0 | 0 | 0 | 0 | 0 | 1263027627 | 1263027627 | 0 | 0
(6 rows)
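The ids of the [R] PDDO rows are needed in the later cleanup steps. As a convenience, they can be pulled out of the query output with a small filter. This is only a sketch: the function name rpddo_ids is hypothetical, and it assumes the column order shown above (id first, name sixth).

```shell
# Print the id of every row whose name column starts with "[R]".
# Reads psql's table output on stdin and splits columns on "|" with any
# surrounding spaces. Assumes id is column 1 and name is column 6, as in
# the dataselection output shown above.
rpddo_ids() {
    awk -F' *[|] *' '
        $1 ~ /^ *[0-9]+$/ && $6 ~ /^\[R\]/ {
            gsub(/ /, "", $1)   # drop padding around the id
            print $1
        }'
}

# Usage (on the SPA):
#   /opt/pddb/bin/psql -U pddb ca \
#       -c "SELECT * from dataselection where name like '%PDDO%'" | rpddo_ids
```

Header, separator, and row-count lines are skipped because their first column is not numeric. The plain PDDO data selection (id 2 in the example) is not printed, since its name does not start with [R].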
 


Solution



 

1. On each SPA, stop cron.
# service cron stop

2. On all NetBackup master servers, set the PureDisk-related disk pools to the DOWN state, then update the storage server:
nbdevconfig -changestate -stype PureDisk -dp <pddo_diskpool> -state DOWN

nbdevconfig -updatests -storage_server <puredisk-hostname> -media_server <mediaserver-hostname> -stype PureDisk

3. On each SPA node, back up the CA tables that the following steps modify. Be sure to dump them to a volume with enough free space:
# mkdir -p /Storage/tmp/rpddo_clnup
# /opt/pddb/bin/pg_dump -U pddb ca -t agentmirror -f /Storage/tmp/rpddo_clnup/agentmirror.sql
# /opt/pddb/bin/pg_dump -U pddb ca -t forward -f /Storage/tmp/rpddo_clnup/forward.sql
# /opt/pddb/bin/pg_dump -U pddb ca -t replicateddataselection -f /Storage/tmp/rpddo_clnup/rplds.sql
# /opt/pddb/bin/pg_dump -U pddb ca -t replicatedagent -f /Storage/tmp/rpddo_clnup/replicatedagent.sql
# /opt/pddb/bin/pg_dump -U pddb ca -t subscription -f /Storage/tmp/rpddo_clnup/subscription.sql

On each MBE, dump the mb database (again, to a volume with enough free space):
# mkdir /Storage/tmp/rpddo_clnup
# /opt/pddb/bin/pg_dump -U pddb mb -f /Storage/tmp/rpddo_clnup/mb.sql

NOTE: Run 'tail' on each .sql file to confirm that the dump completed.
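The check in the note can be scripted. A plain-text pg_dump file normally ends with a comment line reading "PostgreSQL database dump complete"; the sketch below assumes the pg_dump shipped with PureDisk writes the same trailer. The function names dump_complete and check_dumps are hypothetical.

```shell
# Return 0 if the last lines of the file contain pg_dump's completion
# marker. Assumption: the dump is in plain-text format, as produced by
# the pg_dump commands in step 3.
dump_complete() {
    tail -n 5 "$1" 2>/dev/null | grep -q 'PostgreSQL database dump complete'
}

# Check every .sql file in the backup directory used in step 3.
# Returns non-zero if any dump looks incomplete (or if none exist).
check_dumps() {
    dir=${1:-/Storage/tmp/rpddo_clnup}
    status=0
    for f in "$dir"/*.sql; do
        if dump_complete "$f"; then
            echo "OK         $f"
        else
            echo "INCOMPLETE $f"
            status=1
        fi
    done
    return $status
}

# Usage:
#   check_dumps                      # default directory from step 3
#   check_dumps /other/backup/dir
```

Do not continue with the DELETE steps until every dump is reported as OK.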

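4. Delete all contents of the [R] PDDO data selections. Run this on the MBE node that hosts the [R] PDDO dsids. The commands below use dsids 3 through 6 as an example; substitute the [R] PDDO dsids from your own environment: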
# /opt/pddb/bin/psql -U pddb mb -c "DELETE from ds_raw_3"
# /opt/pddb/bin/psql -U pddb mb -c "DELETE from ds_raw_4"
# /opt/pddb/bin/psql -U pddb mb -c "DELETE from ds_raw_5"
# /opt/pddb/bin/psql -U pddb mb -c "DELETE from ds_raw_6"
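When there are many [R] PDDO dsids, typing one DELETE per dsid is error-prone. The sketch below only prints the commands for review and does not run them; the function name print_ds_raw_deletes is hypothetical, and the dsids passed to it are examples.

```shell
# Print (do not execute) one DELETE command per dsid given as argument.
# Non-numeric arguments are rejected so a typo cannot end up in a table
# name.
print_ds_raw_deletes() {
    for dsid in "$@"; do
        case $dsid in
            ''|*[!0-9]*)
                echo "skipping non-numeric dsid: $dsid" >&2
                continue
                ;;
        esac
        printf '/opt/pddb/bin/psql -U pddb mb -c "DELETE from ds_raw_%s"\n' "$dsid"
    done
}

# Example: the dsids used in step 4.
print_ds_raw_deletes 3 4 5 6
```

Review the printed commands against the dsids reported by the dataselection query before running them on the MBE.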

5. Open the web-based UI on each SPA and delete the [R] PDDO data selections.

6. Delete the data for each [R] PDDO dsid from the CRs. Run this on every CR in the storage pool you are working on:
# /opt/pdcr/bin/dsiddel -i 1 -d 3,4,5,6

7. On each SPA, delete the contents of the agentmirror, forward, replicatedagent, and replicateddataselection tables in the ca database:
# /opt/pddb/bin/psql -U pddb ca -c "DELETE from agentmirror"
# /opt/pddb/bin/psql -U pddb ca -c "DELETE from forward"
# /opt/pddb/bin/psql -U pddb ca -c "DELETE from replicatedagent"
# /opt/pddb/bin/psql -U pddb ca -c "DELETE from replicateddataselection"
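After the deletes in step 7, it can be useful to confirm that the four tables are empty. The following sketch counts the rows in each table using psql's -A (unaligned) and -t (tuples only) options. The PSQL variable and the function names are hypothetical helpers introduced for illustration.

```shell
# Path to psql as used throughout this article; overridable for testing.
PSQL=${PSQL:-/opt/pddb/bin/psql}

# Print the row count of one table in the ca database.
count_rows() {
    "$PSQL" -U pddb ca -At -c "SELECT count(*) FROM $1"
}

# Print "table: count" for each table emptied in step 7.
report_counts() {
    for t in agentmirror forward replicatedagent replicateddataselection; do
        echo "$t: $(count_rows "$t")"
    done
}

# Usage (on each SPA, after step 7):
#   report_counts
```

Each table should report a count of 0 once step 7 has completed on that SPA.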

8. When convenient, run the 'Data Selection Removal' workflow.
 



