NetBackup 5200 Duplication Jobs Fail with Error 191, then keep re-spawning and failing
We are having problems with duplication jobs on our NetBackup 5200 appliance. We use storage lifecycle policies to back up to the 5200 as the primary copy, and then duplicate off to tape. This worked fine for months, but recently the duplication jobs have started failing with error 191. They start OK and back up a few GB of data, but then fail before the end (the primary copy backup jobs to the appliance complete OK). Looking more closely at the job log reveals the following:
Critical bpdm(pid=14456) sts_read_image failed: error 2060017 system call failed
Critical bpdm(pid=14456) image read failed: error 2060017: system call failed
Error bpdm(pid=14456) cannot read image from disk, Invalid argument
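In case it helps anyone compare against their own job details, here is a rough Python sketch for tallying the numeric error codes in lines shaped like the three above. The line layout (severity, process name, pid, message) is only assumed from this excerpt, and the names parse_line and count_error_codes are made up for illustration; they are not part of NetBackup.

```python
import re
from collections import Counter

# Assumed layout, taken only from the excerpt above:
#   "<Severity> <process>(pid=<n>) <message>"
LINE_RE = re.compile(
    r"^(?P<severity>\w+)\s+(?P<process>\w+)\(pid=(?P<pid>\d+)\)\s+(?P<message>.*)$"
)
# A numeric code following the word "error" inside the message, if present.
CODE_RE = re.compile(r"error (\d+)")


def parse_line(line):
    """Split one job-detail line into its fields; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    code = CODE_RE.search(entry["message"])
    entry["code"] = code.group(1) if code else None
    return entry


def count_error_codes(lines):
    """Count how often each numeric error code appears across the given lines."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["code"]:
            counts[entry["code"]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Critical bpdm(pid=14456) sts_read_image failed: error 2060017 system call failed",
        "Critical bpdm(pid=14456) image read failed: error 2060017: system call failed",
        "Error bpdm(pid=14456) cannot read image from disk, Invalid argument",
    ]
    for code, n in count_error_codes(sample).items():
        print(code, n)
```

Running this over the details of several failed jobs would show whether they all report the same code or whether different failures are mixed together.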
I originally thought this was some kind of corruption on the appliance, but the strange thing is that we can still back up to the appliance OK, and even restore from some of the images that are failing to duplicate. I've adjusted the PoolUsageMaximum and PagedPoolSize parameters as suggested on some sites, without success.
Another problem is that when the duplication jobs fail they re-spawn, so we have numerous duplication jobs running and hogging our tape drives, which is having a knock-on effect on normal backups to tape. We have temporarily suspended duplication to help with this, but it isn't a long-term fix.
Our environment comprises: NBU Master Server 22.214.171.124 (Win2003 R2), NBU Media Server 126.96.36.199 (Win2003 R2), NBU 5200 Appliance (2.0.2)
I've got a ticket open with Symantec about this and they are currently investigating, but just wondered if anyone else in the community had ever seen this before. Any help would be much appreciated.