Catalog Backups Status Code 1
Netbackup Enterprise Server 7.1.04, with 4 Media Servers at the same level. Entirely Windows environment, w/ Server 2003 Master Server.
For some time now, my Catalog backups have been ending with Status Code 1. It's basically always the 3rd of the 3 backups that run as part of the Catalog backup. I haven't really been too concerned, since they appear to be basically successful (although perhaps I should have been). But now I want to upgrade to 7.5, so I want a completely successful Catalog backup before migrating.
Some stuff I read mentioned file access or I/O, so I tried the catalog backup w/ no other jobs running, and still had the same problem.
Today I enabled bpbkar logging, and started a Full backup (I'll attach a partial copy of the log). I found several entries that are all basically the same, referring to specific images (mostly older policies or clients that are no longer part of an existing policy). Here's an example:
9:24:05.101 AM: [5568.7904] <16> V_CatBackupWin::V_ExpandCatalogBackupDirective: ERR - db_getImgBackupList failed, err: 12. Client: acct-srvr-img, Image ID: LAN-CLIENTS-NIGHTLY_1177744582_FULL
9:24:05.101 AM: [5568.7904] <16> V_CatBackupWin::V_StartCatBackupDirective: ERR - Unable to expand hot catalog backup directive: CATALOG_BACKUP acct-srvr-img LAN-CLIENTS-NIGHTLY_1177744582_FULL
9:24:05.101 AM: [5568.7904] <4> V_CatBackupWin::V_NextCatBackupDirective: INF - Error processing hot catalog backup item. Advancing to next item
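For anyone with a long bpbkar log, here is a minimal sketch for pulling the affected client and image ID out of each failure. The regular expression is only inferred from the log lines quoted above, so other NetBackup versions may format these messages differently; the function name `failed_images` is made up for illustration.

```python
import re

# Pattern inferred from the ERR lines quoted in this post (an assumption,
# not a documented log format).
FAILED_IMAGE = re.compile(
    r"db_getImgBackupList failed, err: (?P<err>\d+)\. "
    r"Client: (?P<client>[^,\s]+), Image ID: (?P<image>\S+)"
)

def failed_images(log_lines):
    """Return unique (client, image ID, error code) triples in first-seen order."""
    seen = []
    for line in log_lines:
        m = FAILED_IMAGE.search(line)
        if m:
            item = (m["client"], m["image"], int(m["err"]))
            if item not in seen:
                seen.append(item)
    return seen

# The three log lines quoted above.
sample = [
    "9:24:05.101 AM: [5568.7904] <16> V_CatBackupWin::V_ExpandCatalogBackupDirective: "
    "ERR - db_getImgBackupList failed, err: 12. Client: acct-srvr-img, "
    "Image ID: LAN-CLIENTS-NIGHTLY_1177744582_FULL",
    "9:24:05.101 AM: [5568.7904] <16> V_CatBackupWin::V_StartCatBackupDirective: "
    "ERR - Unable to expand hot catalog backup directive: CATALOG_BACKUP "
    "acct-srvr-img LAN-CLIENTS-NIGHTLY_1177744582_FULL",
    "9:24:05.101 AM: [5568.7904] <4> V_CatBackupWin::V_NextCatBackupDirective: "
    "INF - Error processing hot catalog backup item. Advancing to next item",
]

for client, image, err in failed_images(sample):
    print(client, image, err)
```

Running this over the whole log should give a short list of the images the catalog backup is skipping, which is easier to work through than the raw entries.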
So, my questions are:
- What exactly is the problem, and can it be fixed?
- If it can be fixed, how?
- If it can't, what if any impact will that have on my Catalog backups if I had to restore, and for the upgrade to 7.5?
Some T/N:
http://www.symantec.com/docs/TECH86295
http://www.symantec.com/docs/TECH152577
Both T/Ns indicate a defective image as the root cause.
Assumption is the mother of all mess ups.
If this post answered your question, please mark it as a solution.
I've not opened the TNs posted by Nicolai, but as he mentions a defective image, I imagine they suggest running bpdbm -consistency 2.
You could also check the bpdbm log; this will probably show which files are problematic.
When header files are 'corrupted' they can sometimes be repaired manually, but often the only solution is to delete them (along with the corresponding .f file). Sometimes they are completely corrupted or unreadable, but if you can still see the FRAG line, its 9th field shows the media, so you have the option of running a phase 1 / 2 import to recreate the catalog information for these images.
Martin
Thanks guys. You'll have to forgive me if this is a dumb question, but if I delete the files per the two TN articles, does that remove the image from NetBackup as well?
I ask because they are "yearly" backups from years past, so I don't want to lose them if I can help it. Each would have 2 copies (one Offsite).
Those ARE the image files.
Please locate the file named in your post, LAN-CLIENTS-NIGHTLY_1177744582_FULL,
copy it to LAN-CLIENTS-NIGHTLY_1177744582_FULL.txt, and post it as a file attachment.
Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links
Perhaps it's semantics. I realize that those files have some bearing, but the data from the backup still lives on tape (in my case), so it's still not clear to me what effect deleting said files will have, which is why I asked first.
I've attached the file that you requested.
If you delete the files, NBU will not know what is on the tape, and you lose the ability to search or view the files in the BAR GUI. I think you will also create a catalog inconsistency (an NBCC issue), as the number of images on the tape in the image db will not match what is in the media db. I said before to reimport the missing images, but I'm not sure whether this would fix the issue or simply add one image to the count in the image db and therefore still leave you with a mismatch. Hmm ... I would have to test this to be sure. You can try to expire it using bpexpdate -d 0 -backupid acct-srvr-img_1177744582
If the file is damaged this might fail, but it's worth a go, as it will automatically update the media db, so the number of images showing on the tape (in the media db) will be correct. Expiring the complete tape might work, but do not do this: if the tape contains images that have spanned from other tapes, I think it will expire these as well. Having said that, if you run bpimmedia on the media ID and only one image shows up, then you could do this.
Even if an inconsistency is introduced, it's not a big deal. It can be sorted out with NBCC, or just left, as the tape will still expire when it reaches its expiry time. I can't see that any data loss would happen.
I'll have to double-check the above notes, but what can safely be said ...
1. If you delete the image + the .f file, the tape will be expired from the media db if there are no other images on the tape when the cleanup job runs (every 12 hrs by default), so physically write-protect the tape first.
2. If bpimmedia shows only one image, then delete the files and expire the tape (bpexpdate -d 0 -m <media id>). Before you do this, create an unused volume pool, and straight after expiring the tape, move it to this unused pool (this stops it being reused if it isn't write-protected). Then you can take your time reimporting it.
martin
I am curious to see output of 'bpimagelist -backupid acct-srvr-img_1177744582 -L'
The header file that you have posted above looks fine. Could there be something wrong with the .f file (LAN-CLIENTS-NIGHTLY_1177744582_FULL.f)?
PS: do you have catalog compression enabled?
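In case it helps with other affected images: going by the examples in this thread, the backup ID appears to be the client name plus the timestamp embedded in the header file name. That naming rule is inferred from this thread only, and `backup_id` is a made-up helper name. A minimal sketch:

```python
import re

def backup_id(client, header_name):
    """Build '<client>_<timestamp>' from a header name like POLICY_1177744582_FULL.

    Assumes the header name ends in _<digits>_<schedule type>, as in the
    example in this thread.
    """
    m = re.search(r"_(\d+)_[^_]+$", header_name)
    if m is None:
        raise ValueError(f"no timestamp found in {header_name!r}")
    return f"{client}_{m.group(1)}"

bid = backup_id("acct-srvr-img", "LAN-CLIENTS-NIGHTLY_1177744582_FULL")
# Only prints the read-only listing command quoted in this thread; nothing is run.
print(f"bpimagelist -backupid {bid} -L")
```

It only prints the command, so it is safe to try on a workstation before touching the master server.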
I do see one issue in the catalog file ... despite all of the fragments one of the last lines says this:
ESTIMATED_KBYTES 0
yet near the top it says this:
KBYTES 32295310
Also, if you add up the copy 1 fragments it comes to 12210979, but if you add up the copy 2 fragments it comes to 16084331.
Adding those two together comes to 28295310
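As a quick check of the arithmetic, using only the numbers quoted above (whether KBYTES is actually meant to equal either sum is not something established here):

```python
kbytes_header = 32295310   # KBYTES line near the top of the header file
copy1_fragments = 12210979 # sum of the copy 1 fragments
copy2_fragments = 16084331 # sum of the copy 2 fragments

both_copies = copy1_fragments + copy2_fragments
gap = kbytes_header - both_copies

print(both_copies)  # 28295310
print(gap)          # 4000000
```

So the two copies together fall exactly 4000000 KB short of the KBYTES value, and neither copy on its own matches it.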
I also see that the original job retried 6 times, so it looks like it had issues.
As it looks like copy 1 and copy 2 still exist, you may have the opportunity to do something with this image. Perhaps try duplicating another copy from copy 1?
Things do not seem to look right though!
Hope this helps
Authorised Symantec Consultant
Don't forget to give a "Thumbs Up" or mark as "Solution" if someone's advice has helped you.
Well, I'm no expert on Catalog stuff by any means, so I just figured I'd TRY a couple of things on a whim:
Marianne, attached is the bpimagelist. The only compression setting with which I'm familiar is in the Policy settings, and those are greyed out. But most of the regular backup Policies had Compression enabled.
This looks to perhaps be the problem.
Can you post the file:
...netbackup\db\images\wtp-isd\1177000000\LAN-CLIENTS-NIGHTLY_1177744583_FULL
Martin
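For locating other header files: judging only by the one path above, the images directory seems to be organised as client, then the timestamp rounded down to the nearest 1000000, then the header name. That layout is an inference from this single example, and `header_path` is a hypothetical helper. The result is relative to the images directory, since the full prefix is not given here.

```python
def header_path(client, policy, timestamp, sched_type):
    """Relative path of a header file under the images directory.

    Assumption: the sub-directory is the timestamp rounded down to a
    multiple of 1,000,000, as in the single example in this thread.
    """
    bucket = timestamp // 1_000_000 * 1_000_000
    return f"{client}\\{bucket}\\{policy}_{timestamp}_{sched_type}"

print(header_path("wtp-isd", "LAN-CLIENTS-NIGHTLY", 1177744583, "FULL"))
```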
Interestingly enough, it's backing up successfully with a Status of 0, and those errors are gone from the BPBKAR logs. I didn't do anything new after my last post.
However, the first child job is now ending with a Status of 1. The error in the details is:
Error bpdbm(pid=6388) error staging NBAZDB backup to C:\Program Files\VERITAS\NetBackupDB\staging
Haven't had any luck yet, if anyone has any ideas.
Still getting the same behavior: the original problem has ceased, but now the first child job says "Successful" yet ends with a status of 1.
The errors in the Job Details are:
10/30/2012 12:10:07 PM - Info bpdbm(pid=6504) staging relational database files for catalog backup
10/30/2012 12:10:07 PM - Info bpdbm(pid=6504) staging BMRDB backup to C:\Program Files\VERITAS\NetBackupDB\staging
10/30/2012 12:10:08 PM - Error bpdbm(pid=6504) error staging BMRDB backup to C:\Program Files\VERITAS\NetBackupDB\staging
10/30/2012 12:10:08 PM - Error bpdbm(pid=6504) error staging NBAZDB backup to C:\Program Files\VERITAS\NetBackupDB\staging
10/30/2012 12:10:08 PM - Info bpdbm(pid=6504) staging NBDB backup to C:\Program Files\VERITAS\NetBackupDB\staging
10/30/2012 12:11:07 PM - Info bpdbm(pid=6504) done staging NBDB backup to C:\Program Files\VERITAS\NetBackupDB\staging
the requested operation was partially successful(1)
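To make it easier to see which databases fail to stage, here is a minimal sketch that summarises the bpdbm staging messages per database. The pattern is inferred from the Job Details lines above, not from any documented format, and `staging_status` is a made-up name.

```python
import re

# Inferred from the job detail lines quoted above (an assumption).
STAGING = re.compile(
    r"bpdbm\(pid=\d+\) (error staging|done staging|staging) (\w+) backup to"
)

def staging_status(lines):
    """Map each database name to 'error', 'done' or 'started'."""
    status = {}
    for line in lines:
        m = STAGING.search(line)
        if not m:
            continue
        phase, db = m.groups()
        if phase == "error staging":
            status[db] = "error"  # an error is never overwritten
        elif status.get(db) != "error":
            status[db] = "done" if phase == "done staging" else "started"
    return status

job_details = [
    r"10/30/2012 12:10:07 PM - Info bpdbm(pid=6504) staging relational database files for catalog backup",
    r"10/30/2012 12:10:07 PM - Info bpdbm(pid=6504) staging BMRDB backup to C:\Program Files\VERITAS\NetBackupDB\staging",
    r"10/30/2012 12:10:08 PM - Error bpdbm(pid=6504) error staging BMRDB backup to C:\Program Files\VERITAS\NetBackupDB\staging",
    r"10/30/2012 12:10:08 PM - Error bpdbm(pid=6504) error staging NBAZDB backup to C:\Program Files\VERITAS\NetBackupDB\staging",
    r"10/30/2012 12:10:08 PM - Info bpdbm(pid=6504) staging NBDB backup to C:\Program Files\VERITAS\NetBackupDB\staging",
    r"10/30/2012 12:11:07 PM - Info bpdbm(pid=6504) done staging NBDB backup to C:\Program Files\VERITAS\NetBackupDB\staging",
]

for db, state in staging_status(job_details).items():
    print(db, state)
```

On the lines above this reports BMRDB and NBAZDB as failing and NBDB as completing, which matches the partial success.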
From digging around through tech notes, etc., I looked at the vxdbms.conf (see attached) and noticed those two databases don't have file paths. Is that right? Anyone have any input? Is this even a problem?
Also attached are the databases.conf and the BPDBM log (you can see the exact times above, but the errors were around 12:10 PM).
FYI, I ended up opening a support ticket. There still hasn't been much progress, although they seem to think that both databases somehow got partially configured and don't need to be there. We were able to remove the BMRDB from vxdbms.conf, which made one of the two errors go away. The normal steps to remove NBAZDB were not successful, so we are still working on that.
I'll update this when I have more info.
Finally resolved the NBAZDB error described above.
We ended up replacing the nbazdb.db.template and repeating the steps described in repair/recreation of NBAZDB, and it finally worked. The thought was that the template file was corrupt, so it kept causing the same problem regardless of which steps we took.