Upgraded from 7.1.0.1 to 7.5.0.6 - Prev images not found in BAR

Created: 09 Jul 2013 • Updated: 20 Jul 2013 | 17 comments
This issue has been solved. See solution.

Hi,

With the help of Symantec technical support, we upgraded our Microsoft clustered master server from 7.1.0.1 to 7.5.0.6.

The upgrade process was far from a breeze, even with a technical support WebEx set up.

Just a summary of the issues during the upgrade.

  1. Started upgrading my OpsCenter to 7.5, which is installed on one of my media servers. This hung at the DB upgrade stage for 2-3 hrs; tech support told me to end the task and log a case with the OpsCenter team.
  2. Proceeded to upgrade my master servers. The setup.exe crashed at 73%. However, tech support started the services and launched the admin console, and all seemed to be working. I asked whether we could do a repair install to make sure all was well, but tech support said to create a test backup job and that if it worked, things should be fine. I was still skeptical but went ahead as recommended, as it was almost time for my backup jobs to start running. After upgrading my active node, we proceeded to upgrade the passive node. The setup.exe again crashed at 73%. Tech support said to just proceed.
  3. Patched both active and passive nodes to 7.5.0.6 with no issues.
  4. Phase 1 and Phase 2 seem to have completed without errors. I have about 30,000 images.

Now, my main issue. I was browsing my backup images and realized that I can only see backups done last night. None of my previous backups can be found.

I have the catalog backup from before the upgrade but would like to know if there is anything else we can do without setting up 7.1 from scratch and recovering the catalog.

I don't suppose there's an option to restore just the image DB from my 7.1 catalog backup to the current 7.5.0.6 setup?

Overall, the upgrade has been a nightmare for me. Was on the line with tech support for almost 6 hours straight..

Nicolai's picture

Before panicking: the GUI is sensitive to the names you use when browsing backups. srv1, srv1.acme.com and SRV1 are not the same server. Maybe the upgrade changed the default name.
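
To illustrate the naming point, here is a minimal Python sketch that groups client names differing only by case or domain suffix. The function name and the sample names are hypothetical; the list of names would have to come from your own policies or image listings.

```python
from collections import defaultdict

def name_variants(client_names):
    """Group names that share the same lower-cased short host name.

    Only groups with more than one distinct spelling are returned,
    since those are the ones worth browsing under each variant.
    """
    groups = defaultdict(set)
    for name in client_names:
        short = name.split(".", 1)[0].lower()  # drop domain suffix, ignore case
        groups[short].add(name)
    return {short: sorted(names) for short, names in groups.items() if len(names) > 1}

# Hypothetical example input:
# name_variants(["srv1", "srv1.acme.com", "SRV1", "db2"])
```

Any group it returns is a candidate for images being filed under a different spelling than the one you are browsing with.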

Check previous backups with this command-line utility:

bpimagelist -d 06/01/2013  -U (or -L or -l) 

You might find these technotes useful:

http://www.symantec.com/docs/TECH91133

http://www.symantec.com/docs/HOWTO43661

Update:

Try this from the master server

 bpclimagelist -client {CLIENT_NAME_FROM_POLICY}  -d 01/01/2013

It should output something like:

Backed Up         Expires       Files      KB     C Sched Type   Policy
----------------  ---------- -------- ----------- - ------------ ------------
07/08/2013 16:10  07/15/2013   100000   102450016 N Full Backup  GEN_DATA
07/07/2013 16:10  07/14/2013   100000   102450016 N Full Backup  GEN_DATA
07/06/2013 16:10  07/13/2013   100000   102450016 N Full Backup  GEN_DATA
07/04/2013 16:10  07/11/2013   100000   102450016 N Full Backup  GEN_DATA
07/03/2013 16:10  07/10/2013   100000   102450016 N Full Backup  GEN_DATA
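
If the listing is long, a small script can pull out the oldest backup date. The following is a minimal Python sketch that assumes the output has the column layout shown above (date and time in the first two fields); the function names are made up for illustration, and the sample is a shortened copy of the listing.

```python
from datetime import datetime

SAMPLE = """\
Backed Up         Expires       Files      KB     C Sched Type   Policy
----------------  ---------- -------- ----------- - ------------ ------------
07/08/2013 16:10  07/15/2013   100000   102450016 N Full Backup  GEN_DATA
07/03/2013 16:10  07/10/2013   100000   102450016 N Full Backup  GEN_DATA
"""

def backup_times(text):
    """Return the 'Backed Up' timestamps found in bpclimagelist-style output."""
    times = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            times.append(datetime.strptime(" ".join(fields[:2]), "%m/%d/%Y %H:%M"))
        except ValueError:
            continue  # header, separator, or anything that is not a data row
    return times

def oldest_backup(text):
    """Return the earliest timestamp, or None if no data rows were found."""
    times = backup_times(text)
    return min(times) if times else None
```

If the oldest date it reports is the day after the upgrade, the older images are not visible to the catalog under that client name.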

 

Assumption is the mother of all mess ups.

If this post answered your question - Please mark as a solution.

Jon.K's picture

I did a bpimagelist and only images from yesterday are listed..

Same with bpclimagelist

 

Gautier Leblanc's picture

Are you sure that your backup metadata has been migrated into the relational database (NBEMM)?

Did an "image cleanup" job run after the first NetBackup 7.5 start? What is its status? Do the logs say that images were migrated into NBEMM?

You can also look in <NBU Install dir>\db\images to see whether you have only .lck and .f files.
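
The directory check can be scripted. This is a minimal Python sketch that counts files by extension under a given folder and reports anything that is not .lck or .f. The function names are hypothetical, and the path you pass in is whatever your own db\images folder is.

```python
import os
from collections import Counter

def extension_counts(images_dir):
    """Count files under images_dir by lower-cased extension ('' if none)."""
    counts = Counter()
    for _root, _dirs, files in os.walk(images_dir):
        for name in files:
            counts[os.path.splitext(name)[1].lower()] += 1
    return counts

def unexpected_extensions(counts, expected=(".lck", ".f")):
    """Return only the extensions that are not in the expected set."""
    return {ext: n for ext, n in counts.items() if ext not in expected}

# Usage (path is an assumption, substitute your install directory):
# print(unexpected_extensions(extension_counts(r"H:\NetBackup\db\images")))
```

An empty result means only .lck and .f files remain, which is what the check above is looking for.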

Jon.K's picture

Yeah, here's the last bits of the cleanup log.

7/8/2013 5:52:04 PM - Info bpdbm(pid=4208) [000:05:35] Overall progress: 37370 images imported, 0 skipped, 0 corrupt. Import rate = 111 images/sec
7/8/2013 5:52:04 PM - Info bpdbm(pid=4208) [000:05:35] Initiating import for client: scene-main
7/8/2013 5:52:04 PM - Info bpdbm(pid=4208) [000:05:35] Finished importing images for client: scene-main with 13 imported, 0 skipped, 0 corrupt.
7/8/2013 5:52:04 PM - Info bpdbm(pid=4208) [000:05:35] Overall progress: 37383 images imported, 0 skipped, 0 corrupt. Import rate = 111 images/sec
7/8/2013 5:52:04 PM - Info bpdbm(pid=4208) Finished importing all images into the database. (Count = 37383)
7/8/2013 5:52:04 PM - Info bpdbm(pid=4208) Cleaning up tables in the relational database
7/8/2013 5:52:04 PM - Warning bpdbm(pid=4208) Hot catalog backup is not configured for 'nbu.xxx.grp', catalog cleanup will return partial success until hot catalog backup is configured.
7/8/2013 5:52:04 PM - Info bpdbm(pid=4208) deleting images which expire before Mon Jul 08 17:46:28 2013 (1373276788)
7/8/2013 5:52:47 PM - Info bpdbm(pid=4208) deleted 496 expired records, compressed 0, tir removed 0, deleted 467 expired copies
the requested operation was partially successful(1)

The job was successfully completed, but some files may have been
busy or unaccessible. See the problems report or the client's logs for more details.
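
For a long cleanup log, the import totals can be extracted rather than read by eye. This is a minimal Python sketch that relies only on the two message formats visible in the excerpt above ("Overall progress: ..." and "Finished importing all images ..."); the function name is made up, and other log formats are not handled.

```python
import re

PROGRESS = re.compile(
    r"Overall progress: (\d+) images imported, (\d+) skipped, (\d+) corrupt"
)
FINISHED = re.compile(
    r"Finished importing all images into the database\. \(Count = (\d+)\)"
)

def summarize_import(log_text):
    """Return the last reported progress figures and the final image count."""
    summary = {"imported": None, "skipped": None, "corrupt": None, "final_count": None}
    for line in log_text.splitlines():
        m = PROGRESS.search(line)
        if m:
            # Later progress lines overwrite earlier ones, so the last one wins.
            summary["imported"], summary["skipped"], summary["corrupt"] = map(int, m.groups())
        m = FINISHED.search(line)
        if m:
            summary["final_count"] = int(m.group(1))
    return summary
```

Non-zero skipped or corrupt counts, or a final count that differs from the last progress line, would be worth raising with support.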

Gautier Leblanc's picture

 

Idiot question (but...): Are you sure that your NetBackup Master service (or IP) is on the node that has the storage owning the database files?

Mark_Solutions's picture

The OpsCenter upgrade doesn't surprise me - I have seen the database upgrade part run for several hours, especially if you use analytics and have a lot of data in OpsCenter - they should have left that to run!

Do all of your cluster resources appear in Cluster manager?

The upgrade tends to delete and recreate them all - just wondering if during the hang something was missed, like a drive that held some of your catalogs. On that thought: a cluster would use a shared drive for the catalogs, say Y drive. If the installation hung then maybe it is writing the catalogs to C:\Program files\veritas now and you just have a configuration / registry entry that is incorrect.

Take a look to see if last night's backups are in the wrong place - that may be easily fixable!

Authorised Symantec Consultant

Don't forget to "Mark as Solution" if someone's advice has solved your issue - and please bring back the Thumbs Up!!.

Gautier Leblanc's picture

I agree with Mark. That's why I spoke of storage on wrong node... It sounds like a loss of the storage that contains database.

Jon.K's picture

Storage owning the database files meaning the folder Netbackup\db\ ?

Then yes. There's a dependency created by the NetBackup installer during cluster setup, so the services will follow the shared disk and virtual name.

Jon.K's picture

I don't use analytics.

Yes, my cluster resources appear in the cluster manager. This was also one of the initial issues, and tech support had to run bpclusterutil -c to recreate them.

In my case, the cluster disk is H: and last night's backups do seem to be on the H: drive.

Mark_Solutions's picture

Can I take it all the other backups are also on the H drive?

If so then I guess it is the .f files that have not been migrated correctly - if that is the case then you are going to need further support with this one and may well end up rolling back and doing a catalog restore.

Can I just check a couple of things with you ...

1. What does the output of nbstlutil -active -backupid NetBackup_0000000001 give?

2. Have you failed the cluster over and re-run bpimage -cleanup -allclients so that it has run on both nodes?

3. What does bpgetconfig LIST_FS_IMAGE_HEADERS show for each node? (Possibly easier to just check the registry, as one node will be passive.)


Jon.K's picture

  1. unknown option -active
  2. So far, I've only run bpimage -cleanup -allclients on one of the nodes. I can only fail over and try it on the other node tomorrow morning since there are jobs running now.
  3. On my active node, value is NO. On the passive node, value is YES

Mark_Solutions's picture

I think that there shouldn't be a - in front of active - sorry

2/3 OK - see how it goes tomorrow - it should do the metadata migration again and both should then be NO after that

I do think you may still be in trouble though!!


Jon.K's picture

1. Operation not successful: no images or lifecycles matching criteria found

I don't understand why it would process 37,000+ images successfully but still screw it up..

My media all have their correct expiration dates and assignments, which means NetBackup should know that something is on these tapes, but it just can't retrieve anything from the catalog.

I don't suppose there are alternatives to setting up 7.1 from scratch, restoring the catalog, then performing the upgrade and hoping it goes through?

Can I set up 7.1 on a standalone server, restore the catalog, then upgrade to 7.5 and do some sort of export from there?

Victor Pablo's picture

Jon.K.

What is the status of your:

nbemmcmd -listsettings -brief -machinename <master server clustered name> ; look for the status of:

SLP_DSSU_MIGRATION_STATE and LIST_FS_IMAGE_HEADERS.

It has to look like this:

SLP_DSSU_MIGRATION_STATE="1"
LIST_FS_IMAGE_HEADERS="0"

If it is different, then your import failed and needs to be rerun.
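
The comparison can be automated once the settings output is saved to a file or captured as text. This is a minimal Python sketch that assumes each setting appears on its own line as KEY="VALUE", as in the two lines above; the function names are hypothetical, and the expected values are the ones given in this post.

```python
# Expected values as quoted in the post above.
EXPECTED = {"SLP_DSSU_MIGRATION_STATE": "1", "LIST_FS_IMAGE_HEADERS": "0"}

def parse_settings(text):
    """Parse KEY="VALUE" lines into a dict; lines without '=' are ignored."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            settings[key.strip()] = value.strip().strip('"')
    return settings

def mismatches(settings, expected=EXPECTED):
    """Return expected keys whose actual value differs (None if missing)."""
    return {k: settings.get(k) for k, v in expected.items() if settings.get(k) != v}
```

An empty dict from mismatches() means both settings match the values above.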

Check if you can find a file like this:  /usr/openv/netbackup/db.corrupt <--- for corrupt images.

Regarding your question about recovering to a standalone server: it could be possible, but your master needs to have the same name and configuration to restore from the catalog, similar to a DR exercise.

Hope this helps.

VP

Mark_Solutions's picture

I don't think you could do the restore to a standalone - it really should be clustered to do it

I know we are suggesting a lot on here but you really should be on the case with support - just tell them you want the case "escalated" and insist on some urgent action.

It does sound like you may have to uninstall / re-install / recover the catalog, meaning the last 2 or 3 days of backups will be lost (though you could run a media report, keep the tapes used over the last few days safe and then import them once you are all done).

I have the feeling that the NBDB is not right - during an upgrade it gets dumped out, a new one is created and then the data is pulled back in - my feeling is that when it hung it didn't pull the old data in, and when it ran its next migration step it didn't actually have the full details of all of the old backup images.

Take a look under netbackup\temp\install\nbdb_date\

There should be the nbdb log file relating to when the upgrade took place.

The last line should say

create_nbdb: Exiting with rc = 0

but do a search for any <16> or <32> lines to see if it went wrong somewhere.
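
Both checks (the final line and the <16> / <32> search) can be done in one pass. This is a minimal Python sketch; the function name is made up, and it assumes only that the log is plain text with the markers and final line described above.

```python
import re

SEVERITY = re.compile(r"<(16|32)>")
CLEAN_EXIT = "create_nbdb: Exiting with rc = 0"

def check_nbdb_log(lines):
    """Return (clean_exit, flagged_lines) for an iterable of log lines.

    clean_exit is True only if the last non-blank line ends with the
    expected 'rc = 0' message; flagged_lines holds every <16> or <32> line.
    """
    lines = [l.rstrip("\n") for l in lines if l.strip()]
    flagged = [l for l in lines if SEVERITY.search(l)]
    clean_exit = bool(lines) and lines[-1].strip().endswith(CLEAN_EXIT)
    return clean_exit, flagged

# Usage (file name is an assumption, use the log found in your nbdb_date folder):
# with open("create_nbdb.log", errors="replace") as fh:
#     print(check_nbdb_log(fh))
```

A result of (True, []) matches what a clean database upgrade should look like according to the description above.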

Also make sure that databases.conf and server.conf under netbackupdb\conf\ look right - databases.conf should have the correct paths in it.


Jon.K's picture

SLP_DSSU_MIGRATION_STATE="1"
LIST_FS_IMAGE_HEADERS="0"

 

Anyway, I have since uninstalled, reinstalled 7.1, restored the catalog, and performed the upgrade.

It took a full 8 hrs, but luckily this time it managed to import the old backup images correctly.

I think there's a technote with regards to the setup.exe crashing in 2008 R2 and the resolution was to reboot the server, then perform the install again.

So, lessons learnt:

  1. Reboot your Windows server prior to any major upgrade to be safe.
  2. Always have a recent catalog backup. I had mine backed up to disk, so the restore was relatively smooth.

Thanks for the help guys.

SOLUTION