
Questions and issues on basicdisk storage clean up

Created: 15 Sep 2012 • Updated: 22 Jan 2014 | 6 comments
This issue has been solved. See solution.

Hello all,

I'm new to NBU 7.1 and I'm faced with a full BasicDisk storage unit.

We are running NBU 7.1.0.3 on a Linux RedHat 2.6 master server, which is configured with a 7 TB BasicDisk storage unit that stores the backup images.

Data is backed up to disk only (there is no staging to tape).

This 7 TB filesystem is now full. Maximum retention is set to 2 months (some policies use a different retention period, e.g. 6 weeks); the high water mark is 98% and the low water mark is 80%.

On this storage unit I can find images that were created 6 months ago. They have not been cleaned up even though they have expired.

When I look at the bpdm logs under

/usr/openv/netbackup/logs/bpdm

I can see the following:

05:38:20.185 [28828] <2> delete_image_disk_sts: Deleting disk header for xxxxxx_1343930400_C1_F1

05:38:20.217 [28828] <2> volume_cleanup_delete_fragment: deletion of xxxxxx_1343930400, copy number 1, fragment number 1, resume number 0 media id /netbackup/das/disk01 succeeded

05:38:21.411 [28828] <2> delete_image_disk_sts: Deleting disk header for xxxxx_1343930401_C1_F1

So the cleanup task seems to be running correctly.
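To check more than a handful of lines by eye, the "deletion ... succeeded" entries can be tallied with a small script. The following is only a sketch: the pattern is written against the single line shape quoted above, and the function name deleted_fragments is made up for this example. Other bpdm messages, or a different log layout in another NetBackup version, are simply not matched.

```python
import re

# Pattern for the volume_cleanup_delete_fragment lines quoted in this post.
# Assumption: every deletion line has this exact shape; anything else is skipped.
DELETION_RE = re.compile(
    r"volume_cleanup_delete_fragment: deletion of (?P<backup_id>\S+), "
    r"copy number (?P<copy>\d+), fragment number (?P<fragment>\d+), "
    r".*media id (?P<media>\S+) (?P<result>\w+)\s*$"
)


def deleted_fragments(lines):
    """Return (backup_id, copy, fragment, media, result) for each deletion line."""
    found = []
    for line in lines:
        m = DELETION_RE.search(line)
        if m:
            found.append(
                (m["backup_id"], int(m["copy"]), int(m["fragment"]),
                 m["media"], m["result"])
            )
    return found


# The two log lines below are copied from the excerpt in this post.
sample = [
    "05:38:20.185 [28828] <2> delete_image_disk_sts: Deleting disk header for "
    "xxxxxx_1343930400_C1_F1",
    "05:38:20.217 [28828] <2> volume_cleanup_delete_fragment: deletion of "
    "xxxxxx_1343930400, copy number 1, fragment number 1, resume number 0 "
    "media id /netbackup/das/disk01 succeeded",
]

for entry in deleted_fragments(sample):
    print(entry)
```

Feeding it the lines of a real log file (for example via open(path).readlines()) gives a list that can be compared with the backup IDs still sitting on the disk.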

My question is:

- Why are the expired images not cleared? Do I have to delete them myself and, if so, how should I do it without corrupting the catalog or anything else?

Some images will expire during the weekend, so next Monday I'll be able to see whether we are still running out of space on this storage unit.

Thank you for your help

David


Nicolai's picture

Are the images you suspect to be overdue still active in the NetBackup catalog?

Run bpimagelist -backupid hostname_12345678

where hostname_12345678 is taken from the file names found on the LUN. Do not include the rest of the file name.
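As a rough illustration of stripping the extra part of the file name, here is a sketch in Python. It assumes the fragment names look like the xxxxxx_1343930400_C1_F1 form seen in the bpdm log earlier in the thread; the function names are invented for this example. The number at the end of a backup ID is the backup time in Unix epoch seconds, so it can also be decoded to see how old an image really is.

```python
import re
from datetime import datetime, timezone

# Assumed name shape: <client>_<ctime>_C<copy>_F<fragment>, possibly followed
# by a suffix. The _C1_F1 form is taken from the bpdm log in this thread.
FRAGMENT_RE = re.compile(
    r"^(?P<client>.+)_(?P<ctime>\d+)_C(?P<copy>\d+)_F(?P<frag>\d+)"
)


def backup_id_from_fragment(name):
    """Return '<client>_<ctime>' for a fragment file name of the assumed shape."""
    m = FRAGMENT_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised fragment name: {name!r}")
    return f"{m['client']}_{m['ctime']}"


def backup_time(backup_id):
    """Decode the trailing epoch-seconds part of a backup ID as a UTC datetime."""
    ctime = int(backup_id.rsplit("_", 1)[1])
    return datetime.fromtimestamp(ctime, tz=timezone.utc)


backup_id = backup_id_from_fragment("xxxxxx_1343930400_C1_F1")
print(backup_id, backup_time(backup_id).isoformat())
```

The resulting backup ID is what goes after -backupid in the bpimagelist command above.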

Assumption is the mother of all mess ups.

If this post answered your question, please mark it as a solution.

DavidE31's picture

Hello Nicolai and Martin,

Thank you for your messages

I am currently checking the points you suggested this week-end

I ran a script to check the images, and for some of them there is no entity:

No entity for xxxx-d08.domaine.fr_1331117006

No entity for xxxx-d08.domaine.fr_1331124433

No entity for xxxx-d08.domaine.fr_1331129054

No entity for xxxx-d08.domaine.fr_1331132654

No entity for xxxx-d16.domaine.fr_1331118110

No entity for xxxx-d16.domaine.fr_1331123413

No entity for xxxx-d16.domaine.fr_1331127013

No entity for tu-spa-d16.domaine.fr_1331130613

No entity for xxxx-d16.domaine.fr_1331134213

No entity for yyy-p03_1331132407

No entity for yyy-p03_1331132425

No entity for zzz-d21.domaine.fr_1331119021

No entity for zzz-d21.domaine.fr_1331123414

No entity for zzz-d21.domaine.fr_1331130809

Does this mean that I can delete them directly from the filesystem, or by using nbdelete, with no other action required in NBU?
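For reference, a check of this kind could look roughly like the sketch below. It is not the script that was actually run: the function names are hypothetical, and the list of catalog backup IDs is assumed to have been collected separately (for instance by querying each ID with bpimagelist -backupid as suggested above). It only reports; it deletes nothing.

```python
from datetime import datetime, timezone


def backup_time(backup_id):
    """Decode the trailing epoch-seconds part of a backup ID as a UTC datetime."""
    return datetime.fromtimestamp(int(backup_id.rsplit("_", 1)[1]), tz=timezone.utc)


def age_report(disk_ids, catalog_ids, now):
    """For each backup ID on disk: (backup_id, still_in_catalog, age_in_days)."""
    catalog = set(catalog_ids)
    rows = []
    for bid in sorted(set(disk_ids)):
        age_days = (now - backup_time(bid)).days
        rows.append((bid, bid in catalog, age_days))
    return rows


# Example inputs: one ID from the list above, one from the bpdm log.
# Which IDs are still in the catalog is an assumption made for this demo.
on_disk = ["xxxx-d08.domaine.fr_1331117006", "xxxxxx_1343930400"]
in_catalog = ["xxxxxx_1343930400"]
reference = datetime(2012, 9, 15, tzinfo=timezone.utc)  # date of the first post

for bid, known, age in age_report(on_disk, in_catalog, reference):
    status = "in catalog" if known else "no catalog entry"
    print(f"{bid}: {status}, {age} days old")
```

An image with no catalog entry and an age well beyond the 2-month retention is a candidate for the manual cleanup being asked about here.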

 

mph999's picture

If you are not using staging, the high/low water marks are meaningless.

You have done some excellent investigation so far.

In summary, it works like this. I'm keeping this simple because, with basic disk, there is a detail I would have to test to be sure I am correct.

In a nutshell, there are multiple stages to expiring images:

bpexpdate deletes the images from the image database

nbdelete deletes the actual fragments from the disk.

If images aren't deleted, then either nbdelete is not working, or, something has happened to the image header files.

With advanced disk, there is an entry in the EMM db. When the images are expired from the image catalog, the fragment is set 'to be deleted' within the EMM db; nbdelete then comes along and deletes the fragments. This is the detail I am unsure of for basic disk: I can't remember whether there is an EMM entry for images when using basic disk.

No matter for the moment. All this means is that, if there is, then this is another area where there could be a problem. I can test it later to find out.

One simple way to investigate this is exactly what you are doing: find a recent image that we know should be in good order and, if it is not deleted from the disk, check the log, see whether the image header file is still there, and check the EMM db (I'll let you know if there is anything in there for basic disk, though it will probably be tomorrow). If something has happened to the catalog (e.g. nbcc was run and something was amiss), this could explain why the older fragments were not removed. I am not saying this was the case; it is just an example of what could happen.

I get this a lot on my test server, mainly because it's a test server and I intentionally do 'stupid' things to it when I need to test/investigate things.

When I don't mess around with it, all images clean up perfectly. When I mess around with header files or the EMM db, I find fragments do not get deleted, and I have to do a manual cleanup to get things in order.

I am not suggesting, of course, that you have been doing anything similar, just demonstrating how things can go wrong.

Personally, I would suggest cleaning things up and keeping a close eye on things in case it happens again.

I'll check the db later. I have to go out for the day, but will do it as soon as I can.

Martin

 

Regards,  Martin
 
Setting Logs in NetBackup:
http://www.symantec.com/docs/TECH75805
 
SOLUTION
mph999's picture

... and as if by magic, Nicolai comes along and suggests things along the lines I am intending to go 

M

 

 
mph999's picture

OK, test done: the image is listed in the EMM database for basic disk.

This makes sense, as the image is deleted from the images DB before nbdelete runs, and nbdelete needs to know which fragments to delete from the storage; it gets this from the information in the EMM database.
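To make the dependency concrete, here is a toy model in Python. It is not NetBackup code and says nothing about the real internals: the class, its method names and the "to_be_deleted" flag are invented for illustration (the flag is borrowed from the advanced disk description earlier in the thread, and whether basic disk uses the same marker is an assumption). It only shows why a missing EMM record leaves a fragment behind on disk.

```python
class ToyBackupStore:
    """Toy model, NOT NetBackup code: three places where an image is tracked."""

    def __init__(self):
        self.image_db = set()  # stands in for the image catalog
        self.emm_db = {}       # backup_id -> "active" or "to_be_deleted"
        self.disk = set()      # fragments physically present on the storage unit

    def write_backup(self, backup_id):
        self.image_db.add(backup_id)
        self.emm_db[backup_id] = "active"
        self.disk.add(backup_id)

    def expire(self, backup_id):
        """Stage 1 (the role bpexpdate plays): drop the catalog entry."""
        self.image_db.discard(backup_id)
        if backup_id in self.emm_db:
            self.emm_db[backup_id] = "to_be_deleted"

    def delete_pass(self):
        """Stage 2 (the role nbdelete plays): remove only what the EMM record flags."""
        flagged = [b for b, state in self.emm_db.items() if state == "to_be_deleted"]
        for backup_id in flagged:
            self.disk.discard(backup_id)
            del self.emm_db[backup_id]
        return flagged


store = ToyBackupStore()
store.write_backup("clientA_1343930400")
store.write_backup("clientB_1331117006")

# Simulate a damaged EMM record for clientB before expiry.
del store.emm_db["clientB_1331117006"]

store.expire("clientA_1343930400")
store.expire("clientB_1331117006")
removed = store.delete_pass()

print("removed:", removed)
print("left on disk:", sorted(store.disk))
```

In this model clientB is gone from the catalog but its fragment stays on disk, because the second stage never learns that it exists.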

Anyhow, this is important because it gives another area where an issue could cause images not to be deleted from the disk.

Given this info, the sentence in my post above:

"If images aren't deleted, then either nbdelete is not working, or, something has happened to the image header files."

needs to become

"If images aren't deleted, then either nbdelete is not working, or, something has happened to the image header files, or the info for the image in the EMM db"

Therefore, one way of investigating this is to do more or less as you have done:

Find an image that is playing up

1. Determine if the image header file has been deleted

2. Determine if the EMM db info for the image is complete

For 2, I would use one of our internal-only scripts, which makes the DB a lot more readable. I wouldn't fancy trying to do this on the raw db files, so if you do get to this stage, I would recommend that you log a call with Symantec.

Martin

 

 
DavidE31's picture

Hello,

Thank you for investigating what could have happened to our NBU DB.

I can confirm that the images created on July 14th have been deleted from disk now that their 2-month retention has been reached.