Server Space Management

Created: 27 Jan 2009 • Updated: 19 Sep 2010 | 30 comments

We have the server space management job scheduled to run every morning at 2:30, but our recovery storage stays full. Any ideas as to why this would happen? Is there a way to run the SSM job manually to try to clear up storage space?

TGiles's picture

Marvin,



The SSM job run consists of 4 pieces. Depending on how much data is backed up and how much has to be updated by this job it could take anywhere from a couple of hours to several days to complete. Have you examined your Application Event logs to confirm that the SSM job is completing within the 24 hours you've provided it?



Remember that if a server job is running when another server job is scheduled to start, the new job will stop the currently running job.



Assuming you are using Recovery Solution 6.2 SP2, you can simply right-click on your Cluster in the Altiris Console and choose Recovery Solution Tasks->Start Server Jobs->Server Space Management.



HTH

KSchroeder's picture

Tylor,

Under what circumstances will RS stop a running job to start another one? I've been using RS here for over 4 years and have never encountered this behavior. Whenever I've tried to start another server job, I get an error that another server job is already running. Do you need a multi-node RS cluster to see this behavior, or is this specific to the Local Recovery function?



Marvin: What do you have your Compaction settings set for? Have you applied Hotfix 14 (or 19) for RS 6.2 SP2? Are you seeing that no physical disk space is freed up, or just that the cluster space used information never clears? As Tylor said SSM can take many hours to complete (one of our boxes takes over a week to complete, which is obviously a problem!). Does your storage have NTFS compression enabled?

Thanks,
Kyle
Symantec Trusted Advisor

For Forum threads, please click "Mark as Solution" if answered.
For all content, please give a thumbs up if you agree with or support the post.

PolyMorphiC's picture

Try this...

Create a ConnectionString in the registry. This will let you run a checkstore and advanced BLOB handling.

KB42397: check whether the storage location is being counted correctly.

KB41091: correct the "cluster used disk space" figure in the reports.

KB40822: run the query to force a refresh of BLOBs at the next SSM.

Mbordelon's picture

Kyle,





Compaction is set to "Compact compressed files", with minimum space to reclaim = 50% of data file size. I'm not sure how to tell whether hotfix 14 or 19 has been applied. Is this in Add/Remove Programs or somewhere in the Altiris Console? Free physical space for that drive is above 200GB. The storage management tab shows 195GB maximum and 195GB used. The drive used for storage does NOT have compression enabled.

KSchroeder's picture

To check the Hotfix version, open the C:\Program Files\Altiris\Recovery Solution\Server folder. Switch the folder to Details view and add the "File Version" and "Product Version" columns. The version should show 6.2.2760.14 for a hotfix 14 .DLL or .EXE file. Some of the hotfixes also require SQL scripts to be run. Also please note that RS 6.2 SP3 is now available and would be a quicker update if you're behind.
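
As a rough illustration of reading those version strings, here is a minimal Python sketch. It assumes, based only on the single example above (6.2.2760.14 for hotfix 14), that the fourth dotted field is the hotfix number; the function name is hypothetical and not part of any Altiris tool.

```python
def hotfix_number(file_version: str) -> int:
    """Return the last dotted field of a version string such as '6.2.2760.14'.

    Assumption: the final field is the hotfix level, as in the example
    from the post. Anything not shaped like a.b.c.d is rejected.
    """
    parts = file_version.strip().split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unexpected version format: {file_version!r}")
    return int(parts[3])


if __name__ == "__main__":
    # The version string quoted in the post.
    print(hotfix_number("6.2.2760.14"))
```

Copying the "File Version" values out of Explorer and passing them through a helper like this makes it easy to spot a file that was left at an older hotfix level.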



https://kb.altiris.com/article.asp?article=36188&p=1 has all the SP2 hotfixes.

Thanks,
Kyle

Mbordelon's picture

I did this and all files are listed as Product version 6.2 Service Pack 3.

KSchroeder's picture

On the Cluster's Events tab, please add some of the following events, in particular:



0x40040025 SSM Job summary

0x40160009 Data file compaction job...

0x4016000B Data file compaction completed successfully...



And any others that are related to SSM, Deletion, or Compaction. You may want to start running the SSM job earlier, like at 9:00 PM. Maybe the SSM isn't completing before your users start taking new snapshots the next day, and somehow that is impacting the space reclamation. Also, ensure your BLOBs are excluded from Antivirus scanning and any backup jobs, which can lock the files and cause them to fail to be compacted.

Thanks,
Kyle

Mbordelon's picture

I was able to add the 2nd and 3rd events you listed, as well as a few others related to deletion or compaction. However, I don't see an event 0x40040025 SSM Job Summary.



As far as the scheduling goes, would it be better to schedule the job every other day, or maybe just once a week?

Mbordelon's picture

I manually started the SSM job yesterday. The available storage is still at 0%. Is there a process that would be running while this job is running? Is there any way to tell that the job is actually running?

KSchroeder's picture

Marvin,

Sure. Run the following queries against the RS database (AeXRSDatabase):



-- Blocks currently listed in DeletedBlobs, and their total size in MB
SELECT CAST(GETDATE() AS nvarchar(20)) AS [Current Date/Time],
       COUNT(*) AS [Total Blocks],
       CAST(SUM(CAST(BlockLength AS numeric(20))) / (1024 * 1024) AS numeric(20,2)) AS [MB to Delete]
FROM DeletedBlobs

-- File keys and revisions with DeleteFlag set
SELECT COUNT(DISTINCT FileKey) AS 'Dis. Filekeys',
       COUNT(*) AS 'total revisions'
FROM Revision
WHERE DeleteFlag = 1



Run these again after 10-15 minutes, and all of the values should have decreased.
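
The comparison between two runs can be sketched in Python as follows. This is only an illustration: the counter names mirror the column aliases in the queries above, the numbers are made up, and the `trend` helper is hypothetical.

```python
def trend(earlier: dict, later: dict) -> dict:
    """Label each counter 'down', 'up' or 'flat' between two samples."""
    labels = {}
    for key, before in earlier.items():
        after = later[key]
        if after < before:
            labels[key] = "down"
        elif after > before:
            labels[key] = "up"
        else:
            labels[key] = "flat"
    return labels


if __name__ == "__main__":
    # Made-up values, for illustration only.
    first_run = {"Total Blocks": 120000, "MB to Delete": 5400.25, "total revisions": 88000}
    second_run = {"Total Blocks": 118500, "MB to Delete": 5320.10, "total revisions": 88000}
    for name, label in trend(first_run, second_run).items():
        print(f"{name}: {label}")
```

Writing down both sets of numbers with their timestamps makes it easier to tell whether the job is making progress or has stalled.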



You mentioned that your maximum and used space are the same (195GB). If there is 200GB of free space available, you may want to increase the maximum size. Having the used space equal to the maximum may be causing trouble.

Thanks,
Kyle

Mbordelon's picture

I am attaching a screen shot of what I see on the storage management tab. I may be confusing you by wording my statement about the free/used space wrong.



Also, I ran the queries twice, about 20 minutes apart. The numbers actually increased.

KSchroeder's picture

The numbers could increase depending on which phase of SSM is running. In simple terms, the first phases of SSM find files/file blocks that need to be deleted and add them to the DeletedBlobs table, then the next phase actually starts touching those BLOB files and removing the extra data from them. The last phase, called Compaction, finds any BLOBs that are more than X% "empty" (per the Compaction number you set), combines the data in those BLOBs into new BLOBs, and then removes the old ones.
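
To make the Compaction threshold concrete, here is a minimal Python sketch of the selection rule as described above. It is not RS code: the BLOB names and sizes are made up, the function is hypothetical, and whether RS treats a BLOB sitting exactly at the threshold as eligible is an assumption (this sketch includes it).

```python
def compaction_candidates(blobs: dict, threshold_pct: float) -> list:
    """Return the names of BLOBs whose empty share meets the threshold.

    blobs maps a BLOB name to (file_size, live_data_size) in the same unit.
    """
    selected = []
    for name, (size, live) in blobs.items():
        if size <= 0:
            continue  # skip empty or invalid entries
        empty_pct = 100.0 * (size - live) / size
        if empty_pct >= threshold_pct:  # boundary handling is an assumption
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Made-up sizes, for illustration only.
    example = {
        "BLOB0001.DAT": (1000, 400),  # 60% empty
        "BLOB0002.DAT": (1000, 650),  # 35% empty
        "BLOB0003.DAT": (1000, 900),  # 10% empty
    }
    print(compaction_candidates(example, 50))
    print(compaction_candidates(example, 30))
```

With these made-up numbers, lowering the threshold from 50 to 30 brings a second BLOB into scope, which is why a lower setting can reclaim more space at the cost of more rewriting.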



Your screenshot shows the Maximum and In Use space to be almost equal. You need to allocate additional space to the BLOBs by editing the storage volume you have (D:\CRSData) and increasing the Maximum usage value by at least 10 or 15 GB. If you have 200GB free, increase it by 50GB at least. RS treats that limit as absolute; even though you may have 200GB of additional space, it won't use it, because the total usage would exceed the maximum you have set. So compaction (at a minimum) is never completing, and you never recover any space. Also make sure that your \AeXCRTmp directory (by default on the C: drive) is excluded from virus scanning and has at least 4GB of free space.
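
The arithmetic behind that advice can be sketched in a few lines of Python. This is an illustration under the assumption that the room RS can work with is bounded by both the configured Maximum and the physical free space; the function name is hypothetical, and the figures are the ones quoted in this thread.

```python
def usable_headroom_gb(max_gb: float, in_use_gb: float, physical_free_gb: float) -> float:
    """Space the storage location may still grow by, in GB.

    Growth is limited by the configured maximum and by the disk itself,
    whichever is smaller.
    """
    cap_room = max(0.0, max_gb - in_use_gb)
    return min(cap_room, physical_free_gb)


if __name__ == "__main__":
    # Figures from the thread: 195GB maximum, 195GB in use, about 200GB free on disk.
    print(usable_headroom_gb(195, 195, 200))
    # After raising the maximum by 50GB.
    print(usable_headroom_gb(245, 195, 200))
```

The point is that free disk space alone does not help while the configured maximum equals the space in use.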

Thanks,
Kyle

Mbordelon's picture

Well, I tried to go back into the configuration to change the maximum space and now none of the selections below "Recovery Solution Cluster configuration" folder will open at all. I'm not sure what to do from here.

KSchroeder's picture

What happens when you try to access the cluster properties on the NS? Do you get an error, just a blank window, etc? Is the RS server maxed out on CPU? Check the NS's LogViewer.exe while accessing to see if you get any errors. Try stopping the Altiris Service and then run iisreset from Start/Run, then start Altiris again (or bounce the box if you can get away with that in your environment).

Thanks,
Kyle

Mbordelon's picture

Apparently something in the database was locked and was preventing other commands from running. As for the original issue, I increased the maximum by 50GB, and the space management job is scheduled to start at 2:30 tomorrow morning. I will see on Monday whether this changes anything.

KSchroeder's picture

Marvin,

Any luck?

Thanks,
Kyle

Mbordelon's picture

Well, I got an email yesterday saying that the "data file compaction job completed successfully" and another saying that the "space management job started". Nothing said whether the space management job completed or not. Looking at the cluster configuration, after increasing the maximum by 50GB, I have less than 1GB of free space. It seems like whatever I set the maximum to, it just gets filled up.

KSchroeder's picture

Well, what is the free physical space looking like? You may want to lower the Compaction threshold (the default is 50%; try turning it down to 30).



Also, what is your storage? Is it on a NAS or SAN device, or local internal hard drive/disk array on the server itself? It could be some issue with Sparse file system utilization where BLOB files which are mostly "empty" of data are not properly being managed by NTFS sparsing. This has been known to be a problem on some NAS devices, particularly non-Windows devices.

Thanks,
Kyle

stobler's picture

Hi KSchroeder,

I have problems very similar to Mbordelon's.

Our environment is a little bigger. To give you an impression, here are some details:

RS 6.3 SP3 with 700 users (9 TB)
Storage is split into 2x 1.5 TB disk shelves on the server, plus 4.5 TB and 1.5 TB on a NAS.
Each drive has 50 GB left (we limit the amount of data in each Altiris storage location with "limit capacity to xxx MB").

My problem is that only the "Integrity Check" job works (it takes 8 days to complete).

Our server is currently running out of space, and our top priority is to clean "old & obsolete data" out of the storage.
In RS I can see that 50 users are marked for deletion.
If I try to use "Delete marked Item", the job quits without any message.

Our users can back up as normal; only the server jobs are not accepted.
If I try to start the server jobs during the night, when no users are taking snapshots, they won't start.

I have tried everything I have seen in this thread, but nothing really works.
The hint with these queries was my favorite:

SELECT cast(GetDate() as nvarchar(20)) AS [Current Date/Time], count (*) AS [Total Blocks], cast(sum(cast(BlockLength as numeric (20)))/(1024*1024) as numeric(20,2)) AS [MB to Delete] from DeletedBlobs
SELECT count (distinct FileKey) AS 'Dis. Filekeys', count (*) as 'total revisions' FROM Revision WHERE DeleteFlag = 1
(I replaced [MB to Delete] with [1024].)
But this hint, which sounds very good, doesn't work ...

So do you have a step-by-step procedure to clean up the storage?

At the moment I'm really worried, and my fear is having to delete the whole 9 TB ...

KSchroeder's picture

Stobler,
You haven't made any registry changes to the "RetentionSettingsEx" value on the RS server, have you?  This can affect which SSM functions run. Are the BLOBs being scanned by Antivirus software?  If so, you should exclude BLOB*.DAT, as well as C:\AeXCRTmp (or wherever you defined the temporary file rebuild path, which should ideally have several GB of free space).  What do you have your retention settings configured for ("Delete files after X days", "Delete files X days after deleted from hard drive")?  Are there any errors in the log when SSM completes?  Did you add the SSM job messages to the RS Events tab on the Cluster properties?  Are your storage locations compressed with NTFS compression?  Here are several KB articles which may help if you have not seen them already:

https://kb.altiris.com/article.asp?article=38543&p=1
https://kb.altiris.com/article.asp?article=29428&p=1
https://kb.altiris.com/article.asp?article=41182&p=1
https://kb.altiris.com/article.asp?article=41309&p=1
https://kb.altiris.com/article.asp?article=35053&p=1
https://kb.altiris.com/article.asp?article=17812&p=1
https://kb.altiris.com/article.asp?article=17814&p=1

Thanks,
Kyle

stobler's picture

Thanks for your answers.
- Registry changes: No, we did not modify them.
- The Altiris storage locations are excluded from the antivirus scanner.
- Retention: 3 months, and deleted files are kept for 30 days after they are deleted on the client.
- SSM has not started or completed for more than 5 months.
- NTFS compression is disabled, so only Altiris handles compression.
- Server Job Schedule -> Storage compaction -> min. space to reclaim 30% (#35053)

Some of the KBs are very useful (#38543 and #41309; after applying them, our "Delete marked Item" job is running ;-)

I will post an update as soon as the delete item job is finished.

KSchroeder's picture

OK, you may want to reconsider the 3 months option...that means that for every file that changes, you are keeping 3 months worth of revisions to that file.  I don't know about your users, but most of mine want the most recent backup, or the one from a day or two before.  You may greatly decrease your space usage by modifying that down to even 1 month.  This is particularly true if you use MS Outlook and have large numbers of .PST files (BTW, be sure to add *.OST as an exclusion if it isn't already; that is the "Offline Store" for a user's Exchange mailbox).  You may want to review the "Largest Files" report also to check for other large files/file extensions that are taking up a lot of space on your RS.
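
As a back-of-the-envelope illustration of the retention point, the Python sketch below counts how many revisions of a single file could be retained under a time-based policy. It assumes one stored revision per change and treats 3 months as 90 days; both are simplifying assumptions, the function name is hypothetical, and the sketch says nothing about how much disk space each revision actually takes.

```python
def revisions_kept(changes_per_day: float, retention_days: int) -> float:
    """Upper bound on revisions retained for one file.

    Assumes every change produces one stored revision and that
    revisions older than retention_days are removed.
    """
    return changes_per_day * retention_days


if __name__ == "__main__":
    # A file that changes once a day, e.g. a mail archive that is modified daily.
    for days in (90, 30):
        print(f"{days} days retention: up to {revisions_kept(1, days):.0f} revisions")
```

Under these assumptions, a file that changes daily carries three times as many revisions at 3 months as it does at 1 month.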

Thanks,
Kyle

stobler's picture

First of all, thanks ;-) "Delete marked Item" is still running, and has freed up over 500 GB so far!
I'm fully aware of the consequences of a 3-month retention,
but this is a decision by the business; I did point it out.
I will keep you updated when the jobs finish or fail to run.
Current status: in progress

KSchroeder's picture

OK good to hear.  Also a new Hotfix for RS 6.2 SP3 was released yesterday:
https://kb.altiris.com/article.asp?article=47276&p=1

This has a fix for SSM hanging; perhaps that is part of the problem?

Thanks,
Kyle

KSchroeder's picture

@stobler,
Any news?

Thanks,
Kyle

stobler's picture

Our Integrity Check takes 8 days to complete. After that long run, the other server jobs can run and complete successfully.
So we are able to take snapshots and restore data as normal, and we get some space back.

Overall, I would say our server is in good shape.

Thanks for the fast and competent answers.

KSchroeder's picture

@stobler,
Can you open a new thread for your question and post a link to it here?  MBordelon was the original poster and you have kind of hijacked that thread.  We can continue troubleshooting in the new thread.  I will tell you though that the "old" BLOBs are perfectly normal and expected.  I can go into more detail on a new thread.

Thanks,
Kyle

stobler's picture

I have created a new thread,
called "How can i remove obsolete Blob files".

In addition, I removed the question from my last answer, so that the original thread stays on topic.

Anyway, thanks for the good, fast and competent answers.

KSchroeder's picture

Marvin,
Did you ever get this sorted out?

Thanks,
Kyle