
Intermittent archived item retrieval failure

Created: 16 Feb 2013 • Updated: 05 May 2013 | 18 comments
Francis-T's picture
This issue has been solved. See solution.

Hi,

I have an intermittent issue that seems to be occurring a few times a week. No services stop, but a restart of the services or the server gets things running again.

There seem to be 3 Enterprise Vault events (all with a Task Category of Web Application WP) on the EV server that tie in with when archived item retrievals start to fail.

In order, they are:

AutoStorageOnline error.

Reason: Not enough storage is available to process this command. [0x80070008]

Reference: [CTSOL]

AutoStorageOnline error.

Reason: Not enough storage is available to process this command. [0x80070008]

Reference: [CO]

Unable to fetch item from "EVSERVER1.domain.wan".

Reason: Not enough storage is available to process this command. [0x80070008]

Saveset Id: 201302090437169~201301101307220000~Z~C123407DDFF16F5881C3C3D29B10AAC1

Archive Name: user.name

Archive Folder Path: ?Inbox

Reference: [GOAFS]

We found a couple of KBs that talk about free disk space, which isn't an issue on this server, and about the number of items in %TEMP% locations, but those have been cleared down as well.

Environment is 2 x EV 8.0.5.1048 servers: one for the journal and one for the user archives. This issue is on the server used for user archives.

OS is 2008 SP2 x64.

There are other issues relating to backups and the time they are taking (2-3 days for a full at present), still an ongoing piece of work to address this.

Exchange is 2007 SP3 RU6 on a 2003 R2 x64 cluster

SQL is 2005 on a 2003 R2 x64 cluster

Beyond the couple of KBs that didn't get us anywhere, I'm now at a bit of a dead end for what to look at next.

Hoping to get a DTrace when trying to retrieve items the next time it happens, and I'll look to log a support call as well if I'm not getting anywhere.

Anything else I should be looking at, or anything obvious I've missed?

Cheers


JesusWept3's picture

OK, so when it's talking about "Not enough storage is available", it's really talking about not enough memory.
So when it happens, open up Task Manager, have it show handles and memory too (including non-paged pool), and then see if anything's taking excessive amounts of memory or file handles.
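
For anyone who wants to check that reading of the error, the code quoted in the events can be decoded by hand. Below is a minimal Python sketch, assuming the standard HRESULT bit layout (top bit = failure, an 11-bit facility field, low 16 bits = error code). The function name and the two small lookup tables are made up for illustration and only cover the values seen in this thread.

```python
# Illustrative lookup tables: only the values relevant to this thread.
FACILITY_NAMES = {7: "FACILITY_WIN32"}
WIN32_ERROR_NAMES = {8: "ERROR_NOT_ENOUGH_MEMORY"}


def decode_hresult(hresult: int) -> dict:
    """Split an HRESULT into its failure bit, facility and error code."""
    value = hresult & 0xFFFFFFFF          # treat as unsigned 32-bit
    facility = (value >> 16) & 0x7FF      # bits 16-26
    code = value & 0xFFFF                 # bits 0-15
    return {
        "failure": bool(value >> 31),     # bit 31 set means failure
        "facility": facility,
        "facility_name": FACILITY_NAMES.get(facility, "unknown"),
        "code": code,
        "code_name": WIN32_ERROR_NAMES.get(code, "unknown"),
    }


if __name__ == "__main__":
    # The code from the AutoStorageOnline events in the original post.
    print(decode_hresult(0x80070008))
```

For 0x80070008 this gives facility 7 (a wrapped Win32 error) and code 8, which is ERROR_NOT_ENOUGH_MEMORY, so the "storage" in the message text does point at memory rather than disk space.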

So for instance, you might just be seeing a slow MAPI leak in the archive task, and restarting it clears that memory, which is why it starts working again.

Now, if it does happen to be a memory leak, it's very doubtful you will get a hotfix. You will probably be given a workaround, such as restarting the tasks once a day, or restartallmapitasks type registry keys to try and stem the problem.

Francis-T's picture

You're quite right. 

I had looked at system handles at one point and forgotten about it completely. 

About 2 hrs after a services restart, a process called something like storageopenns.exe (forgotten the name and not in a position to check at the minute unfortunately) was up around 16000 handles. I'd normally be suspicious of anything over 5000 so did mean to look at it closer. 

I googled it at the time and found it was a process called during retrieval. I meant to check further whether it was a likely cause or just a symptom during the backup window.

Francis-T's picture

Also compared the same process on another site and it was down around 600, but outside of their backup window. 

Francis-T's picture

Caved in and logged on to check, process is StorageOnlineOpns.exe (32bit).

It's up to 23000 handles now & 2946K non-paged pool memory.

Gave the archiving task a quick restart, but that didn't affect it. Going by the name, I'd guess I need to restart the Storage service, but that's going to impact the backup window, which possibly has another 2 days to run!

JesusWept3's picture

hmmm so storageonlineopns is used for opening items from the vault
things that will invoke it are:
PST Exports through the VAC
PST Exports through Discovery Accelerator
Vault Cache builds from clients
Opening items through Browser Search and Archive Explorer
Opening items through shortcuts in the users mailbox
Index updates and rebuilds
Move Archive from one vault store to another store or to another ev site

So I guess the question is: do you use Vault Cache and Virtual Vault in your environment?
Do you have any index rebuilds going on?
Do you use Discovery Accelerator in your environment?
What is the EV server used for? Mailbox archiving? Journaling? Both?
Are you doing any move archive operations?

Francis-T's picture

No Index rebuilds or move archive operations.

Don't use Vault Cache, Virtual Vault or Discovery Accelerator.

This EV server is for Exchange mailbox archiving and it also has a SharePoint archiving task, but the SharePoint task is configured not to run against the site schedule or its own schedule.

Journal archiving is completed via a 2nd EV server.

There would be almost no users online currently, so the only thing that should be actively hitting EV would be the backups.

The email archive would be around 2.5TB currently, partitions have been closed in the past, but I couldn't honestly say how big the current open partition is.

Something else being looked at is the requirement for leavers' archives. There would be considerable benefit to taking them out of EV and taking a one-off backup of exported PSTs. It would reduce both the size and the volume of small items that are killing the backup currently, unless we magic up a better way to improve the backups, that is.

Francis-T's picture

Backup wasn't a full, so it's finished already.

StorageOnlineOpns.exe is still high as far as resources go.

23,801 handles

2946K NP Pool (closest is down at 430K)

Server itself is under almost no load, 2% CPU & 3GB used from 16GB total

Francis-T's picture

Reports from users that they were starting to have issues again, nothing in the Event Logs this time (possibly caught it earlier this time?).

Restart of the services and all is grand again, StorageOnlineOpns.exe is down to 1544 handles and 184K non-paged pool memory.

I'll see about getting a call logged now

JesusWept3's picture

Hmmm, OK, so you should probably concentrate on the application pool, making sure that it's set to recycle itself often enough.

Francis-T's picture

Symantec support recreated the IIS EnterpriseVault directory.

Keeping an eye on it now, if the fault re-occurs I've to confirm if the site becomes inaccessible as well and any error it presents.

Ryan Seymour's picture

Hi Francis-T

Has this issue been resolved since the recreation of the IIS EnterpriseVault Directory? We have the exact same issue on our EV FSA Server with the retrieval of Placeholders. I am interested to know so that we can do the same on this server and see if it fixes the problem before we go the route of logging a call with Symantec Support.

Many thanks,

Francis-T's picture

Unfortunately not Ryan.

It does appear to have improved it slightly though. Previously it would happen every few days; after the recreation of the EnterpriseVault directory we got 6 days before it reoccurred.

In that time we could see StorageOnlineOpns.exe resources slowly creeping up as far as handles and non-paged pool.

Reached 24,000 handles and 2,900K NP Pool before it started failing.
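
As a rough way to turn that slow creep into a time estimate, a straight line can be fitted to a few handle-count readings and projected forward to the level where failures start. This is only a sketch: it assumes the growth is roughly linear, the function name is made up, and the sample readings in the usage lines are invented for illustration, not measurements from this server.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (hours since service restart, handle count)


def hours_until_threshold(samples: Sequence[Sample],
                          threshold: float) -> Optional[float]:
    """Estimate hours from the last sample until handles reach threshold.

    Uses an ordinary least-squares line through the samples. Returns None
    when there is too little data or the count is not growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples taken at the same time
    slope = sum((t - mean_t) * (h - mean_h) for t, h in samples) / var_t
    if slope <= 0:
        return None  # flat or shrinking, no crossing to predict
    intercept = mean_h - slope * mean_t
    crossing = (threshold - intercept) / slope
    return max(crossing - samples[-1][0], 0.0)


if __name__ == "__main__":
    # Invented readings, one per day, purely to show the call.
    made_up = [(0, 1500), (24, 5250), (48, 9000)]
    print(hours_until_threshold(made_up, 24_000))
```

If the real readings turn out to jump during a particular window (a backup, for instance) rather than climb steadily, a straight-line fit will mislead and the readings are better looked at per window.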

At that time, the EV site was still accessible, and restoring an item from the vault back to the mailbox worked (as opposed to just double-clicking an item to view it in the vault).

EV searches also worked.

Archive Explorer lists items, but gives an error when retrieving.

Managed to get a DTrace as well which is now in the hands of Symantec support to hopefully give them some indication of what is going on.

Francis-T's picture

I'm to apply http://www.symantec.com/docs/TECH35691 (AV exceptions were already in place, so just the MSMQ reg key) and rerun the EV Exchange permissions script tonight, will see what happens after that.

Ryan Seymour's picture

Hi Francis-T

To add my experience to this, we upgraded from 9.0.2 to 10.0.3 on the weekend of the 9th and since then we have not had so much as a spike on the storageonlineopns.exe process on our FSA EV server. This appears to have fixed it but we are only 1 week in and have had this length of up time in the past. However we would have seen the process spike to thousands of handles and 700 threads so I am very optimistic at this point.

Not sure what level of support you have with your EV vendor and if this upgrade is available to you but maybe chat to the guys that are handling your support ticket and see what they think?

Francis-T's picture

Morning Ryan,

Had almost forgotten about this thread, still having intermittent issues with EV though, the previous fixes did nothing.

Do believe Symantec Support have now given us the root cause of the issue though, and it does tie in with your fix;

http://www.symantec.com/docs/TECH155407

Do you have a 64bit SharePoint web front end that is archived by EV?

I was getting hung up on the EV backup, but now that Symantec highlighted the KB and SharePoint interaction, that would explain our issue.

We've been monitoring the handles of storageonlineopns.exe proactively and getting alerts when it hits 20,000, that way we can restart the service and keep EV running before users start to have issues.
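
A check along those lines could look something like the sketch below. It is not the monitoring actually used here. It assumes the third-party psutil package is installed and that the script runs on the Windows EV server (psutil's num_handles() is Windows-only). The function names are made up, and the 20,000 figure is simply the alert level mentioned above.

```python
from typing import Optional

PROCESS_NAME = "StorageOnlineOpns.exe"
ALERT_THRESHOLD = 20_000  # alert level quoted in the thread


def find_handle_count(process_name: str) -> Optional[int]:
    """Return the handle count of the first matching process, or None.

    Assumption: psutil is installed and this runs on Windows.
    """
    import psutil  # third-party, imported here so the rest works without it

    for proc in psutil.process_iter(["name"]):
        if (proc.info["name"] or "").lower() == process_name.lower():
            try:
                return proc.num_handles()
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                return None
    return None


def should_alert(handles: Optional[int],
                 threshold: int = ALERT_THRESHOLD) -> bool:
    """True when a handle count was read and it has reached the threshold."""
    return handles is not None and handles >= threshold


if __name__ == "__main__":
    count = find_handle_count(PROCESS_NAME)
    if should_alert(count):
        print(f"ALERT: {PROCESS_NAME} has {count} handles")
    else:
        print(f"OK: {PROCESS_NAME} handle count is {count}")
```

Run from a scheduled task every few minutes, the ALERT line could be hooked into whatever mail or paging mechanism is already in place. Restarting the service is left as a manual step on purpose, since it can collide with the backup window.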

In the meantime we're just confirming the trend of its handles spiking during the SharePoint backup, but ultimately it does look like we'll need to upgrade.

SOLUTION
John Santana's picture

I was facing this issue as well, but with the FSA agent on my file server. The fix was to re-register ASP.dll and then restart the IIS service.

hope that helps.

Kind regards,

John Santana
IT Professional

--------------------------------------------------

Please be nice to me as I'm newbie in this forum.

Rob.Wilcox's picture

That particular issue relating to 64 bit SharePoint was fixed quite some versions ago, so as with anything it's good to upgrade to the latest (or nearly the latest) versions of the product.

Ryan Seymour's picture

Hi All

The article mentioned here by Francis-T explains the issue we have seen 100%. Our FSA server is also running our SharePoint archiving. Our SharePoint farm is running on 2010 and our 2 front ends are 64-bit servers. I am certain that had we come across that article earlier and applied the patch mentioned to the relevant servers on our previous 9.0.2 version of EV, it would have resolved the memory leak we were seeing with the StorageOnlineOpns.exe process.

Thanks to all for their input here. It has been extremely valuable and I hope this post helps others out there that may come across the same issue in time.