
Recovery Solution scheduled backup issues - RPC Service Unavailable

Updated: 24 Aug 2010 | 7 comments
Bob Schierholz
This issue has been solved. See solution.

I am trying to resolve some ongoing issues we are having with scheduled snapshots not happening, or only happening sporadically.  Many of the computers have this error in the AeX RSA Events on the console:

Error accessing the Recovery Solution Server. Cannot create interface to object.   Local Event Code: 0xc00c002b System Event Code: 0x800706ba: The RPC server is unavailable.   

We currently schedule our backups between 1 and 6 AM every Tuesday.  We have ~1500 total machines.  Is it possible that we are overwhelming our server with requests and causing the above error?  Is there a best practices document for scheduled snapshots?

Comments

KSchroeder, 18 Mar 2010

Server overload

Bob,
Could be.  Do you also utilize the "Enable snapshot on idle" option?  What are your maximum concurrent snapshot settings configured for in the Agent settings?  Do you have the option "Allow agent to initiate scheduled snapshot" turned on?  If so, I would turn that off.  If all the Agents hit the RS at 1 AM trying to start their snapshot, I could see this happening.  In some cases the RS service might crash (we have had this occur on a cluster with 1 server and ~2000 clients trying to "snapshot on idle" at ~3:00 AM).

What RS Solution/Server/Agent version are you running?  If you haven't already, I would recommend updating to the latest 6.2 SP3 of RS 6.x.  There are also several hotfixes available.

There are several Best Practices documents available:
AKB47687 (PDF from a previous year's ManageFusion conference)
AKB26764 (Word document; a bit old but still good information)
AKB45676 (RS 6.2 SP3 hotfixes)

Thanks,
Kyle
Symantec Trusted Advisor

For Forum threads, please click "Mark as Solution" if answered.
For all content, please give a thumbs up if you agree with or support the post.

Bob Schierholz, 18 Mar 2010

We set the simultaneous snapshots to 10 and we are running 6.2 SP3.  After some more digging since I posted yesterday, I found the following in the Administrator's Guide.

 

As soon as user computer goes into idle state
Check this box to have Recovery Solution start the snapshot scheduled for each day as soon as the protected computer goes into an idle state. A snapshot is started only if there is one scheduled to take place at the same time or later on the same day, or if the end time for the snapshot scheduled for the previous day has not yet been reached. A snapshot is not started if its scheduled time has passed, even if one did not take place during the scheduled time period. A snapshot runs only once each day, no matter how many times the computer goes into an idle state.
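
The rule quoted from the guide can be sketched as a small decision function. This is only one reading of that paragraph, not Recovery Solution's actual logic: the `Schedule` mapping, the function name, and the use of the calendar date for the "once each day" check are all assumptions made for illustration.

```python
from datetime import date, datetime, time, timedelta
from typing import Dict, Optional, Tuple

# Hypothetical representation: weekday (0 = Monday) -> (window start, window end).
# An end time earlier than the start time means the window runs past midnight.
Schedule = Dict[int, Tuple[time, time]]


def idle_snapshot_allowed(now: datetime, schedule: Schedule,
                          last_snapshot: Optional[date] = None) -> bool:
    """Return True if going idle at `now` should start a snapshot."""
    # At most one snapshot per calendar day (a simplification).
    if last_snapshot == now.date():
        return False

    # Case 1: today's window has not finished yet (running or still to come).
    today = schedule.get(now.weekday())
    if today is not None:
        start, end = today
        if end < start or now.time() <= end:
            return True

    # Case 2: yesterday's window ran past midnight and has not ended yet.
    yesterday = schedule.get((now - timedelta(days=1)).weekday())
    if yesterday is not None:
        start, end = yesterday
        if end < start and now.time() <= end:
            return True

    return False


tuesday_only = {1: (time(1, 0), time(6, 0))}                        # Tuesdays, 1-6 AM
evening_daily = {d: (time(19, 0), time(23, 59)) for d in range(7)}  # every evening

wednesday_10am = datetime(2010, 3, 17, 10, 0)
print(idle_snapshot_allowed(wednesday_10am, tuesday_only))   # False
print(idle_snapshot_allowed(wednesday_10am, evening_daily))  # True
```

Under this reading, a Tuesday-only early-morning window leaves the idle rule inactive for the rest of the week, while a late-evening window on every day keeps it active from midnight until the window closes.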

The way this reads to me is that, since our scheduled snapshot runs during the 6 hours in the early morning once per week and we only allow 10 connections at a time, some machines never even get a chance to start their snapshots.  And if a machine doesn't start its snapshot during this time period, it will not try again until the next week, when it is scheduled again.  But if we changed our schedule to be the last 5 hours of the day, then this idle state rule would be in effect all day long.  Does that sound like it could be part of our issue?  It seems, though I haven't been able to verify it, that if a machine tries and then fails to get a snapshot during that time, the server automatically sets up another scheduled attempt at a later time, but that is for failed attempts, not for machines that didn't get a chance to try.
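
A rough back-of-the-envelope check of that suspicion, as a sketch only. It takes the 1 AM to 6 AM window as 5 hours and the limit of 10 simultaneous snapshots from the posts above; the per-snapshot durations are assumed values, not measurements from this environment, and the helper name is made up for illustration.

```python
def machines_served(window_minutes: int, concurrent_slots: int,
                    minutes_per_snapshot: int) -> int:
    """Upper bound on snapshots that can finish in one window,
    assuming every slot is busy for the whole window."""
    rounds = window_minutes // minutes_per_snapshot
    return concurrent_slots * rounds


window = 5 * 60   # 1 AM to 6 AM
slots = 10        # simultaneous snapshot limit from the post above
total = 1500      # approximate machine count from the original question

for minutes in (10, 15, 30):  # assumed snapshot durations
    served = machines_served(window, slots, minutes)
    print(f"{minutes} min per snapshot: at most {served} of {total} machines")
```

Even at an assumed 10 minutes per snapshot, this bound is 300 machines per weekly window, well short of ~1500, which is consistent with many machines never getting a turn.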

KSchroeder, 19 Mar 2010

Hardware?

Bob,
What sort of hardware is your RS running on?  You should be able to bump up the maximum simultaneous snapshots quite a bit (at least 2-3x the current) without trouble; typically a lot of the snapshot time is spent checking for files that already exist on the RS, and not in actually transferring data to the storage.  Also be sure to set the manual maximum concurrent snapshots to be a bit higher than the scheduled; these numbers are not cumulative; if you set 20 scheduled and 20 manual, then you'll still only get 20 snapshots going.  In this case I would suggest setting the Manual to 25.
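
One way to picture "not cumulative" is a toy model in which both limits are checked against the same count of running snapshots. This is an assumption for illustration only, not the server's documented behavior; `can_start` and its parameters are hypothetical names.

```python
def can_start(kind: str, running_total: int,
              scheduled_limit: int, manual_limit: int) -> bool:
    """Toy admission check: each limit is compared against the same count
    of snapshots already running, so the two limits do not add together."""
    if kind not in ("scheduled", "manual"):
        raise ValueError(f"unknown snapshot kind: {kind}")
    limit = scheduled_limit if kind == "scheduled" else manual_limit
    return running_total < limit


# 20 scheduled snapshots are already running.
print(can_start("manual", 20, scheduled_limit=20, manual_limit=20))     # False
print(can_start("manual", 20, scheduled_limit=20, manual_limit=25))     # True
print(can_start("scheduled", 20, scheduled_limit=20, manual_limit=25))  # False
```

In this model, setting the manual limit a little above the scheduled one leaves a few slots free for user-initiated snapshots while the scheduled ones are saturated.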

If you really want the snapshots to only happen on your specified time frame, you should turn off snapshot on idle.  Also, any particular reason you only take one snapshot per week?  We run them M-F here, and typically they complete for LAN users within 10-15 minutes.  We set our default schedule from 11 AM ~ 11 PM so many users get their snapshot during lunch.

Thanks,
Kyle
Symantec Trusted Advisor


Bob Schierholz, 19 Mar 2010

Hardware

The hardware should be able to handle the simultaneous snapshots, but our management was concerned about bandwidth because we have a central RS server while our operations are spread out and have to come across a WAN to back up.  We were taking only one snapshot per week because of a misunderstanding about how the misc settings work.  The thought was that we would only "force" a snapshot once a week and the 15 min idle time would handle the stragglers the rest of the week.  The corporate culture here is very careful not to disrupt our users, so we sometimes make decisions that seem odd from the outside looking in.

We have decided to set up a new schedule for M-F with a window of 7 pm to 11:59 pm, with our snapshot limits set to 20/25 for nighttime and 10/15 for daytime.  We are also likely to use a separate schedule for all the machines outside of the headquarters, where the RS server is, so we can tighten down the throttling somewhat.

Thanks for the input, it has been very helpful so far.  I will let you know if it makes a difference.

KSchroeder, 19 Mar 2010

We have ~150 sites across the US that back up to our central RS servers.  The nice thing with this configuration is that it is unlikely that too many machines at a single site will kick off at the same time (if you enable snapshot on idle, SOI).  Also, take a look at some of your Snapshot History reports; in general the actual amount of data transferred per client probably won't be all that much.  With Redundant Block and Redundant File Elimination (RBE/RFE), the actual number of MB across the wire should be fairly low.

We have a similar culture; no forced reboots, etc.  We did enable the Agent option to let users schedule their own snapshots, for cases where some people have a very particular schedule they would like to use.  I guess for us part of it comes down to this: when a user's hard drive crashes, do you want to have their important documents from yesterday, or the ones from last week...

Thanks,
Kyle
Symantec Trusted Advisor


Bob Schierholz, 13 Apr 2010

Snapshots are happening consistently now

Snapshots have been happening consistently since I changed my backup schedule to 7 days a week, 7 pm to 11:59 pm, with a 15 min idle time for the rest of the day.  Now I am having issues with space.  I'll start a new thread for that.

KSchroeder, 13 Apr 2010

Great

Bob,
Glad you got the snapshot schedule sorted out and your RS has stabilized.  Let's see what we can do to keep your SSM working and recovering space (now that it is successfully being used! :) ).

Thanks,
Kyle
Symantec Trusted Advisor
