Random Backup Issue? - Error Code: e0008821

Created: 15 May 2006 • Updated: 21 May 2010 | 22 comments

Hello,

I have been having intermittent problems with Backup Exec on one Windows 2000 server. Over the last week, one daily backup job and one full weekly job both failed with the error message below:

Error message: Error code: e0008821
Job was recovered as a result of Backup Exec RPC service starting. No user action is required.

When this happens the server ends up becoming unresponsive, and at that point I am forced to reset it. After the reset I can run a backup job successfully without any issue.

The BE error log points toward the following Veritas URL: http://seer.support.veritas.com/docs/278271.htm

According to that document, the cause is the RPC service failing. I didn't manage to check whether the service was stopped the last time this happened, so I cannot verify that. The document also mentions that the problem appears when backing up a Lotus Domino server. We don't run Lotus, so I can't see how that could apply to our problem.
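For anyone who wants to record whether the service is actually stopped the next time a job hangs, here is a minimal Python sketch. It assumes the `sc` command-line tool is available on the server (it is not part of every Windows 2000 installation) and that the service is registered under the name `BackupExecRPCService`; both are assumptions, so confirm the real service name in the Services console first. The parsing function only reads the `STATE` line of `sc query` output.

```python
import re
import subprocess

# Matches the STATE line of `sc query` output, e.g.
# "        STATE              : 4  RUNNING"
STATE_RE = re.compile(r"^\s*STATE\s*:\s*(\d+)\s+(\w+)", re.MULTILINE)


def parse_state(sc_output):
    """Return the state word (e.g. 'RUNNING') or None if no STATE line exists."""
    match = STATE_RE.search(sc_output)
    return match.group(2) if match else None


def query_service(name):
    """Run `sc query <name>` and return the parsed state, or None on failure."""
    result = subprocess.run(
        ["sc", "query", name], capture_output=True, text=True
    )
    return parse_state(result.stdout)


if __name__ == "__main__":
    # Assumed service name; verify it on your own server before relying on it.
    print(query_service("BackupExecRPCService"))
```

Running this from a scheduled task shortly before and after the backup window would at least show whether the service state changes around the time of the failure.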

The document also says to install hotfix 17. Is there a way to find out which hotfixes are currently applied to BE? I am running BE version 10.0 revision 5484.

Any help would be great.

Thanks

Gauri Ketkar's picture

Hi,

Perform the upgrade, then run the same jobs again and check how they perform.


Symantec Backup Exec (tm) 10d rev. 5629 for Windows Servers Installation Files.
http://support.veritas.com/docs/279332


Step by Step installation instructions for the upgrade to Backup Exec 10d from Backup Exec 10.0
http://support.veritas.com/docs/281044

Please update us on the results and get back to us with any further queries.
Hope this helps.


Thank you
Gauri


NOTE: If we do not receive your reply within two business days, this post will be marked "assumed answered" and moved to the "answered questions" pool.

Adam Prior's picture

Hi Gauri,

I hadn't planned on going as far as upgrading BE, because I don't believe it is certain to solve the problem: I have seen posts from others with this problem who are already running 10d. Do you have any other ideas on what else I should look for?

I have been watching the backups over the last few days and they have been fine. I want to see how things go for at least the rest of this week, and then I'll likely log a call for this.

(Please don't mark this as closed, as it is an ongoing issue.)

Thanks.

Eric Randall's picture

I am having the same issue and not using Lotus...

Is this install/upgrade included in a normal support agreement? How can I easily check that I am entitled to it? I am running 10.0 build 5520 but am at the end of my rope and would willingly try 10d.

Last question: if I can upgrade, do I absolutely need to upgrade the agents on the servers being backed up?

Thank you!!

-Eric

Amruta Bhide's picture

Hello Adam,
We will await an update from you.

Eric,
An upgrade from 10.0 to 10.1 is usually a free upgrade. You can get in touch with our Sales department to check the validity of your license.

We recommend push-installing the new remote agents to all remote servers right after the upgrade.

If you have any further queries, we recommend opening a new forum post.

Hope that helps.

Thanks.

Scott Huntley's picture

Adam,
I just started having the same issue on a few of my servers and was wondering whether the upgrade to 10.1 resolved it. I was running rev. 5484 and upgraded one server to rev. 5520 this morning, but still got the same error.

I was wondering if you are also receiving the following error in your event log:

Event Type: Information
Event Source: Save Dump
Event Category: None
Event ID: 1001
Date: 5/24/2006
Time: 12:24:59 PM
User: N/A
Computer: -----------------
Description:
The computer has rebooted from a bugcheck. The bugcheck was: 0x000000d1 (0x00000060, 0x00000002, 0x00000000, 0xf72f007b). A dump was saved in: C:\WINDOWS\MEMORY.DMP.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
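To make the bugcheck line in that event easier to read, here is a small Python sketch that pulls the code and its four parameters out of the description text. Bugcheck 0xD1 is documented by Microsoft as DRIVER_IRQL_NOT_LESS_OR_EQUAL, with the parameters meaning: memory referenced, IRQL at the time, access type (0 = read, 1 = write), and the address that referenced the memory. The function names and the lookup table are illustrative only, and the sketch does not identify which driver is at fault; that needs the memory dump itself.

```python
import re

# Only the bugcheck code seen in this thread is listed here.
BUGCHECK_NAMES = {0xD1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL"}

# Matches "bugcheck was: 0x000000d1 (p1, p2, p3, p4)" in a Save Dump event.
PATTERN = re.compile(
    r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", re.IGNORECASE
)


def parse_bugcheck(description):
    """Return (code, [params]) as integers, or None if no bugcheck is found."""
    match = PATTERN.search(description)
    if match is None:
        return None
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


def describe(code, params):
    """Return human-readable lines for a parsed bugcheck."""
    name = BUGCHECK_NAMES.get(code, "unknown")
    lines = [f"bugcheck 0x{code:08X} ({name})"]
    if code == 0xD1 and len(params) == 4:
        access = {0: "read", 1: "write"}.get(params[2], "other")
        lines.append(f"memory referenced: 0x{params[0]:08X}")
        lines.append(f"IRQL at time of reference: {params[1]}")
        lines.append(f"access type: {access}")
        lines.append(f"address that referenced memory: 0x{params[3]:08X}")
    return lines


if __name__ == "__main__":
    text = (
        "The computer has rebooted from a bugcheck. The bugcheck was: "
        "0x000000d1 (0x00000060, 0x00000002, 0x00000000, 0xf72f007b)."
    )
    parsed = parse_bugcheck(text)
    if parsed:
        print("\n".join(describe(*parsed)))
```

A 0xD1 bugcheck generally points at a kernel-mode driver, which is a different symptom from a hang with no dump at all.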

Adam Prior's picture

Hi Scott,

I haven't upgraded my BE software yet. In fact, I haven't done anything apart from keeping an eye on the server. The backups have been fine for nearly two weeks now and I am waiting to see what happens. I chose not to upgrade because I noticed that lots of people with this problem were already on 10d and still had the issue. I will post back here again soon with an update.

I cannot verify whether I received that error in the event log, as I am unable to access the server until next week.

shweta rege's picture

Hello,


Could you update us on the issue?


Adam Prior's picture

Scott - If it's any help, the event log on my server doesn't list the bugcheck information you posted. I think the problem I'm getting is related to the RPC service failing somehow.

Adam Prior's picture

Dear Shweta,

Yes, I can update you. Unfortunately the problem happened again on Thursday of last week. I don't really have any more logs or symptoms to post here that might help. I plan to call technical support when I get a moment.

Shraddha Dhavale's picture

Hi Adam,

Please update us on this issue.

Did you get the issue resolved?


Thanks.


Adam Prior's picture

Hi,

I don't quite understand what you mean. Although I've not had any problem this week, I want to leave this thread open for now and contact support when the problem recurs. The pattern seems to be that it happens once every two weeks or so.

Deepali Badave's picture

Hello,

Please keep us updated on this issue.

Eric Randall's picture

We've found our issue. We've built an entirely pristine 10.1 5520 SP2 environment on all new hardware except the library, drives and tapes. This is a duplicate of the old environment that the service continually fails on. The old server could never see the library's loaded firmware revision. The Backup Exec RPC errors were just what I thought they were: the service was losing communication with the library for some reason.

With the new environment, the new install sees the library firmware. I'm getting better reporting and more accurate errors. What we've discovered is the library itself is the problem. The robotics are failing to fetch a cartridge repeatedly to change it after it is filled. With the new build, the library communicates this to the service and it no longer fails but correctly pauses the job and places the library offline.

End result, it's not BUE, it's the hardware. Mind you, both environments are set up with exactly the same drivers, firmware, etc.... the only MAJOR difference is I pond-skipped the LSI SCSI card and put in an Adaptec 39160.

Good luck!!

shweta rege's picture

Hello,


Thank you for the update.

Can we close the post now?



Adam Prior's picture

Hi Eric,

Glad you got your issue sorted. I've not had any further problems here in the last two weeks' worth of backups (given its existing failure pattern I'd expect it to happen this week, but it hasn't yet).

We don't have a robotic tape library; we use a single AIT2 tape drive. I guess finding some BE logs to see what happens when this occurs would be helpful, if I knew where to find them.

Shweta - No, please don't close this. I am the thread starter and my problem has not been confirmed as solved.

Gauri Ketkar's picture

Hi Adam,


If this is happening once every two weeks or so, run SGMON before the job starts and check the errors recorded in the text file SGMON.log.

http://support.veritas.com/docs/190979
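Since the failure only shows up every couple of weeks, the resulting log may be long. Below is a minimal Python sketch for pulling out lines worth reading first. It assumes SGMON.log is a plain-text file with one message per line, which is an assumption about the format, and the keyword list (the error code from this thread plus a few generic words) is purely illustrative.

```python
import re

# Illustrative keywords only; adjust to whatever the log actually contains.
KEYWORDS = re.compile(r"e0008821|error|fail|timeout|timed out", re.IGNORECASE)


def find_suspect_lines(lines, pattern=KEYWORDS):
    """Yield (line_number, text) for every line that matches the pattern."""
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            yield number, line.rstrip("\n")


def scan_file(path):
    """Scan a log file on disk and return the matching lines as a list."""
    # errors="replace" keeps the scan going if the file has odd bytes in it.
    with open(path, "r", errors="replace") as handle:
        return list(find_suspect_lines(handle))


if __name__ == "__main__":
    # Hypothetical location; point this at wherever SGMON wrote its log.
    for number, text in scan_file("SGMON.log"):
        print(f"{number}: {text}")
```

The lines just before the first match are often as useful as the match itself, so treat the output as a pointer into the log and not a replacement for reading it.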


Please update us on the results and get back to us with any further queries.
Hope this helps.


Thank you
Gauri

Adam Prior's picture

Hi Gauri,

Thanks for your reply. Just to update this thread, the problem recurred on Monday evening, 12-06-06. The machine appeared to be still functioning and pingable. The job was stuck in the running state and would not cancel. I then attempted to stop the BE services; most of them stopped, but the BE engine kept timing out. At that point the server locked up and a reset was needed. I have no further logs in the Event Viewer to go on: no bug checks or dumps. I reran the backup the same day with no problem, and it has been OK all week. There seems to be a pattern in that it happens every two to three weeks at the moment.

Incidentally, I ran the SGMON application on the server last night, but the job completed, so I don't think anything helpful was logged.

Also, is anyone else still having this problem running BE 10.1 (10d), and are you using a single tape drive for backup? If anyone is running version 10 rev. 5484 and having similar experiences, I'd like to hear from you.

Cheers

Asma Tamboli's picture

Hi Adam,

Have you installed the latest device drivers for the version of Backup Exec you are using? If not, please update the drivers. Also, stop the RSM service. Does the issue also occur when you run backups to a backup-to-disk folder?




Adam Prior's picture

Asma,

I upgraded the tape drivers this week to the latest version from the manufacturer's website. The Removable Storage service is running on all of the servers running BE products. Why should this service be stopped?

Thanks.

Asma Tamboli's picture

Hi Adam,

RSM can often conflict with the device drivers. Hence, as a troubleshooting step, we recommend stopping and disabling this service. See document http://support.veritas.com/docs/231679

Has the issue recurred since the drivers were updated?
