
Single Backup Job Failing: System State - 1 item skipped

Created: 18 Mar 2013 | 8 comments

Hello all.

I've suddenly encountered an issue causing a single backup job to fail.  This backup job has been running for probably 6+ months without issue, and we have not recently installed or changed anything on the server that is encountering the failure.  The backup job in question (like all of our backup jobs) was recreated from scratch after migrating to BE2012.

BE2012 - (latest version / fully patched) - is running on Win 2k8r2 svr.
The server having failed back-ups is [Win 2k3 (32-bit) acting as a Domain Controller].
As previously stated the issue began suddenly about 3 weeks ago with no known environmental changes that would have been a cause.

There are no entries in the Windows event logs at the time the backup job fails that correspond to the failure.
There are informational entries in the event logs from the VSS service within the backup job's time-frame, and these entries do not show any errors.  The messages are: 'lsass (476) Shadow copy 1 freeze started' & 'lsass (476) Shadow copy 1 freeze stopped'.
I also see in the BE Job Log that data was backed up.  When comparing the amount of data backed up by a failed job to the amount backed up by a job from before the failures began, the amounts are close, within a few bytes / KB.  Also, the verification step for the failed backup job verifies the backed-up data to within a few bytes / KB.
(I suspect the minor byte differences may be due to differences in drive architecture/block-size between the source and destination volumes.)

The failed backup job's logs specifically cite System State as the area where the failure is occurring.  I am receiving the generic and ever-so-cryptic message - 'V-79-57344-33928 - Access Denied. Cannot backup directory  and its subdirectories.'  I have enabled the additional path/file logging in BE for all jobs by default in hopes of catching the offending item; however, even with the expanded path/file logging the job results are very broad and non-descriptive.
I believe there is a space within this message where the offending directory should be listed, but it is not in this case.

The job log displays the following in the job details section:
Directory \
Shadow Copy Writer Active Directory
Shadow Copy Logical Directory Active Directory\C:_WINDOWS_NTDS
Shadow Copy Component ntds
Shadow Copy Writer COM+ Class Registration Database
Shadow Copy Component COM+ REGDB
Shadow Copy Writer Event Logs
Shadow Copy Component Event Logs
Shadow Copy Writer Registry
Shadow Copy Component Registry
Shadow Copy Writer System Files
Shadow Copy Component System Files
Shadow Copy Writer SYSVOL
Shadow Copy Logical Directory SYSVOL\SYSVOL
Shadow Copy Component 0ddf9507-77af-419a-a6c2e0b0802bcbaa
Shadow Copy Writer Windows Management Instrumentation
Shadow Copy Component WMI
V-79-57344-33928 - Access Denied. Cannot backup directory  and its subdirectories.

The job log also displays the following in the job summary section:
Backed up 14 System State components
1 item was skipped.
Processed 1,916,355,897 bytes in  # minutes and  ## seconds.
Throughput rate: YYY MB/min
Compression Type: Software
Software compression ratio: X:X

The job logs (Detail and Summary) for a successful backup job on the same server (pre failures beginning) display exactly the same items with the only exception being the successful backups do NOT show '1 item was skipped' in the summary section.
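
The side-by-side comparison described above can also be done mechanically. Below is a minimal Python sketch, assuming both job logs have been saved as plain text in the same layout as the excerpt quoted earlier. The function names are hypothetical and are not part of Backup Exec.

```python
def component_lines(job_log_text):
    """Collect the 'Shadow Copy ...' detail lines from a job log pasted as text."""
    return [line.strip() for line in job_log_text.splitlines()
            if line.strip().startswith("Shadow Copy ")]


def compare_job_logs(good_text, bad_text):
    """Report detail lines that appear in only one of the two job logs."""
    good = set(component_lines(good_text))
    bad = set(component_lines(bad_text))
    return {
        "only_in_good": sorted(good - bad),
        "only_in_bad": sorted(bad - good),
    }
```

If both lists come back empty, as the description above suggests they would here, the skipped item is not visible at the level of detail the job log records.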

I have performed the following troubleshooting:

  • Checked win event logs for corresponding (or even quasi-related) items and found nothing definitive.
  • Checked the 'vssadmin list writers' output for issues - all writers listed return 'Stable / No_Errors'
  • Checked the RAWS\Logs directory on the server in question and found nothing definitive in the way of errors.  There are several XML files in this directory that seem to correspond to the above-mentioned VSS writers, with the exception of one .xml file - for '{GUID}-BITS Writer' - that does not appear in the list of VSS writers returned by vssadmin.exe.  I've checked other servers that have NOT encountered any backup failures, and the same {GUID}-BITS Writer.xml file exists in their RAWS\Logs directory and is also NOT included in the list of VSS writers returned by vssadmin.exe.  Based on this I am assuming that the missing VSS BITS Writer is NOT the cause of the issue.
  • Tested the credentials - all passed, no issues.
  • Test-run the job in question - Test runs succeed without issue
  • Created a new one-time backup-to-disk job with System State and the C: volume selected.  The one-time backup completed successfully and without issue.  I had the logging turned up to folder/file details before creating this one-time test backup.  The job log for System State (both detail and summary) is exactly the same as with the failed job.  Again, the only difference between them is that the successful job does NOT mention '1 item skipped'.
  • I've been through most if not all of the other forum posts and Symantec KB articles having to do with this issue.  In reviewing these posts/articles I have not found any definitive misconfigurations or other discrepancies that can be attributed to the failure.  I hesitate to begin installing arbitrary MS hotfixes/patches without proof that the issue the hotfix addresses is present on the system in question.
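
For anyone repeating the 'vssadmin list writers' check from the list above across several servers, here is a minimal Python sketch that parses saved output. It assumes the English-language layout ("Writer name: '...'", "State: [1] Stable", "Last error: No error") and that the output was redirected to a text file first. The function names are hypothetical.

```python
import re


def parse_vss_writers(output):
    """Turn saved 'vssadmin list writers' text into a list of dicts."""
    writers = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name:\s*'(.*)'$", line)
        if match:
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy_writers(writers):
    """Writers whose state is not Stable or whose last error is not 'No error'."""
    return [w["name"] for w in writers
            if w["state"] is None
            or "Stable" not in w["state"]
            or w["last_error"] != "No error"]
```

An empty result only confirms what the manual check showed; it does not rule out a failure that happens while the snapshot is being read.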

So, I am having the damnedest time figuring out WHICH ITEM WAS SKIPPED...  It would help if I could turn on some kind of additional logging at the agent on the server where the job is failing.

  • If someone has some insight on how to identify the skipped System State component, please point me in the correct direction.
  • Additionally, if anyone has any idea how I might turn up some additional logging for this particular job or agent on the server on which the job is failing, please let me know.
  • I wish to try to identify the actual failing operation including the error message that is being returned to the BE Agent by the OS (if possible).

This job only runs once a week.

Thanks in advance,


OP's picture

Have you tried to enable the debuglog? Via registry?

Colin Weaver's picture

If there is a Unix-style / instead of a Windows-style \ in a registry path to something deemed part of System State, then you might get this problem. The only way to find it is to run an agent debug as mentioned by OP, go through the debug log looking for the line that identifies the invalid path, and then use that to search your registry (possibly consulting any relevant 3rd party software vendor in the process).

M Strong's picture

Thank you both for the thoughts.

OP:  I've found this article:  http://www.symantec.com/business/support/index?page=content&id=TECH23853
I will set this up before this coming weekend's backups and see what those logs provide.
I will attempt to update the thread with my findings.
Thank you.

Colin:  I had thought about that, however I am not sure where in the registry I would go searching for a list of those components (looks like that information could be spread out quite extensively).

I am curious though, as to why a NEW test-backup job that contains both [System State] and [System Volume] - which is recognized by BE as being enough for a 'SDR Backup' - completes successfully, citing all of the same 14 Shadow copy components (BE logging set to include path/file details).
However, the original backup still fails, listing the same 14 components but stating that 1 was skipped.

RE:  New Test-Backup Job:
The server in question also has a large file-store volume that is included in the original (failing) backup job.  I have excluded the file-store volume from the test job, as it is not necessary for a successful SDR backup and the issue seems to be specifically with System State.

Since the test job includes [System State] & [System Volume] and successfully completes an SDR Backup, I would think it safe to assume that the [System State] data does not refer to anything on the file-store volume, and as such it was safe to exclude the file-store volume from the test backup.  If this assumption is incorrect, please by all means let me know.

What are the chances of something like the original (failing) backup job becoming corrupted?  (Selection list gone bad..).  I've checked, emptied and re-selected the selection list for the job in question, which did not seem to make a difference; the next run of the job still failed citing the exact same issue...

I had thought about deleting and re-creating the job, however I wanted to exhaust other possibilities - such as identifying the errant file(s) - first.

Thank you all for the assistance!

-M

Colin Weaver's picture

If it is the Unix-style / character, then you need the path from the error in the debug log first. Then you try to find that path in the registry, probably starting with the area that holds the service details and the files that each service uses to start.
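
As a rough illustration of the registry check described above, here is a minimal Python sketch. It assumes the relevant values are the ImagePath entries under HKLM\SYSTEM\CurrentControlSet\Services; that location is an assumption, since the debug log has not yet named a path. Forward slashes used as command-line switches (for example "/service") are legitimate, so only the executable portion of each value is inspected. The function names are hypothetical, and the registry reader runs only on Windows.

```python
import re

# Executable portion of an ImagePath-like value: either a quoted path,
# or everything up to the first .exe/.sys/.dll extension.
_EXE_PART = re.compile(r'^\s*(?:"([^"]*)"|(.*?\.(?:exe|sys|dll)))', re.IGNORECASE)


def executable_part(image_path):
    """Return the path portion of the value, without trailing arguments."""
    match = _EXE_PART.match(image_path)
    if match:
        return match.group(1) if match.group(1) is not None else match.group(2)
    return image_path.strip()


def find_forward_slash_paths(entries):
    """entries maps a service name to its ImagePath string."""
    return [(name, value) for name, value in sorted(entries.items())
            if "/" in executable_part(value)]


def read_service_image_paths():
    """Windows only: read ImagePath for every service key (assumed location)."""
    import winreg
    entries = {}
    root = r"SYSTEM\CurrentControlSet\Services"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, root) as services:
        for i in range(winreg.QueryInfoKey(services)[0]):
            name = winreg.EnumKey(services, i)
            try:
                with winreg.OpenKey(services, name) as svc:
                    value, _value_type = winreg.QueryValueEx(svc, "ImagePath")
            except OSError:
                continue  # no ImagePath value, or the key is not readable
            entries[name] = str(value)
    return entries
```

On the affected server this would be run as find_forward_slash_paths(read_service_image_paths()). Any hit is only a candidate to compare against the path reported in the debug log.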

M Strong's picture

Ok, got a sample debug log from the remote agent on the server in question.  I've isolated the portion of the log data pertaining to the System State part of the failing backup job, based on the 'beginning job' and 'ending job' time-stamps from the job log.

I parsed this for anything that would stand out as being the problem (the v-xx-xxxxx-xxxxx code, a 0x05 code, the string 'Access Denied', and a few other items) and did NOT find anything that stood out as a critical failure.
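
The search described above could be scripted along these lines. This is a minimal Python sketch under stated assumptions: the three patterns are only the markers listed in the post, the names are hypothetical, and the debug log's encoding is not known, so the caller decides how to open the file and passes in the lines.

```python
import re

# Markers taken from the post: the BE error code format, a Win32
# access-denied value (0x5 with any zero padding), and the literal text.
MARKERS = {
    "be_error_code": re.compile(r"V-\d+-\d+-\d+"),
    "win32_access_denied": re.compile(r"\b0x0*5\b", re.IGNORECASE),
    "access_denied_text": re.compile(r"access\s+(?:is\s+)?denied", re.IGNORECASE),
}


def find_markers(lines):
    """Return (line_number, marker_label, line) for every marker hit."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in MARKERS.items():
            if pattern.search(line):
                hits.append((number, label, line.rstrip()))
    return hits
```

Passing an open file object works because it iterates line by line. Opening it with errors="replace" avoids a crash on unexpected bytes.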

I searched for - and found - 14 instances of the statement:
--Informational: calling IVssBackupComponents::SetBackupSucceeded with status 'SUCCESS (0x00000000)' for Component 'XXX' in SHADOW::CloseComponent
(Where 'XXX' represents the name of the Shadow Copy component that completed successfully.)
The number of instances (14) matches the number of shadow copy components as listed by the job log.  Still nothing specifically citing 'Item Skipped'
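
The count above can be reproduced, and checked for any status other than SUCCESS, with a short script. This is a minimal Python sketch that assumes the log lines follow the exact wording quoted above. The function names are hypothetical.

```python
import re

# Matches the wording of the CloseComponent line quoted in the post.
CLOSE_COMPONENT = re.compile(
    r"SetBackupSucceeded with status '([^']*)' for Component '([^']*)'"
)


def component_statuses(lines):
    """Return (component, status) for every SetBackupSucceeded line."""
    results = []
    for line in lines:
        match = CLOSE_COMPONENT.search(line)
        if match:
            status, component = match.groups()
            results.append((component, status))
    return results


def non_success(results):
    """Components whose reported status does not start with SUCCESS."""
    return [(c, s) for c, s in results if not s.startswith("SUCCESS")]
```

If len(component_statuses(lines)) is 14 and non_success(...) is empty, the skipped item was presumably counted somewhere other than these per-component calls.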

Now, all of that being said, I did find one row in the debug log that stands out as odd, although I can't isolate anything corresponding in the Windows registry.  The log entry had Unicode(?) characters that, once copied into MS Excel, showed as Asian-language characters.

  • Informational: Object:'\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy6\windows\sysvol\staging areas\LBUGROUP.LOCAL' MSFT Reparse Tag:'IO_REPARSE_TAG_MOUNT_POINT (0xA0000003)' and target is [C:\WINDOWS\SYSVOL\staging\domain]

EDIT: Had to remove the Asian-language characters; the Symantec forum editor control would not allow them.

I cannot be certain whether what appeared when copied to Excel is a correct representation of what was actually supposed to have been written to the debug log.  I know that sometimes things can be interpreted differently based on font and/or encoding settings between applications.
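
One way to take fonts and application settings out of the picture is to list the code points directly instead of looking at rendered glyphs. This is a minimal Python sketch; the function name is hypothetical, and how the log file should be decoded before passing a line in is an open question, since its encoding is not stated here.

```python
import unicodedata


def non_ascii_report(line):
    """List (position, code point, Unicode name) for each non-ASCII character."""
    report = []
    for index, ch in enumerate(line):
        if ord(ch) > 127:
            report.append(
                (index, "U+%04X" % ord(ch), unicodedata.name(ch, "UNNAMED"))
            )
    return report
```

A common cause of this symptom is 8-bit text being decoded as UTF-16, which turns pairs of ordinary letters into CJK-looking characters. Whether that is what happened with this log entry has not been established.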

I have searched the registry for all instances of [C:\WINDOWS\SYSVOL\staging] in an attempt to isolate the key in question.  I did not find anything that stood out as immediately untoward.
I also looked through the file system at the path in question and found nothing beyond [C:\Windows\Sysvol\Staging\Domain] - (the [domain] folder is empty).  I checked using the command line for anything that may have been hidden from explorer, (Yes, I have hidden and system files showing via windows explorer File --> View Settings.), and the cli also showed nothing.

Digging even deeper, I ran the old Sysinternals RootkitRevealer (for what it's worth anymore) and didn't see anything related to the log entry or a similar path at all.  I understand that some default NTFS metadata items may or may not appear depending on RKR's settings.  I also took a baseline from another Win 2k3 server to compare against what I was finding.  Again, nothing really stood out to me as being errant.

So, I guess my next question for you gurus out there is:  What should I be looking FOR in the debug logs for the Windows Agent that might point me in the direction of my failure?  And/or:  If the above item (containing the Asian-language characters) IS what I should be concerned with, can you suggest how I might further track it down in the registry and/or file-system?

      Again, Thank you in advance...

-M

Colin Weaver's picture

We have had issues with double-byte (Cyrillic/Asian/Greek etc.) characters in the past.

However I think at this point you probably need to log a formal support case for us to properly review the logs.

M Strong's picture

Colin, I've opened a support case as per your recommendation. As soon as someone touches base with me I will supply the debug log from the failed backup job and update this thread if/when I receive any info back about the cause/solution of the issue.

Case# 03990971 - in case you are interested.

Thank you,

-M

Jivo's picture

I am looking at the same error trying to back up a Windows 2008 x64 domain controller. Job Log reports that 18 System State components were backed up, and 1 was skipped. I presume that it's a WMI component since the error (V-79-57344-33928 - Access Denied ....) appears below "Shadow Copy Component WMI"

I am pretty frustrated because the most basic "complete backup" of one server turned into a two-month debugging adventure for no apparent reason.

I am looking forward to whatever you will receive from tech support.