BUG REPORT: bpbkar32.exe faults on the inactive node of a Windows 2008 cluster

Article:TECH128805  |  Created: 2010-01-14  |  Updated: 2012-12-11  |  Article URL http://www.symantec.com/docs/TECH128805
Article Type
Technical Solution

Product(s)

Environment

Issue



bpbkar32.exe faults on the inactive node of a Windows 2008 cluster causing a Status 13.  In some NetBackup versions, bpbkar32.exe may not fault, but the snapshot may fail causing a Status 156.


Error



EXIT STATUS 42 - network read failed
EXIT STATUS 13 - file read failed
Status 131 - Connection reset by peer


Solution



ISSUE:
BUG REPORT: bpbkar32.exe faults on the inactive node of a Windows 2008 cluster or the job may end with Status 156.

ENVIRONMENT/CONDITIONS:
This issue is new to Windows 2008 and Windows 2008 R2 NetBackup Clients.
 
The sequence of events that causes this error is:
1.  On a NetBackup Client, an application installation creates a Service with a "Path to executable" that points to a non-local volume.
2.  The NetBackup Client is a node in a cluster.
3.  When the machine is the Passive Node in the cluster, the non-local volume that the Service points to is no longer present on the Operating System.
4.  A Shadow Copy Components backup is run on the Passive Node in the cluster.
 
New in Windows 2008 and Windows 2008 R2, every "Path to Executable" for every Windows Service is cataloged into the VSS System Writer and is therefore backed up by Shadow Copy Components.  When a Shadow Copy Components backup starts, all of the VSS Writers are interrogated to compile a list of Volumes on which files classified into Shadow Copy Components reside.  The Microsoft Volume Shadow Copy service then takes a Snapshot of those volumes.
 
The Microsoft Volume Shadow Copy service is unaware that the node is Passive and that the drive letter for the Service is not present.  It attempts to snap the absent volume and an unexpected error occurs, which causes bpbkar32.exe to produce an Application Fault.
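
The condition described above can be checked before a backup runs. The following is a minimal Python sketch, not part of NetBackup, that takes a mapping of service names to their "Path to executable" strings and reports the services whose drive letter is not among the volumes currently present. The function names and the service names in the sample are hypothetical, and the sketch assumes each path starts with a drive letter (optionally preceded by a quote). On a real client the paths would typically come from the ImagePath value of each service under HKLM\SYSTEM\CurrentControlSet\Services.

```python
import re

# Matches an optional opening quote followed by a drive letter, e.g. "F:\app\...
# Paths without a drive letter (such as %SystemRoot%\...) are skipped,
# on the assumption that they resolve to the local system volume.
_DRIVE_RE = re.compile(r'^\s*"?([A-Za-z]):[\\/]')


def drive_of(image_path):
    """Return the drive of a service executable path as 'X:', or None."""
    match = _DRIVE_RE.match(image_path)
    return match.group(1).upper() + ":" if match else None


def services_on_missing_volumes(services, present_drives):
    """Return {service_name: drive} for services whose drive is absent.

    services       -- dict of service name -> "Path to executable" string
    present_drives -- iterable of drives currently present, e.g. ["C:", "D:"]
    """
    present = {d.upper() for d in present_drives}
    missing = {}
    for name, image_path in services.items():
        drive = drive_of(image_path)
        if drive is not None and drive not in present:
            missing[name] = drive
    return missing


if __name__ == "__main__":
    # Hypothetical service names. The E: path is taken from the evidence
    # section of this article; the C: path is a typical local service path.
    sample = {
        "ExampleClusteredApp": r"e:\program files (x86)\sap businessobjects"
                               r"\financial consolidation\ctcontroller.exe",
        "ExampleLocalService": r"C:\Windows\system32\svchost.exe -k netsvcs",
    }
    # On the Passive Node only the local volume is present.
    print(services_on_missing_volumes(sample, ["C:"]))
```

Any service reported on the Passive Node is a candidate for the fault described here, because its volume will be requested for the snapshot although it is absent.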
 
Services were not handled this way in Windows 2003.  It is likely that Application Vendors are unaware of the changes in Windows 2008 and Windows 2008 R2 which contribute to this error.  Application Vendors should be notified that they need to change their installation steps to avoid installing the applications on Quorum volumes.  Applications should be installed on local volumes, and only the shared data should be placed on the Quorum volume.
 
Even the native Windows 2008 and Windows 2008 R2 System State backup utility wbadmin.exe is subject to this change in the OS handling of Service "Path to Executable" files by VSS.
 
The hotfix provided by Microsoft in KB article 980794 only works with the native backup utility wbadmin and does not work with third-party backup utilities.


EVIDENCE:
Application Event Logs:
Tue Mar 2 2010 21:24:52 Application Error E1000 Faulting application  bpbkar32.exe, version 6.5.2009.430, time stamp 0x49fadc7b, faulting module bpbkar32.exe, version 6.5.2009.430, time stamp 0x49fadc7b, exception code 0xc0000005, fault offset 0x00000000000ce757,

Last messages in the bpbkar log before the fault (6.5.4):
14:15:31.326: [2260.6480] <2> ov_log::V_GlobalLog: INF -   o: C:\Windows\system32\config\TxR\AppData\Local\Microsoft\Outlook\*.ost
14:15:33.604: [2260.6480] <2> ov_log::V_GlobalLog: INF - Status E_FS_VOLUME_NOT_FOUND (0xE0000352) for object 'System Files'.  Faild to resolve volume path name for 'f:\app\monitoring\tivoli\lcf\bin\w32-ix86\mrt\' in SHADOW::SetSelectedForBackup
14:15:33.604: [2260.6480] <2> ov_log::V_GlobalLog: INF - Status E_FS_VOLUME_NOT_FOUND (0xE0000352) returned setting selected for backup for object System Files in SHADOW::AddToSnapshotSet
14:15:33.604: [2260.6480] <2> ov_log::V_GlobalLog: INF -   AD:Trouble adding to set - Status E_FS_VOLUME_NOT_FOUND (0xE0000352) in SystemState::AddToSnapshotSet:573
14:15:33.604: [2260.6480] <16> dos_backup::V_InitializeSystemState:
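
To locate the affected volume in a large bpbkar log, the E_FS_VOLUME_NOT_FOUND messages can be scanned for the path that failed to resolve. The Python sketch below is illustrative only: the function names are hypothetical, and the pattern is based solely on the 6.5.4 excerpt above (including the spelling "Faild" as it appears in the log), so other NetBackup versions may word the message differently.

```python
import re

# Pattern derived from the 6.5.4 log excerpt in this article. The misspelling
# "Faild" is how the message appears in that excerpt.
_UNRESOLVED_RE = re.compile(
    r"E_FS_VOLUME_NOT_FOUND \(0x[0-9A-Fa-f]+\).*?"
    r"Faild to resolve volume path name for '([^']+)'"
)


def unresolved_paths(log_lines):
    """Return the paths that bpbkar reported as unresolvable, in log order."""
    paths = []
    for line in log_lines:
        match = _UNRESOLVED_RE.search(line)
        if match:
            paths.append(match.group(1))
    return paths


def unresolved_drives(log_lines):
    """Return the distinct drives ('X:') of the unresolved paths, in log order."""
    drives = []
    for path in unresolved_paths(log_lines):
        if len(path) >= 2 and path[0].isalpha() and path[1] == ":":
            drive = path[:2].upper()
            if drive not in drives:
                drives.append(drive)
    return drives


if __name__ == "__main__":
    line = (
        r"14:15:33.604: [2260.6480] <2> ov_log::V_GlobalLog: INF - Status "
        r"E_FS_VOLUME_NOT_FOUND (0xE0000352) for object 'System Files'.  "
        r"Faild to resolve volume path name for "
        r"'f:\app\monitoring\tivoli\lcf\bin\w32-ix86\mrt\' "
        r"in SHADOW::SetSelectedForBackup"
    )
    print(unresolved_drives([line]))
```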

Last messages in the bpbkar log before the fault (6.5.5):
3:11:01.265 PM: [1580.2188] <2> ov_log::V_GlobalLog: INF -    Adding c:\windows\syswow64\en-us\propsys.dll.mui to the backup file list
3:11:01.265 PM: [1580.2188] <2> ov_log::V_GlobalLog: INF -    Adding c:\windows\syswow64\KBDKHMR.DLL to the backup file list
3:11:01.265 PM: [1580.2188] <2> ov_log::V_GlobalLog: INF -    Adding c:\windows\syswow64\odbcint.dll to the backup file list
3:11:01.265 PM: [1580.2188] <2> ov_log::V_GlobalLog: INF -    Adding c:\windows\syswow64\XpsRasterService.dll to the backup file list

The BEDS XML shows that the next file on the list belongs to a non-cluster application installed on a Cluster shared drive:
<FILE_LIST path="c:\windows\syswow64" filespec="xpsrasterservice.dll" filespecBackupType="3855" />
<FILE_LIST path="e:\program files (x86)\sap businessobjects\financial consolidation" filespec="ctcontroller.exe" filespecBackupType="3855" />
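
The drives referenced by such FILE_LIST entries can be listed with a few lines of Python, which makes it easier to see which volumes the snapshot will need. This is a sketch under stated assumptions: only the two-line excerpt above is known, so the wrapping root element and the function name are hypothetical, and the layout of the full BEDS XML file may differ.

```python
import xml.etree.ElementTree as ET


def file_list_drives(fragment):
    """Return the sorted drives ('X:') referenced by FILE_LIST path attributes."""
    # The excerpt has no single root element, so wrap it in a hypothetical one
    # to make it parseable as a standalone document.
    root = ET.fromstring("<root>" + fragment + "</root>")
    drives = set()
    for node in root.iter("FILE_LIST"):
        path = node.get("path", "")
        if len(path) >= 2 and path[0].isalpha() and path[1] == ":":
            drives.add(path[:2].upper())
    return sorted(drives)


if __name__ == "__main__":
    excerpt = r'''
    <FILE_LIST path="c:\windows\syswow64" filespec="xpsrasterservice.dll" filespecBackupType="3855" />
    <FILE_LIST path="e:\program files (x86)\sap businessobjects\financial consolidation" filespec="ctcontroller.exe" filespecBackupType="3855" />
    '''
    print(file_list_drives(excerpt))
```

Any drive in the result that is not present on the Passive Node is a volume that VSS will fail to snap.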


SOLUTION/WORKAROUND:
FIXED IN:  For NetBackup 7.1, a fix for this issue is included in Hotfix NB_7.1_ET2256525_2, available for download through TECH156114 (linked below).
FIXED IN:  This issue is formally resolved in 7.1.0.1.
!!! WARNING !!!: This issue returned in 7.5 GA.  See article TECH199089 (linked below) for details.

A workaround is to reinstall the clustered application to a local drive letter which does not disappear when the cluster node becomes passive.  Configure the application so that only the cluster shared data is on the quorum volume.
 
Symantec can provide a binary for this issue which does two things:
1. Prevents bpbkar32.exe from producing an application fault when VSS attempts to snap an absent volume.  Instead, bpbkar32.exe will end gracefully and the job will complete with a Status 156 (snapshot error encountered) in Activity Monitor.
 
2. The binary will enable the user to type in a drive letter or drive letters which are "OK to fail" with snapshot errors during Shadow Copy Components backup jobs.   The backup job will attempt to snap all volumes indicated by the VSS writers, and if a volume produces a snapshot error, the backup of System State will proceed.   This way, if the node happens to be the Active Node in the cluster, and the volume is present, it will get successfully snapped.   If the node is the Passive Node in the cluster and the volume is absent, Shadow Copy Components backup will proceed past this condition.
 
Syntax:
w2koption -backup -ignore_unresolved_volumes <volume:>
 
Example if drive letter F is missing:
w2koption -backup -ignore_unresolved_volumes F:
 
Example if drive letters F and G are missing:
w2koption -backup -ignore_unresolved_volumes F:G:
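
When the command has to be generated for several nodes, the argument can be assembled from a list of drive letters. The Python sketch below only builds the command string shown in the examples above and does not run it. The function name is hypothetical, and the concatenated form (F:G:) simply follows the two-drive example; this article does not state whether other separators are accepted.

```python
def ignore_unresolved_volumes_command(drive_letters):
    """Build the w2koption command line for the given drive letters.

    drive_letters -- iterable such as ["F", "g:"]; duplicates are dropped.
    """
    volumes = []
    for letter in drive_letters:
        letter = letter.strip().rstrip(":").upper()
        # Accept exactly one ASCII letter per entry.
        if len(letter) != 1 or not ("A" <= letter <= "Z"):
            raise ValueError("not a drive letter: %r" % letter)
        if letter + ":" not in volumes:
            volumes.append(letter + ":")
    if not volumes:
        raise ValueError("at least one drive letter is required")
    # Volumes are concatenated without a separator, as in the example above.
    return "w2koption -backup -ignore_unresolved_volumes " + "".join(volumes)


if __name__ == "__main__":
    print(ignore_unresolved_volumes_command(["F", "G"]))
```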
 
 
WARNING: This methodology is not without risk.  NetBackup is unable to analyze why a volume is absent.  If the node is the Active Node in the cluster and the volume should rightfully exist - but does not for whatever reason, NetBackup will proceed with the Shadow Copy Components backup as specified by the user - and data which should have been backed up will not be backed up.   The absolute best solution for this condition is for the end-customer to collaborate with the problematic application vendor to re-architect the application to install to a local volume.

Symantec Corporation has acknowledged that the above-mentioned issue is present in the current version(s) of the product(s) mentioned at the end of this article. Symantec Corporation is committed to product quality and satisfied customers.

Please refer to the maintenance pack readme or contact NetBackup Enterprise Support to confirm this issue (ET1990870, ET2015296) was included in the maintenance pack.  

As future maintenance packs and release updates are released, please visit the following link for download and readme information:  
http://www.symantec.com/business/support/index?page=landing&key=15143


 

Supplemental Materials

Source: ETrack
Value: 2118052
Description: bpbkar32.exe producing Application Fault during dfsr backups on Windows 2008 and 2008R2 Core

Value: 1990870
Description: bpbkar faulting on inactive node of cluster with set1926700 installed

Source: ETrack
Value: 2144264
Description: ET 2015296 causes backup of SCC to end with Status 156

Value: 42
Description: network read failed



Legacy ID: 350325



