| Article: TECH64775 | Created: 2008-01-04 | Updated: 2009-01-20 | Article URL: http://www.symantec.com/docs/TECH64775 |
Problem
A potential for data loss has been discovered when a NetBackup PureDisk content router meets or exceeds its "Low Space Threshold" and remains above this threshold. In some cases, the backup job may complete successfully (exit status 0), even though data was not completely written due to the content router being in this state.
Solution
Introduction:
A potential for data loss has been discovered in cases where a PureDisk content router meets or exceeds its "LowSpaceThreshold" and remains at or above this threshold. The threshold is set to 85% disk utilization by default.
The "LowSpaceThreshold" is configured in the contentrouter.cfg configuration file on the PureDisk Content Router. When the utilized disk space on the content router meets or exceeds this level, the content router will no longer accept backups and will send abort messages to clients to prevent further backup data from being sent.
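The threshold comparison described above can be sketched as a simple utilization check. This is a minimal illustration, not PureDisk's own implementation: the function names are hypothetical, the 85% figure is the default stated in this article, and any path passed to `check_path` (for example `/Storage`, inferred from the log location mentioned later in this article) is an assumption about the local layout.

```python
import shutil

# Default stated in this article; the real value lives in contentrouter.cfg.
DEFAULT_LOW_SPACE_THRESHOLD = 85.0  # percent of disk utilization


def utilization_percent(used_bytes, total_bytes):
    """Return used space as a percentage of total space."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def meets_or_exceeds_threshold(used_bytes, total_bytes,
                               threshold=DEFAULT_LOW_SPACE_THRESHOLD):
    """True when utilization meets or exceeds the threshold."""
    return utilization_percent(used_bytes, total_bytes) >= threshold


def check_path(path, threshold=DEFAULT_LOW_SPACE_THRESHOLD):
    """Check the file system that holds `path` (hypothetical helper)."""
    usage = shutil.disk_usage(path)
    return meets_or_exceeds_threshold(usage.used, usage.total, threshold)
```

Running `check_path` on the content router's storage file system from a scheduled job would give early warning before the threshold is reached. The figure computed here may differ slightly from what the content router itself reports.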
What is Affected:
The following versions of PureDisk are affected, on all supported platforms:
- NetBackup PureDisk Remote Office Edition 6.1, 6.2, 6.2.x (Scenario 1 only)
- NetBackup PureDisk Remote Office Edition 6.5, 6.5.0.x
How to Determine if Affected:
Data loss has been known to occur if ALL conditions of either scenario are met.
Scenario 1: A full Content Router can result in data loss in files smaller than or equal to the segment size. (The default segment size is 128 KB for files and directories.)
- The PureDisk content router is running one of the versions mentioned above in a supported configuration.
- The Content Router meets or exceeds the LowSpaceThreshold.
- The Client attempts to write a small segment of data to the content router (for example, a file that is smaller than the segment size).
- These smaller segments are sent to the content router, but not committed properly to the database. Subsequent database maintenance operations will groom this data.
Scenario 2: The client's pdbackup process does not deal properly with the abort message from the Content Router. (Affects 6.5 and 6.5.0.x only.)
- The PureDisk Client is running one of the versions mentioned above in a supported configuration.
- The Content Router that the client is writing to meets or exceeds the LowSpaceThreshold.
- The Client sends data to the content router. The Content Router has met or exceeded its space threshold and sends an "abort" message to the client.
- The Client's pdbackup process misinterprets the abort message.
- As a result, there are records of data on the content router that do not actually exist.
If Scenario 2 occurs, all file and folder information sent to the content router is potentially affected. If it is probable that the LowSpaceThreshold is met or exceeded on the content router, it is strongly advised to address this immediately by applying 6.5.1 when available, or to ensure that sufficient disk space is available to prevent the threshold from being exceeded.
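The version-to-scenario mapping above can be expressed as a small lookup. The sketch below is only an illustration of the list in this article: the function names are hypothetical, the input is assumed to be a purely numeric dotted version string, and it reports which scenarios a version is exposed to, not whether the other conditions (such as the threshold being met) are actually present.

```python
def parse_version(version):
    """Split a dotted version string such as '6.5.0.2' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def applicable_scenarios(version):
    """Return the set of scenario numbers listed in this article for a version."""
    parts = parse_version(version)
    # 6.1, 6.2, and 6.2.x are listed as affected by Scenario 1 only.
    if parts == (6, 1) or parts[:2] == (6, 2):
        return {1}
    # 6.5 and 6.5.0.x are listed as affected by both scenarios.
    if parts == (6, 5) or parts[:3] == (6, 5, 0):
        return {1, 2}
    # Anything else (for example 6.5.1) is not listed as affected.
    return set()
```

An empty result means only that the version is not listed in this article; it does not guarantee that a given configuration is unaffected.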
Symptoms Occurring on Backup:
If the issue described in Scenario 2 occurs, the following message will appear in the /Storage/log/spoold log on the content router:
data store failed: could not spool object
If the LowSpaceThreshold is exceeded on a Content Router, the following message will appear in the /Storage/log/spoold log on the content router:
Could not write data to data store, error: spool directory out of space
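The two backup-time messages above can be searched for with a short script. This is a rough sketch under stated assumptions: the function names and result keys are hypothetical, matching is a case-insensitive substring search because the exact surrounding log format (timestamps, severity fields) is not documented here, and the caller must supply the actual log file path, since /Storage/log/spoold may be a directory of log files on a given system.

```python
# Message fragments quoted in this article, lowercased for matching.
SPOOL_FAILURE = "data store failed: could not spool object"
OUT_OF_SPACE = "spool directory out of space"


def scan_spoold_lines(lines):
    """Count occurrences of the two messages in an iterable of log lines."""
    counts = {"spool_failure": 0, "out_of_space": 0}
    for line in lines:
        text = line.lower()
        if SPOOL_FAILURE in text:
            counts["spool_failure"] += 1
        if OUT_OF_SPACE in text:
            counts["out_of_space"] += 1
    return counts


def scan_spoold_file(path):
    """Scan one log file; `path` is whatever file the administrator chooses."""
    with open(path, "r", errors="replace") as handle:
        return scan_spoold_lines(handle)
```

A nonzero `spool_failure` count suggests that Scenario 2 may have occurred, and a nonzero `out_of_space` count indicates that the threshold was exceeded at some point covered by the log.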
Symptoms Occurring on Restore:
Files that experience this issue will be seen in the user interface for restore, but will exhibit an error when an attempt is made to restore them.
In the rare case that this issue occurs, restores from an affected backup will proceed up to the point where the issue occurred, then fail. In the details of the restore, "no such object" or missing segment messages such as the following will be seen:
Failed to restore /tmp/root/tree_10000/_54/12 (at line 904 in input) (no such object)
80508d2efecaf7398ff50241a9b11b1f: get request failed for segment <segment id> (0 out of 2 segments processed (unknown error)
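To list which files failed during a restore, the "Failed to restore" messages can be extracted from the saved restore details. The sketch below is an assumption-laden illustration: the pattern is derived only from the single sample message shown above, the function name is hypothetical, and it assumes each message appears on one line with single spaces between fields.

```python
import re

# Pattern inferred from the one sample message in this article.
RESTORE_FAILURE = re.compile(
    r"Failed to restore (?P<path>.+?) \(at line (?P<line>\d+) in input\) "
    r"\((?P<reason>[^)]*)\)"
)


def failed_restores(detail_text):
    """Return (path, input_line, reason) tuples for each failure message."""
    return [
        (match.group("path"), int(match.group("line")), match.group("reason"))
        for match in RESTORE_FAILURE.finditer(detail_text)
    ]
```

Filtering the result for the reason "no such object" gives a starting list of files to check against other backups.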
Formal Resolution:
To resolve this issue, apply NetBackup PureDisk Remote Office Edition 6.5.1 as soon as it becomes available. This is currently scheduled for release in Q1 of calendar year 2009.
Please note that the formal resolution will prevent this issue from occurring to future backups, but cannot recover missing data from affected backups. Note also that this release will not prevent disk or network errors from occurring, but will take additional actions in failing the backup job at the moment the above conditions are present. It is strongly recommended to apply the PureDisk 6.5.1 patch as soon as possible to ensure the issues described in this document are not encountered.
Workaround:
A direct workaround for this issue is not currently available. However, it is highly recommended that the Server Requirement and Capacity Planning sections be referenced in the PureDisk Best Practices guide to help prevent the Content Routers from exceeding the configured space usage:
http://seer.entsupport.symantec.com/docs/303544.htm
If it is believed that the PureDisk configuration may be affected by this issue, the 6.5.1 update should be applied when available. If 6.5.1 is not yet available or cannot be applied, it is recommended to contact Symantec Enterprise Technical Support, referencing the number of this article (TechNote 313465).
Symantec Strongly Recommends the Following Best Practices:
1. Always perform a full backup prior to and after any changes to your environment.
2. Always make sure that your environment is running the latest version and patch level.
3. Ensure that the content router is allotted sufficient disk space and check system logs regularly to ensure that no disk or system level issues are present.
How to Subscribe to Software Alerts:
If you have not received this TechNote from the Symantec Email Notification Service as a Software Alert, please subscribe at the following link:
http://maillist.entsupport.symantec.com/subscribe.asp
| Source | Value | Description |
| ETrack | 1442824 | ETrack (PureDisk) 1442824: pdbackup does not deal properly with abort message from the CR. |
| ETrack | 1442788 | ETrack (PureDisk) 1442788: Full CR can result in data loss for files with a size less than or equal to the segment size. |
Legacy ID
313465