Monitoring the deduplication rate

Article: HOWTO36313  |  Created: 2010-11-01  |  Updated: 2011-01-17  |  Article URL: http://www.symantec.com/docs/HOWTO36313
Article Type: How To



The deduplication rate is the percentage of data that was already stored and therefore is not stored again.

NetBackup reports the deduplication rate in the following places:

  • The Deduplication Rate column of the Activity Monitor Jobs tab.

  • The Job Details dialog box.

    The Detailed Status tab shows detailed information, including the deduplication rate.

    The information shown depends on whether the job used media server deduplication or client-side deduplication, as follows:

    • For media server deduplication, the Detailed Status tab shows the deduplication rate on the server that performed the deduplication. The following job details excerpt shows details for a client for which Server_A deduplicated the data (the dedup field shows the deduplication rate):

      10/6/2010 10:02:09 AM - Info Server_A(pid=30695) 
        StorageServer=PureDisk:Server_A; Report=PDDO Stats 
        for (Server_A):   scanned: 30126998 KB, stream rate: 
        162.54 MB/sec, CR sent: 1720293 KB, dedup: 94.3%, cache 
        hits: 214717 (94.0%)

      The other fields in the excerpt (scanned, stream rate, CR sent, and cache hits) also show deduplication information. For the field descriptions, see Table: Deduplication activity field descriptions.

    • For client-side deduplication jobs, the Detailed Status tab shows two deduplication rates. The first deduplication rate is always for the client data. The second deduplication rate is for the disk image header and True Image Restore information (if applicable). That information is always deduplicated on a server; typically, deduplication rates for that information are zero or very low. The following job details excerpt shows the two rates:

      10/8/2010 11:54:21 PM - Info Server_A(pid=2220) 
        Using OpenStorage client direct to backup from client 
        Client_B to Server_A  
      10/8/2010 11:58:09 PM - Info Server_A(pid=2220) 
        StorageServer=PureDisk:Server_A; Report=PDDO Stats for 
        (Server_A): scanned: 3423425 KB, stream rate: 200.77 
        MB/sec, CR sent: 122280 KB, dedup: 96.4%, cache hits: 
        49672 (98.2%)
      10/8/2010 11:58:09 PM - Info Server_A(pid=2220) Using 
        the media server to write NBU data for backup 
        Client_B_1254987197 to Server_A
      10/8/2010 11:58:19 PM - Info Server_A(pid=2220) 
        StorageServer=PureDisk:Server_A; Report=PDDO Stats 
        for (Server_A): scanned: 17161 KB, stream rate: 1047.42 
        MB/sec, CR sent: 17170 KB, dedup: 0.0%, cache hits: 0 (0.0%)
      the requested operation was successfully completed(0)

  • The bpdbjobs command shows the deduplication rate if you configure a BPDBJOBS_COLDEFS entry for DEDUPRATIO in the bp.conf file on the media server on which you run the command.

    See the NetBackup Administrator's Guide for UNIX and Linux, Volume I.
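If you copy job details text out of the Detailed Status tab, the deduplication fields can be pulled out with a short script. The following Python sketch is illustrative only: the function name parse_pddo_stats and the dictionary keys are hypothetical, and the pattern is inferred solely from the excerpts in this article, so other NetBackup releases may format the line differently.

```python
import re

# Pattern inferred from the job details excerpts in this article (assumption:
# the field order and units are the same in the text you paste in).
PDDO_STATS = re.compile(
    r"scanned:\s*(?P<scanned_kb>\d+)\s*KB,\s*"
    r"stream rate:\s*(?P<stream_rate_mb_s>[\d.]+)\s*MB/sec,\s*"
    r"CR sent:\s*(?P<cr_sent_kb>\d+)\s*KB,\s*"
    r"dedup:\s*(?P<dedup_pct>[\d.]+)%,\s*"
    r"cache hits:\s*(?P<cache_hits>\d+)\s*\((?P<cache_hit_pct>[\d.]+)%\)"
)


def parse_pddo_stats(text):
    """Return one dict per 'PDDO Stats' report found in the given text."""
    # Job details wrap long lines, so collapse all whitespace runs first.
    flat = " ".join(text.split())
    reports = []
    for match in PDDO_STATS.finditer(flat):
        reports.append({
            "scanned_kb": int(match["scanned_kb"]),
            "stream_rate_mb_s": float(match["stream_rate_mb_s"]),
            "cr_sent_kb": int(match["cr_sent_kb"]),
            "dedup_pct": float(match["dedup_pct"]),
            "cache_hits": int(match["cache_hits"]),
            "cache_hit_pct": float(match["cache_hit_pct"]),
        })
    return reports


# Media server deduplication excerpt from this article, with its line wrapping.
SAMPLE = """
10/6/2010 10:02:09 AM - Info Server_A(pid=30695)
  StorageServer=PureDisk:Server_A; Report=PDDO Stats
  for (Server_A):   scanned: 30126998 KB, stream rate:
  162.54 MB/sec, CR sent: 1720293 KB, dedup: 94.3%, cache
  hits: 214717 (94.0%)
"""

for report in parse_pddo_stats(SAMPLE):
    print(report["dedup_pct"], report["scanned_kb"], report["cr_sent_kb"])
```

For a client-side deduplication job, the same function returns two reports: the first for the client data and the second for the disk image header and True Image Restore information.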

Many factors affect deduplication performance.

See About deduplication performance

Table: Deduplication activity field descriptions

Field

Description

cache hits

The percentage of time that the local fingerprint cache contained a record of the segment, so that the deduplication plug-in did not have to query the database about the segment.

CR sent

The amount of data that is sent from the deduplication plug-in to the component that stores the data. (In NetBackup, the NetBackup Deduplication Engine stores the data. In PureDisk, a content router stores the data.)

If the storage server deduplicates the data, it does not travel over the network. The deduplicated data travels over the network when the deduplication plug-in runs on a computer other than the storage server, as follows:

  • On a NetBackup client that deduplicates its own data (client-side deduplication).

  • On a fingerprinting media server that deduplicates the data. The plug-in on the fingerprinting server sends the data to the storage server, which writes it to a Media Server Deduplication Pool.

  • On a media server that then sends it to a PureDisk environment for storage. (In NetBackup, a PureDisk Storage Pool represents the storage of a PureDisk environment.)

dedup

The percentage of data that was already stored. That data is not stored again.

scanned

The amount of data that the deduplication plug-in scanned.

stream rate

The speed of the scan: the amount of data that is scanned divided by how long the scan takes. The job details report it in MB/sec.
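As a rough cross-check of the fields in this table, the dedup values in the job details excerpts earlier in this article are consistent with the share of scanned data that did not have to be sent for storage, that is, 1 - (CR sent / scanned). This relationship is an observation from those excerpts, not a formula that the product documentation states. The following Python sketch shows it; the function name dedup_pct_from_counters is hypothetical.

```python
def dedup_pct_from_counters(scanned_kb, cr_sent_kb):
    """Estimate the dedup percentage from the scanned and CR sent values.

    Assumption (inferred from this article's excerpts, not documented):
    dedup is approximately 1 - CR sent / scanned, floored at zero.
    """
    if scanned_kb <= 0:
        raise ValueError("scanned_kb must be positive")
    saved_fraction = 1.0 - cr_sent_kb / scanned_kb
    # CR sent can slightly exceed scanned; the excerpt reports 0.0% then.
    return round(max(0.0, saved_fraction) * 100.0, 1)


# (scanned KB, CR sent KB, dedup % reported) taken from the excerpts.
EXCERPTS = [
    (30126998, 1720293, 94.3),  # media server deduplication
    (3423425, 122280, 96.4),    # client-side deduplication, client data
    (17161, 17170, 0.0),        # image header and True Image Restore data
]

for scanned, sent, reported in EXCERPTS:
    estimate = dedup_pct_from_counters(scanned, sent)
    print(f"scanned={scanned} KB, CR sent={sent} KB, "
          f"estimated={estimate}%, reported={reported}%")
```

All three estimates match the reported values to one decimal place. A large gap between the estimate and the reported rate in your own job details would suggest that the assumption does not hold for that job.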

See About deduplication server requirements

See About client deduplication requirements and limitations


Legacy ID: v28802759_v47623180

