Thanks to everyone who supported my earlier blog on Frequently Asked Questions on NetBackup Accelerator. I have received a number of follow-up questions (as comments) on that blog, and I have been answering them as time permits. Two recent questions needed some elaboration, so I thought it better to cover them in a new blog as Part II.
Can I use NetBackup Accelerator to back up Network Attached Storage (NAS) devices?
Yes, you can use NetBackup Accelerator to protect any NAS device that supports the NFS and/or CIFS protocols. You need a mount host that supports NFS/CIFS where the NAS volume can be mounted. Note that a given volume from the NAS device must be mounted on the same host, and at the same mount point, for every backup run in order to take advantage of NetBackup Accelerator.
This mount host can be a NetBackup client or NetBackup media server. NetBackup 5200 series appliances can also function as the mount host.
You can also scale out NAS protection using multiple mount hosts. If you have a large NAS device with multiple volumes, you can distribute the volumes across several mount hosts and run the backups concurrently.
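As a rough illustration of distributing volumes across mount hosts, here is a minimal Python sketch that assigns the largest volumes first to whichever host currently carries the least data. The volume names, sizes, and host names are hypothetical, and this is a planning aid I made up for illustration, not a NetBackup feature.

```python
import heapq

def assign_volumes(volume_sizes_gb, mount_hosts):
    """Greedy plan: largest volume first, onto the least-loaded mount host."""
    # Heap of (assigned_gb, host); ties are broken by host name.
    heap = [(0, host) for host in sorted(mount_hosts)]
    heapq.heapify(heap)
    plan = {host: [] for host in mount_hosts}
    # Sort by size descending, then by name so the plan is deterministic.
    ordered = sorted(volume_sizes_gb.items(), key=lambda kv: (-kv[1], kv[0]))
    for volume, size_gb in ordered:
        load, host = heapq.heappop(heap)
        plan[host].append(volume)
        heapq.heappush(heap, (load + size_gb, host))
    return plan

# Hypothetical volumes (sizes in GB) and mount hosts.
volumes = {"/vol/a": 400, "/vol/b": 300, "/vol/c": 200, "/vol/d": 100}
plan = assign_volumes(volumes, ["mounthost1", "mounthost2"])
for host, assigned in plan.items():
    print(host, assigned)
```

Since a volume has to stay on the same host and mount point across backup runs, a plan like this should be computed once and kept, not recomputed before every backup.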
Is an NFS mount host or a CIFS mount host better for NetBackup Accelerator?
From NetBackup Accelerator’s perspective, it does not matter which protocol you use. The real question is whether the server implementation (on the NAS device) and the client implementation (on the mount host) perform better for NFS or for CIFS. If your environment supports both protocols, I recommend experimenting with both and deciding what works better for you. There may also be other site-specific or environmental factors that influence your choice between NFS and CIFS.
Is NetBackup for NDMP or NetBackup Accelerator using a mount host better for protecting a NAS device?
To answer this question, we need to look at the fundamentals of NDMP. As you may already know, backing up NAS devices by mounting NFS volumes on a backup server was quite painful in the early days. Contrary to popular myth, tape drives are much faster than disk drives, and a file system backup through NFS could not feed the drive fast enough. The result was shoe shining: the tape drive repeatedly stops and repositions while it waits for data.
The NDMP protocol was originally developed so that the control path and the data path from primary (NAS) to secondary (backup) storage could be separated, and so that NAS vendors could build their own data-streaming servers (NDMP data servers) within the device. For example, NetApp uses the ‘dump’ format by default, while Celerra uses ‘tar’. Each has its pluses and minuses; many UNIX/Linux admins would agree that dump is generally faster than tar. Tape drives can be attached directly to the NAS device (or to other systems that implement the NDMP tape server function, e.g. a NetBackup media server), while the backup server owns the control and cataloging functions.
The point here is that although NDMP is a standard protocol used by NAS devices, its implementation varies from vendor to vendor. Hence it is difficult to make a blanket statement about the performance of backing up NAS devices using NetBackup for NDMP vs. NetBackup Accelerator using NFS/CIFS.
Now let us talk about a specific benchmark we ran in-house. Here we compare the NDMP implementation in NetApp with NetBackup Accelerator based backups using a Linux NFS mount host. Take a look at the graphical representation of the results below. The vertical axis is the number of minutes taken to finish the backup.
The Data set:
Average size of files = 529kB
Number of files = 1.7 million
Total workload ~= 900GB
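As a quick sanity check on the data set above, the file count times the average file size should land close to the stated total. The short Python sketch below does that arithmetic, assuming decimal units (1 kB = 1000 bytes, 1 GB = 10^9 bytes), which is my assumption and not something stated in the benchmark.

```python
# Data set figures from the benchmark description.
avg_file_size_bytes = 529 * 1000   # 529 kB, assuming decimal kilobytes
file_count = 1_700_000             # 1.7 million files

total_bytes = avg_file_size_bytes * file_count
total_gb = total_bytes / 1e9       # assuming decimal gigabytes

print(f"Estimated workload: {total_gb:.1f} GB")  # prints "Estimated workload: 899.3 GB"
```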
The blue cones represent NetBackup for NDMP backups to a tape drive directly attached to the filer, and the red cones represent NetBackup for NDMP backups sent to a deduplication pool attached to a media server. These tests are given as baselines.
The green cones represent NetBackup Accelerator backups using an NFS mount.
Standard Disclaimer: This is an internal benchmark in Symantec labs. Your mileage may vary depending on your environment.
Now let us interpret these results.
Direct NDMP to tape
The first thing you will notice is that NDMP backups to tape are the fastest when it comes to shipping the entire workload onto backup storage (a traditional full backup). This is not surprising: the NetApp NDMP data server uses dump, which adds very little overhead while moving data from its volumes, and tape is faster than the SATA disks attached to the deduplication pool.
Since we are on this topic, let us also remember that deduplication appliance vendors are not telling the truth if they claim that a VTL (virtual tape library) implementation will provide faster NDMP backups than physical tape drives. NAS devices feature volumes created on top of high-performance disks striped across multiple spindles. The read throughput from such a volume cannot be matched by the write performance of the lower-RPM SATA disks typically used in deduplication appliances.
Remote NDMP to NetBackup Deduplication Pool
You will notice that the first backup using remote NDMP to a NetBackup Deduplication Pool takes longer to complete than the direct NDMP backup to tape. This is expected, for two reasons:
- The pipe between the NetApp filer and the media server is 1Gb Ethernet, whereas the tape drive was served by a 4Gb FC connection
- As I mentioned earlier, the low-RPM SATA disks typically used for backup storage cannot match the high-RPM production disks on a NAS device
Remote NDMP to disk does improve for incremental backups. This is because of the intelligent stream handler for NDMP in the NetBackup deduplication engine, which reduces the amount of data to be written to storage.
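To get a feel for how much the first reason alone matters, the sketch below computes the best-case time to move a roughly 900 GB workload over each link at its nominal line rate. It deliberately ignores encoding and protocol overhead, disk speed, and everything else, so these are idealized lower bounds under my own assumptions, not the benchmark's measured results.

```python
def ideal_transfer_minutes(data_gb, link_gbit_per_s):
    """Best-case minutes to move data_gb over a link at its nominal line rate."""
    data_bits = data_gb * 1e9 * 8              # decimal GB to bits
    seconds = data_bits / (link_gbit_per_s * 1e9)
    return seconds / 60

workload_gb = 900  # approximate total workload from the data set

ethernet_minutes = ideal_transfer_minutes(workload_gb, 1)  # 1Gb Ethernet
fc_minutes = ideal_transfer_minutes(workload_gb, 4)        # 4Gb FC

print(f"1Gb Ethernet: at least {ethernet_minutes:.0f} minutes")
print(f"4Gb FC:       at least {fc_minutes:.0f} minutes")
```

Even in this idealized view, the 1Gb Ethernet path needs four times as long as the 4Gb FC path to move the same full backup.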
NetBackup Accelerator using NFS
For an apples-to-apples comparison with remote NDMP, the media server acts as the NFS client and uses the same 1Gb Ethernet connection. The initial full backup using NFS does not get any ‘acceleration’ because there is no track log for the mount point yet. Even a decade and a half after NAS devices went mainstream, NFS by itself still has the same challenges: a traditional backup over NFS takes much longer to complete. This further justifies the use of the NDMP protocol even in modern data centers.
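To make the idea of a track log a little more concrete, here is a toy Python sketch of change detection across runs. It records each file's size and modification time, then on the next run reports only the files whose metadata differs. This is purely my illustration of the general concept; the function names are hypothetical, and it says nothing about the actual format or logic of NetBackup Accelerator's track log.

```python
import os

def scan(root):
    """Return {relative_path: (size, mtime_ns)} for every file under root."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return state

def files_to_read(previous_log, current_state):
    """Files that are new, or whose size or mtime differs from the previous log."""
    return sorted(
        path for path, meta in current_state.items()
        if previous_log.get(path) != meta
    )
```

With an empty previous log, as on the very first run, `files_to_read({}, scan(root))` returns every file, which mirrors why the initial full backup sees no acceleration.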
But NetBackup Accelerator changes the game and makes NFS usable. As you can see in the second and third runs of NetBackup Accelerator based full backups, it takes much less time to finish a full backup. For a 2% data change, it is possible to create a full backup 21x faster than a traditional backup using NFS, 10x faster than remote NDMP backups, and 6x faster than direct NDMP backups to tape. As NAS workloads typically involve large numbers of files with a low change rate, NetBackup Accelerator is a great solution if you implement it for the right workloads.
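The speedup figures above also imply how the three baselines compare with each other. The sketch below works that out, assuming all three factors are measured against the same accelerated run at a 2% change rate (my reading of the numbers, not something stated explicitly).

```python
# Speedup of the accelerated full backup relative to each baseline.
speedup = {
    "traditional NFS": 21,
    "remote NDMP": 10,
    "direct NDMP to tape": 6,
}

# If the accelerated run takes 1 unit of time, each baseline takes
# `speedup` units, so ratios between baselines are ratios of the factors.
tape = speedup["direct NDMP to tape"]
nfs_vs_tape = speedup["traditional NFS"] / tape
remote_vs_tape = speedup["remote NDMP"] / tape

print(f"Traditional NFS full vs. direct NDMP to tape: {nfs_vs_tape:.1f}x slower")
print(f"Remote NDMP vs. direct NDMP to tape: {remote_vs_tape:.2f}x slower")

# Data that actually changed at a 2% change rate on a ~900 GB workload.
changed_gb = 900 * 0.02
print(f"Changed data: about {changed_gb:.0f} GB")
```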
Of course, there are exceptions. The most important one is when a NAS device is serving a VMware vSphere datastore where virtual machine disk files are continually changing. The good news is that we have NetBackup Accelerator for VMware vSphere to meet such a workload.
NetBackup Accelerator using NFS/CIFS also solves another problem normally present in NDMP backups: the lack of platform independence. Because NDMP data streams are vendor dependent, you cannot recover across platforms. For example, you cannot easily restore NDMP backups from EMC Celerra to NetApp Data ONTAP or vice versa. Thus, for agile data centers that prefer heterogeneity to avoid vendor lock-in, NetBackup Accelerator using NFS/CIFS can be a powerful migration tool.