NetBackup Accelerator backs up sparse files differently from normal files. Let's look at what happens to a sparse file during an Accelerator backup.
First, a quick look at what a sparse file is.
What is a sparse file?
A sparse file is a file whose apparent size is larger than the amount of storage actually allocated to it. The usual way to create such a file is to seek past its end and write some new data; Unix-derived systems traditionally do not allocate disk blocks for the skipped portion between the previous end of the file and the new data. The result is a “hole”, a piece of the file which logically exists but is not represented on disk. A read operation on a hole succeeds, and the returned data is all zeros.
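As a minimal sketch of the seek-and-write behaviour described above, the commands below create a small file with a hole and compare its apparent size with the space allocated to it. The temporary file and the 1 MiB offset are illustrative assumptions, and the allocated figure depends on the filesystem in use.

```shell
# Hypothetical demo file; any writable location will do.
f=$(mktemp)

# Seek 1024 blocks of 1 KiB past the start, then write 4 bytes of real data.
printf 'data' | dd of="$f" bs=1k seek=1024 2>/dev/null

apparent=$(wc -c < "$f")               # logical size in bytes
allocated_kb=$(du -k "$f" | cut -f1)   # space actually allocated, in KB

echo "apparent:  $apparent bytes"
echo "allocated: $allocated_kb KB"
rm -f "$f"
```

On a filesystem that supports holes, the allocated figure should come out far smaller than the apparent size.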
How does NetBackup deal with sparse files?
Relatively smart file archival and backup utilities recognize holes in files. The holes are not stored in the resulting archive, and they are not filled in when the file is restored from that archive.
NetBackup deals with sparse files in a similar way.
NetBackup identifies the holes in sparse files and does not store them in the backup images. Even where a sparse file contains real zero data that was actually written to disk, NetBackup still treats that data as a hole, so the real zero data is not stored in the backup images either.
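The idea of treating written zeros the same as holes can be illustrated by looking at block contents rather than at the allocation map. The sketch below is not NetBackup's implementation; `count_zero_blocks` is a hypothetical helper, and the 4096-byte block size is an assumption. It counts how many fixed-size blocks of a file contain nothing but zero bytes.

```shell
# count_zero_blocks FILE [BLOCKSIZE]
# Print the number of fixed-size blocks in FILE that contain only zero bytes.
count_zero_blocks() {
    file=$1
    bs=${2:-4096}
    size=$(wc -c < "$file")
    blocks=$(( (size + bs - 1) / bs ))
    zero=0
    i=0
    while [ "$i" -lt "$blocks" ]; do
        # Delete NUL bytes from block i; if nothing is left, the block was all zeros.
        left=$(dd if="$file" bs="$bs" skip="$i" count=1 2>/dev/null | tr -d '\000' | wc -c)
        if [ "$left" -eq 0 ]; then
            zero=$((zero + 1))
        fi
        i=$((i + 1))
    done
    echo "$zero"
}

# Demo: two blocks of written zeros followed by a few bytes of real data.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=4096 count=2 2>/dev/null
printf 'data' >> "$f"
count_zero_blocks "$f"    # prints 2
rm -f "$f"
```

A content-based check like this cannot tell a hole from written zeros, which matches the behaviour described above: both read back as zeros.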
How does a NetBackup Accelerator backup handle a sparse file?
The following example illustrates the behaviour.
- Create a sparse file.
HOSTNAME:/walker # dd if=/dev/null of=spars-file1 bs=1k seek=2097152 count=1
0+0 records in
0+0 records out
0 bytes (0 B) copied, 1.4823e-05 s, 0.0 kB/s
HOSTNAME:/walker # ls -ls spars-file1
0 -rw-r--r-- 1 root root 2147483648 Nov 27 12:46 spars-file1
- Fill the first 500 MB of the sparse file with zero data.
HOSTNAME:/walker # dd if=/dev/zero of=spars-file1 bs=1k count=512000 conv=notrunc
512000+0 records in
512000+0 records out
524288000 bytes (524 MB) copied, 1.3924 s, 377 MB/s
HOSTNAME:/walker # ls -ls spars-file1
512504 -rw-r--r-- 1 root root 2147483648 Nov 27 12:46 spars-file1
The inode details of the file after the write:
Inode: 2346845 Type: regular Mode: 0644 Flags: 0x0
Generation: 3652504436 Version: 0x00000000
User: 0 Group: 0 Size: 2147483648
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 1025008
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x50b445af -- Tue Nov 27 12:46:39 2012
atime: 0x50b44588 -- Tue Nov 27 12:46:00 2012
mtime: 0x50b445af -- Tue Nov 27 12:46:39 2012
Size of extra inode fields: 4
(0-11):9496038-9496049, (IND):9496050, (12-1035):9496051-9497074, (DIND):9497075, (IND):9497076, (1036-2059):9497077-9498100,
(125964-126987):9777086-9778109, (IND):9778110, (126988-127999):9778111-9779122
The apparent size is 2147483648 bytes (2 GB). The filesystem has allocated 128126 blocks of 4 KB to the sparse file, so its physical size is 512504 KB.
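The figures above can be cross-checked with simple shell arithmetic. This sketch assumes a 4 KB filesystem block size and that the Blockcount field in the inode dump is counted in 512-byte units; both assumptions are consistent with the 512504 shown by `ls -ls`.

```shell
blocks=128126          # filesystem blocks allocated to the file
sectors=1025008        # Blockcount from the inode dump
apparent=2147483648    # apparent size in bytes

kb_from_blocks=$(( blocks * 4 ))             # assumes 4 KB per block
kb_from_sectors=$(( sectors * 512 / 1024 ))  # assumes 512-byte units
apparent_kb=$(( apparent / 1024 ))

echo "physical (from blocks):     $kb_from_blocks KB"    # 512504 KB
echo "physical (from Blockcount): $kb_from_sectors KB"   # 512504 KB
echo "apparent:                   $apparent_kb KB"       # 2097152 KB
```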
- Start first full backup with accelerator.
Only 2560 bytes of data were sent to the server, as the job details show:
info bpbkar (pid=466958) accelerator sent 2560 bytes out of 2560 bytes to server, optimization 0.0%
The bpbkar log reports the same amount of data sent to the server:
bpbkar main: JBD - accelerator sent 2560 bytes out of 2560 bytes to server, optimization 0.0%
Although the sparse file occupies about 500 MB of disk space, only 2560 bytes of data were sent to the server and stored in the backup image. What happened?
As stated above, NetBackup treats zero data as a hole, and holes are not stored in the backup image. Here NetBackup treats the 500 MB of zero data as a hole and does not store it, so only a small amount of data ends up in the backup image.
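When comparing several backups, it can help to pull the byte counts out of the "accelerator sent" line. The sketch below uses a hypothetical helper, `accel_optimization`, and assumes the optimization figure is (1 - sent / total) × 100. That formula is an assumption that agrees with the single sample line above; it is not taken from NetBackup documentation.

```shell
# accel_optimization LINE
# Print the optimization percentage implied by the sent/total byte counts in LINE.
accel_optimization() {
    printf '%s\n' "$1" | awk '
        match($0, /sent [0-9]+ bytes out of [0-9]+ bytes/) {
            # Matched text looks like: sent <sent> bytes out of <total> bytes
            split(substr($0, RSTART, RLENGTH), w, " ")
            sent = w[2]; total = w[6]
            if (total > 0) printf "%.1f\n", (1 - sent / total) * 100
        }'
}

line='bpbkar main: JBD - accelerator sent 2560 bytes out of 2560 bytes to server, optimization 0.0%'
accel_optimization "$line"    # prints 0.0
```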
When the sparse file is backed up with Accelerator a second time, there are two possible situations.
1) If the sparse file has not changed since the last full backup, Accelerator speeds up the backup, and the optimization rate can reach about 99%.
2) If the sparse file has changed since the last full backup, Accelerator does not speed up the backup. The whole sparse file is backed up again, that is, the full contents of the sparse file are sent to the server, not just the changed data.
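To reproduce the second situation in a test, the file can be changed in place between two backups. The sketch below uses a small temporary file as a stand-in for spars-file1 (the file name, sizes, and offset are illustrative assumptions) and overwrites a few bytes with `conv=notrunc`, the same option used earlier, so the apparent size stays the same while the content changes.

```shell
f=$(mktemp)

# Small stand-in for spars-file1: 1 MiB apparent size, nothing written yet.
dd if=/dev/null of="$f" bs=1k seek=1024 2>/dev/null
size_before=$(wc -c < "$f")
sum_before=$(cksum < "$f")

# Overwrite a few bytes at offset 512 KiB; conv=notrunc leaves the rest of the file alone.
printf 'changed' | dd of="$f" bs=1k seek=512 conv=notrunc 2>/dev/null
size_after=$(wc -c < "$f")
sum_after=$(cksum < "$f")

echo "size before/after: $size_before / $size_after"
if [ "$sum_before" != "$sum_after" ]; then
    echo "content changed"
fi
rm -f "$f"
```

Running the Accelerator backup again after such a change should show the second situation in the job details.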