Sunday, June 24, 2012

NetApp Deduplication

NetApp supports deduplication, in which only unique blocks in a flexible volume are stored, at the cost of a small amount of additional metadata created during the de-dup process. The NetApp deduplication technology finds duplicate 4KB blocks anywhere in the flexible volume, keeps a single copy, and frees the rest.
The core enabling technology of deduplication is fingerprints. These are digital signatures computed for every 4KB data block in the flexible volume.
When deduplication runs for the first time on a flexible volume with existing data, it scans the blocks in the flexible volume and creates a fingerprint database, which contains a sorted list of all fingerprints for used blocks in the flexible volume. After the fingerprint file is created, fingerprints are checked for duplicates. When a match is found, a byte-by-byte comparison of the blocks is done first to make sure that they are indeed identical. If they are, the duplicate block's pointer is updated to the already existing data block, the duplicate data block is released, and the inode is updated.
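The fingerprint-then-verify idea can be illustrated with a small Python sketch. This is only a toy model of the process described above, not NetApp's implementation: the use of SHA-256 as the fingerprint, the in-memory dictionary, and the names `deduplicate`, `store` and `pointers` are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # 4KB blocks, as in the description above


def deduplicate(blocks):
    """Return (store, pointers).

    store    -- the unique blocks that are actually kept
    pointers -- for each input block, the index of its copy in store
    """
    store = []
    by_fingerprint = {}  # fingerprint -> indexes into store
    pointers = []
    for block in blocks:
        # Illustrative fingerprint only; the real one is not specified here.
        fp = hashlib.sha256(block).digest()
        match = None
        for idx in by_fingerprint.get(fp, []):
            # Byte-by-byte comparison before the block is shared.
            if store[idx] == block:
                match = idx
                break
        if match is None:
            # No identical block yet: keep this one as the unique copy.
            store.append(block)
            match = len(store) - 1
            by_fingerprint.setdefault(fp, []).append(match)
        pointers.append(match)
    return store, pointers


if __name__ == "__main__":
    a = b"A" * BLOCK_SIZE
    b = b"B" * BLOCK_SIZE
    store, pointers = deduplicate([a, b, a, a])
    print(len(store), pointers)
```

Four input blocks collapse to two stored blocks here, and the duplicates simply point at the first copy, which is the same pointer update the paragraph above describes.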
The link below explains in detail how de-dup works (source: NetApp Community):

 https://communities.netapp.com/community/netapp-blogs/drdedupe/blog/2010/04/07/how-netapp-deduplication-works--a-primer
Deduplication refers to the elimination of redundant data in the storage. In the deduplication process, duplicate data is deleted, leaving only one copy of the data to be stored. However, indexing of all data is still retained should that data ever be required. Deduplication is able to reduce the required storage capacity since only the unique data is stored. 


1. Enable dedup (asis) license:

filer> license add <Code for NearStore>
filer> license add <Code for De-dup>
filer> sis on /vol/bali

2. If you have a new flex volume which was just created, follow this step to enable ASIS deduplication

filer> sis on /vol/bali
Deduplication for "/vol/bali" is enabled.
Already existing data could be processed by running "sis start -s /vol/bali"


3. If you have an existing flex volume with data in it, follow this step to scan the existing data.

filer> sis start -s /vol/bali

4. Verify that deduplication is enabled on the volume ("sis" should appear in the Status column).

filer> vol status bali
Volume          State   Status          Options
bali      online  raid_dp, flex   nosnap=on
                        sis
Containing aggregate: 'aggrSATA'

5. Check de-dupe status

filer> sis status /vol/bali
Path            State   Status      Progress
/vol/demovol    Enabled Idle        Idle for 00:02:12

6. Check the storage space saved due to deduplication
filer> df -s /vol/bali
Filesystem      used    saved   %saved
/vol/bali/   9316052 0       0%

7. If you have to run de-dupe at a later point in time on this volume, use "sis start /vol/bali"

8. Deduplication can be scheduled by using "sis config" with the options below
sis config [ [ [ -s schedule ] | [ -m minimum_blocks_shared ] ] <path> ...]
        - Sets up, modifies and retrieves schedule and minimum blocks shared
         value of SIS volumes
9.  If you wish to stop and delete the de-dupe entries on a specific volume, follow the below steps

filer> sis stop /vol/bali
Operation is currently idle: /vol/bali
filer> sis off /vol/bali
SIS for "/vol/bali" is disabled.
filer> priv set advanced
Warning: These advanced commands are potentially dangerous; use
them only when directed to do so by Network Appliance
personnel.
filer*> sis undo /vol/bali
filer*> priv set
filer> sis status /vol/bali
Disabled Undoing 6810 MB Processed
filer> sis status /vol/bali
No status entry found.
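
To make sense of the "df -s" columns from step 6, here is a small Python sketch that parses one output line and recomputes the percentage. It assumes that %saved is calculated as saved / (used + saved), which is my understanding of the column and not something shown in the output above. The helper name `parse_df_s` is made up for this example.

```python
def parse_df_s(line):
    """Parse one data line of 'df -s' output: path, used, saved, %saved."""
    fields = line.split()
    path, used, saved = fields[0], int(fields[1]), int(fields[2])
    total = used + saved
    # Assumed formula: share of the logical data that dedup saved.
    pct_saved = 100.0 * saved / total if total else 0.0
    return path, used, saved, pct_saved


if __name__ == "__main__":
    print(parse_df_s("/vol/bali/   9316052 0       0%"))
```

For the sample line from step 6 nothing has been saved yet, so the computed percentage is 0, which matches the 0% the filer reports.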
