
What is deduplication?
Deduplication is the process of “unduplicating” data. The term deduplication was coined by database administrators many years ago as a way of describing the process of removing duplicate database records after two databases have been merged.
In the context of disk storage, deduplication refers to any algorithm that searches for duplicate data objects, such as blocks, chunks, or files, and discards these duplicates. When a duplicate object is detected, its reference pointers are modified so that the object can still be located and retrieved, but it “shares” its physical location with other identical objects. This data sharing is the foundation of all types of data deduplication.
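The pointer-sharing idea described above can be sketched in a few lines of Python. This is only an illustrative toy, not NetApp's implementation: the fixed 4 KB block size, the use of SHA-256 as the fingerprint, and the `DedupStore` class with its method names are all assumptions made for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


class DedupStore:
    """Toy block store in which identical blocks share one physical copy."""

    def __init__(self):
        self.blocks = {}    # fingerprint -> block bytes (the single physical copy)
        self.refcount = {}  # fingerprint -> number of pointers to that block
        self.files = {}     # file name -> ordered list of fingerprints (pointers)

    def write(self, name, data):
        if name in self.files:
            self.delete(name)  # overwriting releases the old pointers first
        pointers = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            if fp not in self.blocks:
                # First time this content is seen: store it physically.
                self.blocks[fp] = block
                self.refcount[fp] = 0
            # Duplicate or not, the file only records a pointer to the block.
            self.refcount[fp] += 1
            pointers.append(fp)
        self.files[name] = pointers

    def read(self, name):
        # Follow the pointers to rebuild the original data.
        return b"".join(self.blocks[fp] for fp in self.files[name])

    def delete(self, name):
        for fp in self.files.pop(name):
            self.refcount[fp] -= 1
            if self.refcount[fp] == 0:
                # No file points at this block any more, so free it.
                del self.blocks[fp]
                del self.refcount[fp]

    def physical_bytes(self):
        return sum(len(b) for b in self.blocks.values())


store = DedupStore()
image = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
store.write("vm1.img", image)
store.write("vm2.img", image)          # exact duplicate of vm1.img
print(store.physical_bytes())          # 8192: two unique blocks, each stored once
print(store.read("vm2.img") == image)  # True
```

Both files remain fully readable, yet the second one consumes no additional block storage. The reference counts are what allow one file to be deleted without affecting the other.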
Why deduplication?
Data deduplication is popular in disk storage today because it reduces the amount of disk space needed to store data. The average UNIX® or Windows® enterprise disk volume contains thousands or even millions of duplicate data objects. As these objects are modified, distributed, backed up, and archived, the duplicate data objects are stored repeatedly. The end result of this is inefficient use of storage resources. Deduplication helps to prevent this inefficiency.
Delivering Significant Storage Space Savings – up to 90% – for VMware Environments
NetApp deduplication provides significant space savings for customers and complements VMware environments. By using NetApp deduplication with VMware Infrastructure, customers are achieving up to 90% space savings when storing their VMware virtual machines.
VMware Infrastructure complements NetApp’s deduplication technology. An example of this is VMware Virtual Desktop Infrastructure (VDI), which enables organizations to consolidate desktops onto server hardware, improving desktop management and reducing support costs. NetApp’s data deduplication technology reduces the amount of physical storage needed in a consolidated desktop environment, providing a cost-effective solution that further enhances the robust products included in VMware Infrastructure such as VMware High Availability (HA), VMotion, Consolidated Backup, and Distributed Resource Scheduler (DRS). NetApp and VMware provide customers with a unique technology match to achieve great efficiency in their data centers. Customers that are virtualizing their infrastructures are especially well-positioned to take full advantage of NetApp’s complementary space-saving technologies such as deduplication.
What the customer says: Virginia Credit Union
Among its many benefits to NetApp customers’ businesses, deduplication complements VMware environments. NetApp customer Virginia Credit Union is a not-for-profit financial cooperative that serves more than 180,000 members. The cooperative deployed VMware Infrastructure to consolidate servers and quickly create clone copies for test and development. However, the Virginia Credit Union sought greater storage efficiency.
By using NetApp to deduplicate VMware virtual machines, Virginia Credit Union achieved overall space savings of 78%. NetApp deduplication was the most suitable technology for addressing Virginia Credit Union’s storage capacity problem. It provided the means to retain long-term business-critical data in compliance with financial regulations in a cost-effective and efficient manner. In addition, by using other NetApp space-saving technologies such as thin provisioning and FlexClone®, Virginia Credit Union met its business need for rapid server provisioning and data cloning for testing and development purposes without consuming additional disk space.
“NetApp added a layer of storage intelligence that was missing in our VMware environment,” said Rich Barlow, systems architect with Virginia Credit Union. “The combination of virtual machines on NetApp provides us rapid disaster recovery and near instant provisioning in addition to cutting edge, innovative storage management technologies that helped fulfill our core business objective, which is to better serve our members. No storage vendor could provide the magic combination of NetApp space-saving technologies to control our spiraling data consumption, dramatically reduce the amount of storage we use and manage, and shrink the overhead needed to store large amounts of production virtual machines. NetApp deduplication transparently reduced our physical storage space without sacrificing our performance. We’re getting our money’s worth out of NetApp.”
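To put the quoted savings percentages in concrete terms, the arithmetic can be sketched as follows. The function names and the 1,000 GB starting figure are hypothetical and chosen only for illustration. Only the 78% and 90% values come from the text above.

```python
def space_savings_percent(logical_bytes, physical_bytes):
    """Share of the logical data that no longer needs physical storage."""
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return 100.0 * (logical_bytes - physical_bytes) / logical_bytes


def physical_needed(logical_amount, savings_percent):
    """Physical capacity still required after the given savings percentage."""
    return logical_amount * (100.0 - savings_percent) / 100.0


# Hypothetical example: 1,000 GB of virtual machine data.
print(physical_needed(1000, 78))         # 220.0 GB at 78% savings
print(physical_needed(1000, 90))         # 100.0 GB at 90% savings
print(space_savings_percent(1000, 220))  # 78.0
```

Expressed as a ratio of logical to physical capacity, 78% savings is roughly 4.5:1 and 90% savings is 10:1.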
NetApp’s Deduplication solutions
Deduplication is an integral part of the NetApp WAFL® file system, which manages all storage on NetApp FAS systems. As a result, deduplication works behind the scenes regardless of what applications a company runs or how they access their data. NetApp deduplication, combined with support from leading backup and data protection partners, provides customers advanced storage space savings to dramatically improve and simplify their business.

NetApp deduplication technology is an inherent feature in all NetApp storage systems. As enterprises struggle with managing multiple copies of data across their infrastructures, NetApp eliminates redundant data objects to improve space-savings efficiency and reduce physical storage capacity that customers need to purchase and manage for data copy activities. Competing deduplication products simply provide a point solution that narrowly addresses space reduction for data backup environments. Only NetApp delivers deduplication technology that serves primary storage and is integrated into its mainline storage operating system, Data ONTAP®, at no charge to customers. In addition, NetApp helps customers incorporate deduplication into a wide variety of other environments, including backup data and archival data – all without sacrificing performance.
Summary
Data deduplication is an important new technology that is quickly being embraced by users as they struggle to control data proliferation. By eliminating redundant data objects, an immediate benefit is obtained through space efficiencies. When choosing a deduplication product, however, it is important to consider all aspects of design, including hashing, indexing, inline or postprocessing, source or destination, and of course space savings efficiency.
Many vendors currently offer deduplication, with more sure to follow, all with various approaches and techniques. It is clear that data deduplication will someday be a requirement for every storage vendor, much as snapshots became a requirement years ago.
Well-designed deduplication must perform without compromising data integrity and reliability. Deduplication magnifies the effect of data corruption. If a deduplicated data object becomes corrupt, it has far-reaching implications, because it is referenced by many other files and applications. Vendors will be required to provide 100% assurance that their design will prevent any such data inconsistencies.
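One common safeguard against such inconsistencies is to treat a matching fingerprint only as a hint and to compare the candidate blocks byte for byte before sharing them. The sketch below illustrates that idea under stated assumptions: the `VerifyingStore` class, its methods, and the pluggable `fingerprint` parameter are hypothetical, and nothing here is meant to describe any particular vendor's design.

```python
import hashlib


def sha256_fingerprint(block):
    return hashlib.sha256(block).hexdigest()


class VerifyingStore:
    """Toy store that shares a block only after a byte-for-byte match."""

    def __init__(self, fingerprint=sha256_fingerprint):
        self.fingerprint = fingerprint
        self.buckets = {}  # fingerprint -> list of distinct blocks with that fingerprint

    def put(self, block):
        """Store a block and return a (fingerprint, index) pointer to it."""
        fp = self.fingerprint(block)
        candidates = self.buckets.setdefault(fp, [])
        for idx, existing in enumerate(candidates):
            if existing == block:  # full comparison, not just the fingerprint
                return (fp, idx)   # true duplicate: share the existing copy
        # Same fingerprint but different bytes (or a new fingerprint): keep it separate.
        candidates.append(block)
        return (fp, len(candidates) - 1)

    def get(self, pointer):
        fp, idx = pointer
        return self.buckets[fp][idx]


# Force every block to collide by using a constant fingerprint (for demonstration only).
store = VerifyingStore(fingerprint=lambda block: 0)
p1 = store.put(b"alpha")
p2 = store.put(b"beta")   # collides with "alpha" but is stored separately
p3 = store.put(b"alpha")  # genuine duplicate: shares the first copy
print(p1 == p3, p1 != p2)            # True True
print(store.get(p1), store.get(p2))  # b'alpha' b'beta'
```

The extra comparison costs a read of the existing block, which is one reason the choice between inline and postprocess deduplication matters for performance.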
Deduplication must operate seamlessly in existing user environments. Users will not build a storage infrastructure around deduplication; rather, deduplication must fit into their existing environment with minimal disruption. Ultimately, deduplication must be a transparent background process.
Finally, deduplication will be required to have minimal impact on system performance. Users will not implement deduplication if it has a negative impact on their system workloads. This is particularly true as deduplication makes its way from backup applications to more performance-sensitive primary storage environments.
NetApp, a leader in data storage efficiency since 1992, has established A-SIS deduplication as the first deduplication product to be used broadly across many applications, including data backup, data archival and primary data. A-SIS deduplication combines the benefits of granularity, performance, and resiliency to provide users with significant data deduplication benefits.
NetApp deduplication solutions save you money, space and time. And you can compute your actual savings online right now. Go to http://www.dedupecalc.com/ to get our exclusive recommendations for optimizing your data center.
NetApp EMEA HQ
Boeing Avenue 300
1119 PZ Schiphol Rijk
The Netherlands
Tel: +31 20 503 9600
For all other EMEA locations or further information, please check www.netapp.com.