Snapshots or Clones for Data Protection?

By George Crump

Most storage solutions provide IT professionals with either snapshots or clones for data protection, but are the differences between the two functions significant enough to make them part of your selection criteria? Like all things in IT, the answer is "it depends." In this case, it depends on whether and how your vendor implemented the two technologies.

What Are Snapshots?

Deciding whether snapshots or clones are best for data protection first requires understanding how the two technologies work. First, let’s look at snapshots. Most storage solutions, be they a filesystem or block storage, have a metadata layer that points to where each data segment resides. A snapshot makes a copy of those pointers at a specific time and then marks the segments they point to as read-only until the snapshot expires.
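To make the pointer-copy idea concrete, here is a minimal Python sketch of a toy volume. The class and method names (`ToyVolume`, `write_new`, `snapshot`) are hypothetical and exist only for illustration; real storage systems use far more elaborate metadata structures.

```python
class ToyVolume:
    """Toy volume: a metadata map of segment number -> physical location."""

    def __init__(self):
        self.store = {}          # physical location -> data
        self.pointers = {}       # segment number -> physical location
        self.read_only = set()   # locations pinned by at least one snapshot
        self._next_loc = 0       # hypothetical, trivially simple allocator

    def write_new(self, segment, data):
        """Place data for a segment in a fresh physical location."""
        loc = self._next_loc
        self._next_loc += 1
        self.store[loc] = data
        self.pointers[segment] = loc

    def snapshot(self):
        """Copy the pointers (not the data) and pin what they reference."""
        snap = dict(self.pointers)
        self.read_only.update(snap.values())
        return snap


vol = ToyVolume()
vol.write_new(0, "boot")
vol.write_new(1, "data")
snap = vol.snapshot()
```

Taking the snapshot copies two pointers and no data, which is why snapshots are ready almost instantly. The cost arrives later, when a write targets one of the pinned locations.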

Snapshot Update Methods

There are two methods for handling an update to a segment that a snapshot has made read-only. The first method is a copy-on-write process. When a user or application attempts to update or change an existing segment, the storage solution copies the old segment to a new location and allows the new data to occupy the original segment. The storage solution then updates the snapshot metadata with the old segment’s new location.
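The copy-on-write steps can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not any vendor's implementation: the function name `cow_write`, the dictionary-based metadata, and the `max(...) + 1` allocator are all hypothetical.

```python
def cow_write(store, production, snapshots, segment, new_data):
    """Copy-on-write: move the old data aside, then overwrite in place."""
    loc = production[segment]
    # Snapshots whose pointer for this segment still references the location.
    pinned = [snap for snap in snapshots if snap.get(segment) == loc]
    if pinned:
        new_loc = max(store, default=-1) + 1   # hypothetical allocator
        store[new_loc] = store[loc]            # extra write: copy the old data
        for snap in pinned:
            snap[segment] = new_loc            # update the snapshot metadata
    store[loc] = new_data                      # new data occupies the original


store = {0: "v1"}          # physical location -> data
production = {0: 0}        # segment -> physical location
snap = dict(production)    # snapshot taken before the update
cow_write(store, production, [snap], 0, "v2")
```

After the call, the production pointer is unchanged and reads the new data, while the snapshot pointer has moved to the preserved copy. One logical write cost two physical writes plus a snapshot metadata update.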

The second snapshot update method is “redirect on write”. Using this method, the storage system writes the modified data to a new location and updates the metadata of the “production view” of the data. It does not need to update the “snapshot view” of the data, because the snapshot’s pointers still reference the original, unmodified segment.
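For comparison, a redirect-on-write update might look like the following Python sketch. As before, the names (`row_write`) and the dictionary-based metadata are hypothetical simplifications chosen for illustration.

```python
def row_write(store, production, segment, new_data):
    """Redirect-on-write: write elsewhere and repoint the production view."""
    new_loc = max(store, default=-1) + 1   # hypothetical allocator
    store[new_loc] = new_data              # single write of the new data
    production[segment] = new_loc          # update production metadata only


store = {0: "v1"}          # physical location -> data
production = {0: 0}        # segment -> physical location
snap = dict(production)    # snapshot taken before the update
row_write(store, production, 0, "v2")
```

Here the snapshot dictionary is never touched: it still points at the original location, and only the production view is redirected. The trade-off is that the production data becomes scattered across new locations, and the old segments must be reclaimed when the snapshot expires.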

Both of these methods limit the scalability of snapshots because each update requires additional writes, additional metadata changes, or both. Also, many storage systems use separate metadata trees to manage each snapshot. As the number of snapshots and their depth (snapshots of snapshots) increase, the complexity of managing and updating the metadata wears on system performance. As a result, whether the snapshot occurs within the hypervisor, on the same hardware as the hypervisor (software-defined storage running as a virtual machine), or on dedicated storage hardware, there are limits to how many snapshots the storage solution can maintain.

The complexity shows itself by degrading overall system performance. To cope, storage systems with legacy snapshot technology typically require:

  • Limits on the number of copies retained
  • High-end processors in the storage servers
  • Dedicated data processing units (DPUs)
  • Days to remove old snapshots

What Are Clones?

Clones are copies of existing segments. They stand alone, so updating a clone does not require the same metadata overhead as updating data under a snapshot. Because clones are independent, they don’t suffer the performance degradation that snapshots do, regardless of how many there are or how long they are retained. Clones also don’t need either of the sophisticated update methods that snapshots require.

The Downside to Clones

The typical downside to clones is that they are either complete copies of the original volume or deduplicated copies. A full copy means that data must traverse the internals of the storage infrastructure, travel across the network to the hypervisor, and then travel back across the network to the storage system.

Some hypervisors have introduced capabilities that eliminate the network round trip, saving time. Still, most cloning functions must process data through the internals of the storage solution twice, even if that solution has a deduplication feature. With deduplication, the resulting clone may not consume any additional capacity, but the time to create that copy is still significant, especially if the volume is of any substantial size. It is also best not to use an application while the storage solution clones its volume. As a result, most organizations don’t use cloning as part of their data protection strategy.

IOclone — The Best of Clones and Snapshots

As we’ve discussed, clones and snapshots typically have some overhead in the capacity they consume and the processing required to use them. Clones typically have to make a copy of all of the metadata information, which means the cloning process takes some time upfront, but then they are ready to use and independent. Snapshots don’t have the upfront processing time and, as a result, are ready for use almost instantly. However, their performance degrades as the number of snapshots increases, both while the snapshots are in use and during snapshot clean-up routines.

IOclone is a capability of the VergeOS operating system that combines the best of clones and snapshots into a single solution. Since global inline deduplication is part of the metadata in VergeOS, IOclone copies are created much as snapshots are. Regardless of capacity, IOclone can create clones of VMs, volumes, or entire virtual data centers in milliseconds. At the same time, IOclone-created copies have the stand-alone performance of independent clones without initially consuming additional capacity.
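To illustrate the general idea of a clone built on deduplication metadata, here is a Python sketch of a content-addressed block store with reference counts. It is not the VergeOS or IOclone implementation, whose internals this article does not describe; the class names (`DedupStore`, `DedupVolume`) and the use of SHA-256 as the block key are assumptions made for the example.

```python
import hashlib
from collections import Counter


class DedupStore:
    """Content-addressed block store with reference counts."""

    def __init__(self):
        self.blocks = {}        # content hash -> data
        self.refs = Counter()   # content hash -> number of references

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(key, data)   # store unique content only once
        self.refs[key] += 1
        return key

    def release(self, key: str) -> None:
        self.refs[key] -= 1
        if self.refs[key] == 0:             # last reference gone: reclaim
            del self.refs[key]
            del self.blocks[key]


class DedupVolume:
    """A volume is just an ordered list of content hashes."""

    def __init__(self, store, keys=None):
        self.store = store
        self.keys = list(keys or [])

    def append(self, data: bytes) -> None:
        self.keys.append(self.store.put(data))

    def write(self, index: int, data: bytes) -> None:
        old = self.keys[index]
        self.keys[index] = self.store.put(data)
        self.store.release(old)

    def read(self, index: int) -> bytes:
        return self.store.blocks[self.keys[index]]

    def clone(self) -> "DedupVolume":
        for key in self.keys:               # metadata-only: bump ref counts
            self.store.refs[key] += 1
        return DedupVolume(self.store, self.keys)


store = DedupStore()
vm = DedupVolume(store)
vm.append(b"operating system")
vm.append(b"application data")
copy = vm.clone()                           # no block data is copied
blocks_after_clone = len(store.blocks)
copy.write(1, b"changed data")              # diverges without touching vm
```

In this sketch, cloning touches only metadata, so it costs no additional block capacity and its duration depends on the number of pointers rather than the amount of data. Each volume keeps its own independent key list, so a write to the clone never requires updating the original's metadata.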

With IOclone, IT doesn’t have to choose between snapshots and clones for data protection. This capability within VergeOS can retain hundreds, even thousands, of copies of VMs, volumes, or entire virtual data centers (VDCs) without negatively impacting performance or capacity consumption.

Conclusion

IOclone is also part of our IOprotect solution, which enables you to start a VMware exit by first using VergeOS as a disaster recovery solution. Most customers find IOprotect reduces the cost of disaster recovery by more than 50% without requiring additional hardware. It provides a complete recovery environment, converging disaster recovery so that data, applications, and the processing power to recover are all available from a small cluster of nodes.

As your confidence in VergeOS grows, you can use it for your production environment. The tightly integrated VergeOS architecture delivers more efficient performance, increasing workload density on less physical hardware. Once your conversion is complete, you’ll lower costs by as much as 80% and enjoy an actively developed data center operating system with unparalleled support.
