The performance impact of retaining snapshots means that VMware snapshots carry a high cost, which in turn forces IT professionals to invest more than they should in storage and backup infrastructure. Below are the best practices for VMware’s snapshot functionality, according to VMware’s knowledge base article:
- Don’t use snapshots as backups.
- While the maximum number of supported snapshots per virtual machine is 32, the best practice is not to use more than 2 or 3.
- Don’t retain a snapshot for more than 72 hours.
- Ensure that snapshots are deleted when using third-party backup software.
- Never increase the virtual machine disk size while there are active snapshots.
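The count and age guidelines above lend themselves to a simple automated check. The following is a minimal sketch, assuming snapshot details have already been exported from the environment into plain records. The `Snapshot` record, the `audit_snapshots` function, and the sample VM names are hypothetical and are not part of any VMware API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_SNAPSHOTS = 3               # guideline: no more than 2 or 3 per VM
MAX_AGE = timedelta(hours=72)   # guideline: retain no longer than 72 hours


@dataclass
class Snapshot:
    """Hypothetical record describing one snapshot (not a VMware type)."""
    vm: str
    name: str
    created: datetime


def audit_snapshots(snapshots, now):
    """Return a list of guideline violations as readable strings."""
    by_vm = {}
    for snap in snapshots:
        by_vm.setdefault(snap.vm, []).append(snap)

    findings = []
    for vm, snaps in sorted(by_vm.items()):
        if len(snaps) > MAX_SNAPSHOTS:
            findings.append(
                f"{vm}: {len(snaps)} snapshots exceeds the guideline of {MAX_SNAPSHOTS}"
            )
        for snap in snaps:
            if now - snap.created > MAX_AGE:
                findings.append(f"{vm}: snapshot '{snap.name}' is older than 72 hours")
    return findings


# Illustrative data only; real records would come from your own inventory export.
now = datetime(2024, 1, 10, 12, 0)
inventory = [
    Snapshot("web01", "pre-patch", now - timedelta(hours=5)),
    Snapshot("db01", "before-upgrade", now - timedelta(hours=100)),
]
for finding in audit_snapshots(inventory, now):
    print(finding)
```

With the sample data, only the 100-hour-old snapshot on `db01` is reported. The thresholds are constants so they can be tightened to match local policy.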
The weaknesses of VMware’s snapshots are just one of the hidden costs of using VMware in the data center. To learn about all four, watch our on-demand webinar, “The 4 Hidden Costs of VMware.”
VMware Snapshots Are Not Backups
VMware explains why its snapshots should not be considered backups of virtual machines (VMs): “The snapshot file is only a change log of the original virtual disk. It creates a placeholder disk, virtual_machine-00000x-delta.vmdk, to store data changes since the snapshot was created. If the base disks are deleted, the snapshot files are insufficient to restore a virtual machine.”
Because changes are tracked in a separate file, every write to a VM’s primary volume incurs significant overhead, and the snapshot remains dependent on the original volume. The overhead limits customers’ ability to use VMware’s snapshot technology for backup because only two or three snapshots should be active at any point in time. The dependency is the final nail in the coffin: if the primary volume fails, all of your snapshots become useless.
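The dependency described above can be illustrated with a toy model of a change-log snapshot. This is a conceptual sketch only, not VMware's on-disk format: the `BaseDisk` and `DeltaDisk` classes are hypothetical, and blocks are represented as simple dictionary entries.

```python
class BaseDisk:
    """Toy stand-in for the original virtual disk."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # block number -> data
        self.deleted = False

    def read(self, block_no):
        if self.deleted:
            raise OSError("base disk is gone")
        return self.blocks.get(block_no)


class DeltaDisk:
    """Toy change log layered on a parent (the base disk or another delta)."""

    def __init__(self, parent):
        self.parent = parent
        self.changes = {}            # only blocks written since the snapshot

    def write(self, block_no, data):
        self.changes[block_no] = data

    def read(self, block_no):
        if block_no in self.changes:
            return self.changes[block_no]
        # Unchanged blocks exist only in the parent, so the read falls through.
        return self.parent.read(block_no)


base = BaseDisk({0: "boot", 1: "data-v1"})
delta = DeltaDisk(base)          # snapshot taken: new writes land in the delta
delta.write(1, "data-v2")
print(delta.read(1))             # changed block is served from the delta
print(delta.read(0))             # unchanged block falls through to the base

base.deleted = True              # simulate losing the base disk
try:
    delta.read(0)
except OSError as err:
    print("restore failed:", err)
```

The delta holds only the blocks that changed, so any read of an unchanged block fails once the base is gone. Stacking one `DeltaDisk` on another shows the chain effect as well: each extra layer adds another hop to reads that miss.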
The Impact of VMware Snapshots Not Being Backups
Most customers would still choose a separate backup software solution even if VMware could provide unlimited snapshots without performance impact. The fact that VMware snapshots are so hindered forces customers to invest more heavily in a backup solution. The weakness of VMware’s data protection capabilities has led to the creation of companies like Veeam and fueled their growth.
Backup solutions are the only products that can extract any usefulness from VMware snapshots. They can take one VMware snapshot, mount it in the backup application, and back it up. When the backup completes, the software can delete the snapshot it took so that it doesn’t impact overall performance. That same knowledge base article advises IT to make sure the backup software it selects can delete the snapshots it takes (item 4 above).
The Cost of VMware Snapshots Not Being Backups
VMware’s deficient snapshot capability is not unique. Although not as severe, many dedicated storage systems have similar limitations on how frequently you can execute a snapshot and how long you can retain those snapshots. All of the legacy snapshot technologies are plagued with this problem. Each successive snapshot depends on the snapshot before it, and all snapshots depend on the original volume. If that original volume is removed, all the snapshots are invalidated.
One of the most important priorities for IT is to protect the digital assets that the organization creates and uses to make decisions. Since that priority is paramount, IT must work around the weakness in snapshot technology and invest in a separate process, backup and recovery, to mitigate the risk.
The investment in the backup and recovery process is not insignificant. There is the cost of the backup software and the need for and cost of a separate storage system. There is also the cost of time to transfer that data from production storage to the secondary storage device. The transfer time means significant gaps in which data is unprotected, something ransomware uses to its advantage. Finally, there is also the time involved in transferring data back into production if something goes wrong with primary storage. There is a place for separate backup and recovery, but it should not be the primary means to protect and recover production data.
IT professionals largely ignore the cost impact of these limitations because they assume that there is no alternative.
The Clone Alternative to VMware Snapshots
As discussed in our previous article, “Snapshots or Clones for Data Protection”, a clone, i.e., a complete copy of a virtual machine or volume, is, with one limitation, a much better means of protecting data:
- Clones are independent
- Clones don’t impact performance
- Clones can be retained indefinitely
The limitation of clones is that they are exact copies of the original, which means there is a transfer time problem and a capacity consumption issue. This limitation goes away, though, if global inline deduplication is integrated into the core code at the infrastructure level. Because identical data is stored only once, global inline deduplication enables the creation of copies of any virtual machine, or even the entire environment. The clones can be made nearly instantly, and they initially consume no additional capacity.
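The idea that a deduplicated clone is initially "free" can be sketched with a small content-addressed block store. This is a general illustration of the technique, not VergeOS code: the `DedupStore` class, its method names, and the 4 KiB block size are assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: each unique block is kept once."""

    def __init__(self):
        self.blocks = {}    # SHA-256 hex digest -> block bytes
        self.volumes = {}   # volume name -> ordered list of digests

    def _put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)   # store only if not seen before
        return digest

    def write_volume(self, name, chunks):
        self.volumes[name] = [self._put(chunk) for chunk in chunks]

    def clone(self, source, target):
        # A clone copies the block map only; no block data is duplicated.
        self.volumes[target] = list(self.volumes[source])

    def write_block(self, name, index, data):
        self.volumes[name][index] = self._put(data)

    def delete_volume(self, name):
        # Garbage collection of unreferenced blocks is omitted in this sketch.
        del self.volumes[name]

    def read_volume(self, name):
        return [self.blocks[digest] for digest in self.volumes[name]]

    def stored_bytes(self):
        return sum(len(block) for block in self.blocks.values())


store = DedupStore()
store.write_volume("vm1", [b"A" * 4096, b"B" * 4096, b"A" * 4096])
before = store.stored_bytes()                 # two unique blocks are stored
store.clone("vm1", "vm1-clone")
after_clone = store.stored_bytes()            # unchanged: clone is metadata only
store.write_block("vm1-clone", 0, b"C" * 4096)
after_write = store.stored_bytes()            # grows by one new unique block
store.delete_volume("vm1")                    # the clone does not need the original
print(before, after_clone, after_write, len(store.read_volume("vm1-clone")))
```

Capacity grows only when a clone diverges from its source, and the clone remains readable after the source volume is deleted because it has its own block map.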
The problem is that most deduplication technology is an afterthought, especially within hypervisor software. VMware introduced deduplication into vSAN years after the initial release, and Nutanix waited even longer. Adding deduplication as a bolt-on years after the initial introduction means that the algorithm adds processing overhead to the environment, dramatically impacting performance and decreasing virtual machine density.
To some extent, IT can work around the overhead of deduplication by buying more powerful servers and adding more RAM to those servers, all of which adds significant cost to the infrastructure. Alternatively, IT can purchase a dedicated storage array. Still, as we explain in our article, “The High Cost of Dedicated Storage,” that approach also increases the cost of the infrastructure.
IOclone: Eliminating Costs While Increasing Resiliency
VergeIO integrates deduplication into VergeOS, and it isn’t a bolt-on. Global Inline Deduplication has been at the core of VergeOS since day one. As a result, it operates very efficiently and with no performance impact compared to legacy solutions. This means that creating a clone with VergeOS’ IOclone capability happens instantly, with virtually no initial impact on capacity. These clones are also not dependent on the original copy. They are standalone and don’t impact performance, nor do they have retention limitations.
VergeOS’ Global Inline Deduplication is also WAN aware, so IT can replicate production data and clones to remote DR sites or other data centers using minimal bandwidth and time. Moving data to a second VergeOS instance meets the “one copy off-site” requirement common in most data protection strategies.
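Deduplication-aware replication generally works by sending only the blocks the remote site does not already hold. The sketch below shows that general technique under simplifying assumptions. It is not VergeOS's replication protocol, and the `digest` and `replicate` helpers and the in-memory `remote` dictionary are hypothetical.

```python
import hashlib


def digest(block):
    """Content fingerprint used to decide whether a block is already remote."""
    return hashlib.sha256(block).hexdigest()


def replicate(volume_blocks, remote_store):
    """Send only blocks the remote lacks; return (block map, bytes sent)."""
    block_map = []
    bytes_sent = 0
    for block in volume_blocks:
        key = digest(block)
        if key not in remote_store:
            remote_store[key] = block      # stands in for a WAN transfer
            bytes_sent += len(block)
        block_map.append(key)              # the map itself is small metadata
    return block_map, bytes_sent


remote = {}   # hypothetical remote site: digest -> block
production = [b"A" * 4096, b"B" * 4096, b"A" * 4096]
_, first_sync = replicate(production, remote)

clone = [b"A" * 4096, b"B" * 4096, b"C" * 4096]   # clone with one changed block
_, second_sync = replicate(clone, remote)
print(first_sync, second_sync)
```

In this example the first sync transfers the two unique blocks, and replicating the diverged clone transfers only the single block the remote site has not seen.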
Thanks to VergeOS’ foundational implementation of global inline deduplication, IOclone merges the best of both snapshots and clones to deliver unprecedented data protection and resilience. It is also why we refer to these clones as snapshots within our GUI. It is another example of the benefits of solving problems holistically at the infrastructure level instead of myopically at the data level. Watch our on-demand webinar, “Creating a Holistic Ransomware Response,” to see another example of solving problems at an infrastructure level.
While some VergeIO customers have eliminated backup as a separate process, you may still want to continue with your backup and recovery strategy, which VergeOS supports. Even if you do, the complexity and expense of that process are significantly reduced. While VMware snapshots have a high cost, IOclone does not. It is part of the reason customers who select VergeOS as their VMware alternative realize a reduction in total cost of ownership of as much as 80%, in addition to upfront licensing savings of 30% or more.