Once ransomware breaks through an organization’s defenses, time is of the essence, and IT must execute 5 steps to rapid ransomware recovery. The need for rapid recovery and minimal data loss was the top concern of 75% of the IT professionals responding to the survey we conducted during our recent webinar, “Creating a Holistic Ransomware Recovery Strategy,” now available on-demand.
There are 5 steps to rapid ransomware recovery with minimal data loss:
Step | Reason |
---|---|
Frequent Protection | Ransomware can strike at any moment, so protection copies should be made at least every few hours. |
Long Retention | Some ransomware variants strike slowly to avoid detection. Recovery may require pulling data from multiple backup copies. |
Rapid Alerting | The sooner you detect that you are under attack, the sooner you can stop the attack at its source and limit the damage. |
Mount, Don’t Restore | Traditional restoration means copying data from an alternate storage medium, which takes time. |
Practice, Practice, Practice | Ransomware recovery is unlike any other. Find a safe way to “infect” your data center and practice. |
Rapid Ransomware Recovery Step 1: Frequent Protection
Frequent protection may seem the most obvious of the 5 steps to ransomware recovery, yet it is missing from most response strategies. In an ideal ransomware protection scheme, protection events occur every hour, and at the very least every three hours. This frequency creates a challenge for many data protection approaches.
For example, most snapshot technologies, especially VMware’s built-in snapshots, degrade performance significantly once the number of managed snapshots grows beyond a handful. Even dedicated storage systems like all-flash arrays struggle when managing many snapshots. They may perform acceptably but can’t manage a sophisticated retention schedule. The intricacies of snapshot metadata make deleting a snapshot, which is what a retention schedule does, extremely slow. Because of the high metadata overhead, the storage system needs time to “unwind” an intermixed snapshot, and deleting it means updating the metadata for all other snapshots. As a result, snapshots consume far more capacity than they should because the system is so slow to give back the space used by old snapshots.
For these reasons, most organizations can’t tap into the full theoretical potential of ideal snapshot technology and, as a result, must count on backup and recovery solutions that significantly increase costs and slow recovery efforts.
Frequent Protection with VergeOS
VergeOS is different. At the core of VergeOS is global inline deduplication. Because VergeIO started with deduplication instead of bolting it on years after shipping a product, it delivers maximum data efficiency without impacting performance. Our IOclone capability leverages global deduplication to enable the creation of full clones of virtual machine data or even entire data centers in milliseconds. These clones are space efficient and independent of each other. You can have thousands of them without impacting performance. More importantly, you can delete them, even via a sophisticated retention schedule, in seconds, meaning any space they consume is instantly returned to the environment.
Rapid Ransomware Recovery Step 2: Long-Term Retention
Ransomware can take one of two attack vectors. Most commonly, it tries to encrypt every file it can reach as soon as it breaks into the environment. The second attack vector is more sophisticated: it encrypts data slowly to avoid detection. While the second vector is more sinister, most bad actors don’t have the patience to let the malware sit and slowly encrypt for months. They want the money now! Frankly, given how often attacks succeed once the malware payload lands, they don’t have to be sophisticated.
While the second attack vector is less common, it is wise to prepare for it. Long-term and granular data retention is the key to recovering from a slow-crawl ransomware attack. Again, because of performance concerns, snapshots are unsuitable for long-term retention in most cases. Backup software is excellent at long-term retention but, because backups run infrequently, cannot provide much granularity.
Solving the Retention Problem with VergeOS
Once again, VergeOS’ IOclone provides an ideal solution for long-term data retention, providing complete clones which are independent of each other. Retaining thousands of them doesn’t impact performance, and you can maintain as granular a history as you feel necessary. Getting rid of old files is another important step in limiting ransomware damage.
As mentioned, you can develop a sophisticated retention schedule to meet these requirements. For example, you can execute hourly clones and retain each for 24 hours, then add a daily clone retained for seven days, a weekly clone retained for two months, and a monthly clone retained for a year. This type of schedule means deleting many older copies to reclaim space. With traditional snapshot techniques, it would cause significant performance problems and take weeks to return the capacity reserved by those snapshots. IOclone has no performance impact, and reserved capacity is returned almost instantly.
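As a rough illustration of the example schedule above, the following Python sketch models the four tiers and identifies which clones have aged out. The tier names, the `TIERS` table, and both helper functions are hypothetical and are not a VergeOS API; “two months” is approximated as 60 days and a month as 30 days.

```python
from datetime import datetime, timedelta

# Hypothetical tiers: name -> (how often a clone is taken, how long it is kept).
# Assumptions: "two months" ~ 60 days, one month ~ 30 days.
TIERS = {
    "hourly": (timedelta(hours=1), timedelta(hours=24)),
    "daily": (timedelta(days=1), timedelta(days=7)),
    "weekly": (timedelta(weeks=1), timedelta(days=60)),
    "monthly": (timedelta(days=30), timedelta(days=365)),
}


def expired_clones(clones, now):
    """Return the (tier, created_at) clones whose retention period has elapsed."""
    return [
        (tier, created_at)
        for tier, created_at in clones
        if now - created_at > TIERS[tier][1]
    ]


def copies_at_steady_state():
    """Approximate number of clones each tier holds once the schedule is full."""
    return {name: retention // interval for name, (interval, retention) in TIERS.items()}


now = datetime(2024, 1, 31, 12, 0)
clones = [
    ("hourly", now - timedelta(hours=25)),   # older than 24 hours
    ("hourly", now - timedelta(hours=2)),
    ("daily", now - timedelta(days=8)),      # older than 7 days
    ("monthly", now - timedelta(days=200)),
]
to_delete = expired_clones(clones, now)
per_tier = copies_at_steady_state()
print(len(to_delete), per_tier, sum(per_tier.values()))
```

Under these assumptions the schedule holds on the order of fifty copies per protected object at any time and deletes one nearly every hour, which is why deletion cost matters as much as creation cost.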
Rapid Ransomware Recovery Step 3: Rapid Alerting
Knowing you are under attack is a critical part of the 5 Steps to Rapid Ransomware Recovery because it addresses the other part of IT’s concerns, “with minimal data loss.” The sooner you know your environment is under attack, the sooner you can shut down the virtual machine under attack and limit the spread. The early warning also enables IT to better identify which protected copy they should turn to when starting their data recovery.
A few storage systems will provide an alert of a potential ransomware attack. Most of these monitor for an increase in capacity utilization. The problem is that these alerting methods often miss an attack because capacity doesn’t necessarily grow. When malware works through your environment, it typically encrypts one file at a time. During encryption, a file will increase in size, but afterward it will be almost the same size as the unencrypted original. In other words, capacity-based methods will miss the attack. You’d much rather have a false positive than a missed attack.
IOfortify Delivers Reliable Attack Alerting
VergeOS’ IOfortify capability delivers reliable attack alerting by monitoring changes in deduplication ratios instead of changes in capacity utilization, an approach that is far more accurate. Encryption may not increase capacity utilization, but encrypted files look like new files to a deduplication algorithm. During our “Creating a Holistic Ransomware Recovery Strategy” webinar, we demonstrated IOfortify in real time, first identifying and alerting on, then recovering, a virtual machine whose data was actively being encrypted.
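To see why a deduplication ratio reacts when capacity does not, consider the minimal Python sketch below. It is not how IOfortify is implemented. The fixed 4 KiB block size, the SHA-256 block fingerprints, the 20% threshold, and the use of random bytes as a stand-in for encrypted data are all assumptions made for illustration.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def dedup_ratio(data: bytes) -> float:
    """Logical blocks divided by unique blocks; higher means more duplication."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    if not blocks:
        return 1.0
    unique = {hashlib.sha256(block).digest() for block in blocks}
    return len(blocks) / len(unique)


def ratio_dropped(before: float, after: float, threshold: float = 0.2) -> bool:
    """True when the ratio fell by more than `threshold` (a hypothetical 20%)."""
    return (before - after) / before > threshold


# 100 identical blocks stand in for highly duplicated production data.
plain = bytes(range(256)) * (BLOCK_SIZE // 256) * 100
# Random bytes of the same length stand in for the encrypted result.
encrypted = os.urandom(len(plain))

before, after = dedup_ratio(plain), dedup_ratio(encrypted)
print(len(plain) == len(encrypted))  # the amount of data is unchanged
print(ratio_dropped(before, after))  # the dedup ratio is what moves
```

The data occupies the same logical size before and after, so a capacity-based monitor sees nothing, while almost every block fingerprint is new and the ratio collapses.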
Rapid Ransomware Recovery Step 4: Mount, Don’t Restore
Mounting your recovery means pointing directly to your protected copy without having to move data. Restoring means copying the data from where it is back to the production volume, which can take dozens of minutes, if not hours, depending on the size of the volume and bandwidth of the network.
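For a sense of scale, the restore time can be estimated from volume size and network bandwidth. The Python sketch below is an idealized calculation only: the function name, the `efficiency` parameter, and the 2 TB example volume are hypothetical, and real restores are also limited by storage throughput and protocol overhead.

```python
def restore_seconds(volume_bytes: float, link_bits_per_second: float,
                    efficiency: float = 1.0) -> float:
    """Idealized time to copy a volume across a network link.

    `efficiency` (hypothetical knob) is the fraction of link bandwidth
    actually achieved; 1.0 means a perfectly utilized link.
    """
    return volume_bytes * 8 / (link_bits_per_second * efficiency)


TB = 10**12    # decimal terabyte, in bytes
GBPS = 10**9   # gigabit per second, in bits per second

for link in (10 * GBPS, 1 * GBPS):
    minutes = restore_seconds(2 * TB, link) / 60
    print(f"2 TB over {link // GBPS} Gb/s: {minutes:.0f} minutes")
```

Under these assumptions, a 2 TB volume takes roughly 27 minutes over a fully utilized 10 Gb/s link and about 4.4 hours over 1 Gb/s, whereas mounting a protected copy moves no data at all.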
Historically, the problem with directly mounting your recovery volume has been how you maintain those copies. A traditional complete clone consumes too much capacity and takes too long to create to be practical, which violates the steps above. A traditional snapshot still depends on the original volume; promoting it to production may mean a complete copy/restore.
Some backup solutions have an “instant recovery” feature. The problem with this method is that, while you are mounting a volume, you are mounting it from a backup storage target, which typically doesn’t have the performance or availability capabilities of production storage.
IOclone Instant Recovery with No Performance Impact
IOclone enables IT to point directly at a version of the virtual machine or data center before the ransomware attack. It is online instantly, and because of its independence, it does not need to be “rolled back” to production.
Rapid Ransomware Recovery Step 5: Practice
Ransomware recovery is unlike any other, so IT must practice the recovery process. The problem with practice is the risk that the practice attack “leaks” into production.
Virtual Data Centers Make for Perfect Practice
VergeOS’ Virtual Data Center (VDC) capabilities enable IT to create a complete, secure copy of their entire data center and “infect” it with a ransomware simulator or an encryption program. The VDC’s isolation ensures the practice attack doesn’t “leak” into production. Verge.IO even has some customers who put their VDC, with anonymized data, out as a publicly addressable honeypot so they can test their attack response against a real foe.
Conclusion
The 5 Steps to Rapid Ransomware Recovery require preplanning, and they also require better infrastructure software. Because of the “bolt-on” approach to all features and protection capabilities, platforms like VMware can’t provide the same level of protection as VergeOS. The good news is you can transition from VMware to VergeOS seamlessly and at your own pace. You’ll have a more resilient environment and reduce costs by 50% or more. To learn more about using VergeOS as a VMware exit ramp, read our VMware Alternative page. You can also start using VergeOS as a Disaster Recovery solution for VMware, including for ransomware recovery, without migration by using our IOprotect capability.