Need an Air Gap? Call a Plumber

Recommendations for a Secure Storage Project

I have spent the last several years of my career attempting to alert my fellow sumos, partners and customers about the growing cybersecurity threat known as ransomware (see previous post here). Most people (certainly in the cybersecurity and information technology space) now understand the threat better, and some have even taken precautions, updating their security appliances, educating their employees and re-evaluating their backup systems.

I have also waxed poetic that an organization’s backup systems could be cause for concern, since criminals often seek out and destroy an org’s backup data first to eliminate any possibility of recovery, thus increasing the likelihood that the victim will have to pay a ransom, typically in a hard-to-trace cryptocurrency such as bitcoin.

Recognizing this risk, some organizations have even gone so far as to evaluate the need for a supplemental, redundant backup system that would serve as a sort of “secure enclave”, should the primary data and backup systems also be compromised.

While this is the right idea, I have noticed some troubling trends regarding strategies for improving cybersecurity posture with a separate, redundant backup system. I have also noticed a general misunderstanding regarding terms and nomenclature for this wild west of cybersecurity. This post is designed to explain the most important terms to know, what to look for in a secure, tertiary storage environment and some recommendations for evaluation criteria.

Talk Like an Expert: The Terms to Know

Air Gap

An Air Gap. Useful for passing a health inspection.

An air gap is a plumbing term, referring to the unobstructed vertical space between the water outlet (like a faucet) and the flood level of a fixture (like a kitchen sink). This provides backflow safety, which protects the water source from contamination. “Air gap” is NOT a useful term regarding secure data access and storage.

Unless the data is written to removable media (like tape or CD) and kept offline on a shelf, there is no air gap. Furthermore, a true air-gapped computer system also wouldn’t be very useful for recovery, testing or patch management. Any product that is network-attached cannot be “air-gapped”, and even vendors that have adopted this term are still careful to call it an “operational” air gap, which acknowledges that it is not a true air gap. Remember, if you need an air gap, call a plumber. Typically, when customers refer to an air gap solution, they actually mean a strong set of security features that I will describe below in greater detail. So what terms should we be using?

Immutable


FlashArray Storage Snapshots

Now we’re getting somewhere! Immutable storage means that the data cannot be altered, updated or changed in any way. Storage snapshots are a superb example of immutable data storage. Snapshots create a frozen copy of data that is impervious to change. This typically provides a DVR-like capability to revert to a point in time before ransomware encrypted the data. Please note that network storage appliances that are not protected with snapshots are NOT immutable: conceivably, the files could be opened and changed. More importantly, even immutable storage can still be destroyed and eradicated, which is where WO/RM comes in⇣
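To make the “DVR” idea concrete, here is a minimal sketch of point-in-time recovery: keep a series of read-only copies and rewind to the newest one taken before the infection. The `SnapshotHistory` class and its method names are hypothetical, invented for illustration, and do not represent how any storage product implements snapshots.

```python
from bisect import bisect_left
from datetime import datetime
from types import MappingProxyType


class SnapshotHistory:
    """Toy point-in-time history (hypothetical): each snapshot is a read-only copy."""

    def __init__(self):
        self._times = []      # snapshot timestamps, kept in ascending order
        self._snapshots = []  # read-only views, parallel to self._times

    def take(self, when: datetime, files: dict) -> None:
        """Freeze a copy of `files`; later changes to `files` do not affect it."""
        if self._times and when <= self._times[-1]:
            raise ValueError("snapshots must be taken in chronological order")
        self._times.append(when)
        self._snapshots.append(MappingProxyType(dict(files)))

    def latest_before(self, when: datetime):
        """Return the newest snapshot taken strictly before `when`."""
        index = bisect_left(self._times, when)
        if index == 0:
            raise LookupError("no snapshot predates the requested time")
        return self._snapshots[index - 1]


history = SnapshotHistory()
files = {"orders.db": "clean data"}
history.take(datetime(2020, 1, 1, 8), files)

files["orders.db"] = "ENCRYPTED"              # ransomware strikes around 10:00
history.take(datetime(2020, 1, 1, 12), files)

# Rewind to the last snapshot taken before the infection.
clean = history.latest_before(datetime(2020, 1, 1, 10))
print(clean["orders.db"])
```

Note what the sketch does not protect against: nothing stops someone from discarding the whole `history` object, which is exactly the gap between immutable and indestructible.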

Write Once, Read Many (WO/RM)

WO/RM storage describes an indestructible quality that means data cannot be overwritten, deleted or removed by any user, even a mighty administrator. WO/RM storage has been available for decades, dating back to ROM (read-only memory). Tape systems also offered WO/RM varieties, and many of you fondly remember CD-R and DVD-R, which allowed for a single write operation, then prevented any further overwrites to that optical disk. Now that network-attached storage is preferred for backup applications, WO/RM is not a standard feature on most appliances, but make no mistake: WO/RM is a CRITICAL feature to demand in any highly secure application, such as a secure, hacker-proof storage environment.

Legal Hold

Legal hold is a notification sent from an organization’s legal team to an IT team (and probably relevant employees), instructing them not to delete electronically stored information. Similar to WO/RM, legal hold requires that data be preserved in a tamper-proof and indestructible way. Legal hold differs from WO/RM, in that a legal hold request typically requires the preservation of data to be applied retroactively. To guarantee extended retention of data, legal hold must be applied to an individual’s or organization’s data, often for an indefinite period of time.
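The difference between WO/RM retention and legal hold is easier to see in a small model. The sketch below is purely illustrative: `WormStore` and its methods are hypothetical names, and a real appliance enforces these rules in its own firmware or file system rather than in application code. Notice that there is no administrator override anywhere in the class.

```python
from datetime import datetime, timedelta


class WormStore:
    """Toy write-once store (hypothetical): no overwrites, and no deletes until
    the retention period has ended and no legal hold applies."""

    def __init__(self):
        self._objects = {}         # name -> (data, retain_until)
        self._legal_holds = set()  # names currently under legal hold

    def put(self, name: str, data: bytes, retain_days: int, now: datetime) -> None:
        if name in self._objects:
            raise PermissionError(f"{name} already written; overwrite refused")
        self._objects[name] = (bytes(data), now + timedelta(days=retain_days))

    def get(self, name: str) -> bytes:
        return self._objects[name][0]

    def place_legal_hold(self, name: str) -> None:
        # Applied retroactively, to data that already exists.
        if name not in self._objects:
            raise KeyError(name)
        self._legal_holds.add(name)

    def release_legal_hold(self, name: str) -> None:
        self._legal_holds.discard(name)

    def delete(self, name: str, now: datetime) -> None:
        _, retain_until = self._objects[name]
        if name in self._legal_holds:
            raise PermissionError(f"{name} is under legal hold")
        if now < retain_until:
            raise PermissionError(f"{name} is retained until {retain_until:%Y-%m-%d}")
        del self._objects[name]
```

Retention is fixed at write time and expires on its own schedule; a legal hold is added after the fact and blocks deletion indefinitely until it is explicitly released.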

Multifactor Authentication (MFA)

MFA is an authentication system that requires more than one distinct authentication factor for successful authentication, typically to gain access to a secure management system. MFA can be implemented through an authentication platform (such as Okta or Duo), or can be implemented with local credentials, and authenticated through an SMS text code, or using a popular authentication app, such as Google’s “Authenticator”. MFA is quite possibly the single most important safeguard against unwanted access to critical applications and security systems.
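For the curious, the six-digit codes shown by authenticator apps are commonly time-based one-time passwords (TOTP, standardized in RFC 6238). The sketch below computes one using only the Python standard library. It takes the shared secret as raw bytes, which is a simplifying assumption; apps usually exchange the secret Base32-encoded, and a production system should use a vetted library rather than this illustration.

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, at_time=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password per RFC 6238, using HMAC-SHA1."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time) // step  # number of whole time steps since the epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F      # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, at_time=None, step: int = 30) -> bool:
    """Check the second factor with a constant-time comparison."""
    return hmac.compare_digest(totp(secret, at_time, step), submitted)
```

Because the code depends on both the shared secret and the current time, a stolen password alone is not enough to log in, and a captured code goes stale within seconds.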

Now that we know the game, let’s play. Below are suggested starter evaluation criteria for anyone evaluating a secure storage solution ⇣

Example Evaluation Criteria

Recommendations

We recommend that customers looking for secure storage solutions for protection against ransomware research both modern on-premises and cloud-based storage systems that incorporate a combination of immutable architecture and WO/RM (write once, read many) technology. Together, these indestructible features create a bedrock of data that cannot be compromised by external threats such as ransomware, or even internal threats such as sabotage.

These devices absolutely must also use strong access controls with multifactor authentication (MFA), preferably with separate credentials from the primary domain (such as Microsoft Active Directory). Ideally, select a system that provides local user authentication with multifactor support.

In this ultra-secure application, we want as much risk isolation as possible, which includes separation of hardware and software development cycles. Another recommendation is to consider only products with entirely different hardware and software from what is currently used for primary storage and backup. One of our customers told us that, during a routine service event, their current vendor’s service technician accidentally reformatted the wrong backup storage system, causing data loss and an outage. While this was unintentional, this event caused the customer to research secure storage for protection against risks like ransomware and sabotage, and the customer evaluated only storage products and vendors that were different from their current provider, in order to minimize the risk of exposure to their secure environment.

Finally, we strongly recommend evaluating only those systems that provide high enough performance to meet more demanding SLAs for recovery. Ransomware typically inflicts maximum pain by encrypting as much data as possible, which would require quick recovery of potentially all your data. Pre-ransomware-era backup and storage technologies are typically based on slower, low-cost components. We recommend all-flash technologies that can perform at scale and allow for easy testing. A system that can perform a near-instantaneous restore of all data can also cover the contingency that even the primary storage is unavailable, and thus run indefinitely until the compromised storage is back online.
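A quick back-of-the-envelope calculation shows why throughput matters when everything has to come back at once. The function below is a rough sketch; the data size and throughput figures in the example are hypothetical, and real restores are also limited by the network, the target storage and the backup software itself.

```python
def restore_hours(data_tb: float, throughput_gb_per_s: float) -> float:
    """Hours needed to move `data_tb` terabytes at a sustained `throughput_gb_per_s`."""
    if data_tb < 0 or throughput_gb_per_s <= 0:
        raise ValueError("data size must be >= 0 and throughput must be > 0")
    seconds = (data_tb * 1000) / throughput_gb_per_s  # 1 TB = 1000 GB
    return seconds / 3600


# Hypothetical example: 500 TB to restore on two different systems.
for label, gb_per_s in [("disk-based appliance", 0.5), ("all-flash system", 10.0)]:
    hours = restore_hours(500, gb_per_s)
    print(f"{label}: {hours:.1f} hours ({hours / 24:.1f} days)")
```

With these made-up numbers, the slower system needs roughly 278 hours (about 11.6 days) while the faster one needs about 14 hours. Plug in your own figures and compare the result against your recovery SLA.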

Better yet, start to demand these secure storage features in your primary storage and backup systems, and reduce the need to rely on such ultra-secure redundant devices in the first place.

The Challenge of Ransomware Demands a “Cohesive” Approach

Finding out your data center has been infected by ransomware is sort of like finding out your mother has been dating Mötley Crüe rock drummer Tommy Lee: you know some terrible things have already happened and you’re going to have a mess to clean up. It might seem like something that happens only to other people, but statistically it is likely to happen to you. StorageSumo is here to help.

The problem is so much worse than the average consumer knows. Let’s peel back some layers of this stinky onion and discuss how modern backups with Cohesity can give your IT staff an unfair advantage against cyberthreats like ransomware.

I’m going to address some basics including:

  • What ransomware is
  • What the true cost is to you and your organization
  • How ransomware enters your datacenter
  • How to prevent, detect and recover from a ransomware attack.

Sadly, very little is known about this particularly insidious form of cybercrime. I suspect this is mostly because organizations are highly incentivized to minimize bad publicity, so the majority of incidents go unknown by the general population. Organizations also generally feel like their current backups act as an insurance policy that will effectively recover from a ransomware attack, so “we’re good.” Human nature is to avoid the most unpleasant aspects of life, no matter how likely.

What is Ransomware?

The typical answer goes something like this: Ransomware is an especially sinister strain of malware. Simply put, once your system is infected, ransomware holds your data hostage by encrypting the files, rendering them illegible and unusable until a ransom is paid. While this is probably the answer you were expecting, it is not the accurate answer.

The correct answer? Ransomware is a business. Ransomware was not designed by anarchists for the purpose of sabotage. Data destruction is not the end goal. The business of ransomware is to be lucrative, which means getting customers to pay the ransom. To achieve this, an effective ransomware attack must make the payment as user-friendly as possible and also eliminate all possibility of recovery without paying. Let’s take a look at the payment instructions from an example of ransomware called “WannaCry”:

Notice the simplicity, very similar to other modern software designs: the local language, the clarity of the instructions and the helpful links to find more info. Cybercriminals want you to pay the ransom, and the easier they make it to pay, the better. This also means the ransom is typically affordable. It has to make actual sense to pay up. When the ransom is paid, victims will get most of their data back most of the time but, as you are about to see, the process and pain involved in recovering from ransomware go far beyond the amount of the ransom.

The True Costs of Ransomware

Gather round, it’s sumo story time. The full cost of a ransomware attack is not easy to calculate.  In one particular instance, I had a retail customer that was the victim of phishing (which we will discuss further). As a result, approximately 200 Windows servers were encrypted, including the backup server. This nightmare began when the backups IT thought would save them from cyberthreats actually became a liability.

The lost data meant that business systems were offline and files were illegible. With no access to critical systems, the employees were sent home. The company was no longer able to accept new orders or fulfill existing orders, and lost all access to the customer records system. This lasted over 5 days, which meant customers were forced to pivot to alternate suppliers.

Even the decision to pay the ransom is a pain point. The FBI encourages victims not to pay ransom, which would ideally discourage future criminals. Most would agree with this stance. After all, no one wants to reward a cybercriminal that successfully attacked their organization, and there is no guarantee that victims would actually recover their data. This is a noble thought, but remember that ransomware is a business, and the requested ransom typically ranges from a few hundred dollars to several thousand dollars. These amounts are low enough that, in most cases, it actually makes good business sense to pay the ransom. Also, WannaCry is a relatively unsophisticated variant compared to newer strains that are actually starting to threaten to leak customers’ sensitive data.

In our example, the pain was so severe that this organization easily decided to pay the ransom ($17K), which represented only a few hours of lost profits. After an uncomfortable discussion with the company’s leadership and finance team, another unforeseen obstacle was the payment currency. Cybercriminals don’t accept purchase orders and they don’t offer net-30-day payment terms. This organization did not have a corporate Bitcoin wallet. Eventually a third-party consultant was engaged to pay the ransom and recover the data, but this delay cost the company valuable time.

Surprisingly, most of the time customers do get the decryption keys to “unlock” their data once the ransom is paid. Again, it’s a business. If no one ever got their data back, ransomware would not be effective. Unfortunately, this particular customer was emailed a spreadsheet with 200 unmarked decryption keys. Imagine being handed a bag of 200 unmarked door keys. There are 200 doors, each with a matching key, and your job is to match all 200 doors with keys. As you can imagine, this would take some time, even with a large staff and a shared Google Sheet. Also remember that there’s a time bomb strapped to the data. After the first three days, the price goes up, but after seven days, it’s gone forever. This fun little detail is just another tactic to remind victims to pay, and to do it quickly:

Even when a matching decryption key was found, the IT staff noticed that many of the recovered servers would crash midway through the decryption process. The encrypted files were not decrypted “in-place,” but rather copied, which doubled the data capacity consumed. Servers below 50% disk utilization decrypted fine, but many were above 50% and required a game of musical chairs with the underlying storage system in order to accommodate the unexpected increase in disk usage.
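The 50% rule of thumb from this story is simple to check before starting a decryption run. Here is a minimal sketch, assuming the decryptor writes a full second copy of every encrypted file; the function names are hypothetical, and only `shutil.disk_usage` comes from the Python standard library.

```python
import shutil


def has_room_for_decrypted_copy(used_bytes: int, free_bytes: int,
                                encrypted_bytes=None) -> bool:
    """True if the free space can hold a second copy of the encrypted data."""
    if encrypted_bytes is None:
        # Worst case: assume everything on the volume was encrypted.
        encrypted_bytes = used_bytes
    return free_bytes >= encrypted_bytes


def volume_can_decrypt(path: str) -> bool:
    """Apply the check to the volume that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return has_room_for_decrypted_copy(usage.used, usage.free)


print(has_room_for_decrypted_copy(used_bytes=40, free_bytes=60))  # under 50% full
print(has_room_for_decrypted_copy(used_bytes=60, free_bytes=40))  # over 50% full
```

Running a check like this across all affected servers first would show which ones need extra capacity before anyone spends time hunting for their keys.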

In the end, approximately 7% of the servers were unable to be recovered because keys were not found, there were disk capacity issues, or the customer simply ran out of time.

By the time most of the systems were eventually recovered, the company was no longer the supplier for its previous customers. The loss of revenue and productivity was obvious, but the organization couldn’t foresee the lasting loss of credibility.

Employee morale was also noticeably lower. The IT staff had simply lost all credibility with their peers. The workforce had assumed that IT was protecting them from such an incident. A reputation is a fickle thing. Think about giving your money to Bernie Madoff for a new investment. That sounds crazy, but is it any more irrational than trusting your data with a staff that just lost it?

How Ransomware Enters Your Data Center

The entry point can vary, but there are two primary sources: 1) user action and 2) system vulnerabilities. Until the Borg entirely lobotomizes all humans, ransomware will remain a source of pain. An infected email attachment or link is the most frequent source, so a good email scanning/filtering system is essential, but real people will still show up with their own infected devices, they will still plug in USB flash drives loaded with malware, and they will still click on infected attachments.

Forrester also says approximately 18% of attacks come from phishing. This technique involves social engineering, which tricks users into thinking they should enter/change their password or account info. This is really a challenge of educating your users, but this brings us to an important first point – no filtration system will prevent 100% of incidents because humans are a vulnerability. Cybercriminals can target users with pinpoint precision and leverage a user’s own access and knowledge to infiltrate a network. This strategic, intentional, customized version of phishing even has a sub-category, aptly called spear-phishing. No matter how impenetrable your firewall may seem, there is still a human element involved.

The other source of ransomware is through system vulnerabilities. Most are software-based vulnerabilities such as RDP (remote desktop protocol) that live within a popular operating system, such as Windows. Brute-force attacks have successfully targeted Windows desktops and servers with RDP enabled, allowing a cybercriminal to have full control of a system on your network. Occasionally there are hardware vulnerabilities exposed such as the Intel processor-based Spectre/Meltdown just to keep things spicy. 
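One simple audit step is to find out which of your own machines accept connections on the default RDP port, TCP 3389. The sketch below is a generic TCP reachability check with a hypothetical function name; an open port does not by itself mean a system is vulnerable, and you should only probe hosts you are authorized to test.

```python
import socket


def is_tcp_port_open(host: str, port: int = 3389, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name lookup failed
        return False


if __name__ == "__main__":
    # Check this machine only; swap in hosts from your own inventory.
    print("RDP reachable on localhost:", is_tcp_port_open("127.0.0.1"))
```

Any host that answers from outside your network perimeter is a candidate for a VPN, a gateway, or simply turning RDP off.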

Another software vulnerability is SMB (Windows server message block, aka “CIFS”), which enables ransomware to encrypt shared files as well as spread throughout the network like a virus, encrypting more desktops and servers. This means corporate network file shares are fish in a barrel for ransomware, both because the attack surface is enormous and because it is especially difficult to enforce access controls against an attack.

Preventing Ransomware – Modern Threats Require Modern Backups

Any good ransomware plan needs to start with a strong backup strategy. Firewalls are great but, as the last line of defense, your backups might be the only thing standing between you and the torture chamber described thus far. Unfortunately, most IT organizations rely on backup systems that are just as vulnerable as the rest of the servers. 

Thieves know that backups are an organization’s only chance, so backups are the first thing cybercriminals target. Most backup applications are built on Windows servers, which are vulnerable to the same open liabilities that allow hackers to access any other systems. Backups also typically use network-attached storage (NAS) for repository/backup disk capacity, which, as previously discussed, is a tasty target for criminals to inflict their encryption pain.

These two components of most backup solutions (Windows operating systems and network file shares) are no longer protecting customers, but rather have turned into liabilities that leave organizations exposed to being attacked on the very systems they count on to save them from these types of threats.

Furthermore, these types of older backup solutions are designed to restore a single item. In the case of ransomware, it is likely that EVERYTHING needs to be restored. This would typically take weeks, assuming the backups were not also encrypted. Also, what defense has been implemented to prevent an immediate repeat incident?

Cohesity Offers a New Approach

In this age of prolific widespread ransomware attacks, organizations need a new type of backup architecture to address this new form of modern threat. A new type of data protection that is designed to address ransomware would:

  • Provide air-gapped immutability from corruption
  • Detect the likelihood of ransomware and alert users
  • Offer large-scale instant recovery from incidents

Designed in the modern era, Cohesity was specifically architected to provide secure protection from modern cyberthreats such as ransomware. Trigger-warning: things are about to get nerdy.

First, Cohesity prevents ransomware. Cohesity is a hyper-converged platform, which means the compute, storage and software are tightly coupled in a “node” architecture that scales out. The fact that storage is entirely integrated means there are some magical automated protections built right in that create immutability.

After Cohesity creates a backup, the file system (SpanFS) immediately creates a “snapshot.” This snapshot copy is kept offline and NEVER exposed back to the network. Even when backup data needs to be accessed, Cohesity creates a “clone” of the snapshot and uses that copy rather than the original, just in case a crafty hacker is waiting to attack. The important point to remember is there is always a gold copy kept securely offline. This process happens completely automatically and illustrates a fundamental advantage of having vertically integrated software and hardware.

Because the software is tightly integrated, Cohesity has controls in place, including multi-factor authentication, to block unauthorized access.

In a worst-case scenario, human error or a particularly devious act of sabotage could result in deleted backups. That is why Cohesity developed DataLock™, a feature that makes data non-deletable, even by super-users, until the predefined retention period expires.

Second, Cohesity has an AI-based detection feature that scans for anomalies, such as unusual change rates, encryption rates and mass deletes. This creates an entropy “score.” If it is determined that a customer’s data has possibly been victimized by ransomware, Cohesity alerts customers so that they can take action immediately, before more business systems are affected.
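To give a feel for what an entropy score measures, here is a minimal sketch of Shannon entropy over the bytes of a file. This is a generic textbook technique and not Cohesity’s actual detection logic, which is not described here; the function names and the 7.5 threshold are hypothetical choices for illustration.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Bits of entropy per byte: 0.0 for one repeated value, 8.0 for uniform noise."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Flag data whose byte distribution is close to random (hypothetical threshold)."""
    return shannon_entropy(data) >= threshold


plain = b"the quick brown fox jumps over the lazy dog " * 200
random_like = os.urandom(65536)  # stands in for encrypted file contents

print(f"plain text : {shannon_entropy(plain):.2f} bits/byte")
print(f"random-like: {shannon_entropy(random_like):.2f} bits/byte")
```

Encrypted data looks statistically random, so a sudden jump in entropy across many files between backups is a useful warning sign. Compressed files such as ZIP archives or videos also score high, which is why a signal like this is best combined with others, such as change rates and mass deletes.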

Third, Cohesity can recover from a ransomware incident with a feature aptly named Instant Mass Restore. This is crucial because in the unfortunate event of a ransomware attack, it’s likely most if not all of the organization’s systems were affected, and thus all need to be restored to a point before ransomware encrypted the files. Instant Mass Restore allows your organization to instantly recover all your servers, databases and files to a granular restore point just before ransomware infected your data center.

Conclusion

Cybersecurity has traditionally been network-based. Network security is important, but networks can only be babyproofed so much without severely degrading the user experience or obstructing productivity. A more modern approach can allow for easy protection from ransomware without locking down the user.

What if your organization was hit by ransomware but didn’t have to worry because your IT team could instantly restore EVERYTHING from an immutable copy? Network security will never prevent 100% of threats. It’s time for the backup and network teams to get more cohesive (insert wink emoji-face) with Cohesity.

Chris Colotti (@ccolotti) giving an in-depth tour of the CohesityonWheels Unstoppable Truck Roadshow