With the advance of cloud technology, companies today are looking at a modern approach to Disaster Recovery (DR) that leverages the flexibility of cloud Infrastructure as a Service (IaaS) models. There are four key elements to implementing an effective DR strategy. Over the next four weeks, I will publish a post on each element, highlighting why it matters and how to approach it now that cloud has become a viable option for companies.
Start with a Business Impact Analysis (BIA)
For those not familiar with this term, this is simply the process of working with the various application owners and assessing a) how long a particular application can afford to be down (aka Recovery Time Objective, or RTO) and b) how frequently you need to back up a given dataset, which determines how much data you are comfortable losing in the event of an issue (aka Recovery Point Objective, or RPO). As a general rule, the lower these two figures are as measured in time, the more expensive the solution is likely to be. Cristian from Veeam does a great job in this blog of helping define this further.
Performing the BIA
This sounds easy in principle. Just assign a project lead to interview each department about each application and ask these two questions: 1) How long can you be down? 2) How much data can you afford to lose or recreate? Without experience in leading this discussion and truly forcing the line-of-business owners to consider these answers in terms of dollars lost over time, you are bound to hear the same answer from every group: "We cannot afford to be down, or lose any data, period." Performing a proper BIA is essential to right-size your eventual DR approach. This video snippet here will show how you might build this out internally with the help of an instructional firm. But do you really want to invest in internal training for a one-shot deal? I suggest hiring a consultant to perform this.
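To make the "dollars lost over time" framing concrete, here is a minimal sketch of how an interviewer might turn the two BIA answers into a rough cost figure. The function name, the cost model (lost revenue during downtime plus the cost of recreating lost data), and every number below are assumptions for illustration only, not a standard formula.

```python
def outage_cost(revenue_per_hour, rto_hours, rework_cost_per_hour, rpo_hours):
    """Rough cost of one outage, under an assumed linear cost model.

    revenue_per_hour     -- hypothetical revenue lost for each hour of downtime
    rto_hours            -- how long the application is down (the RTO)
    rework_cost_per_hour -- hypothetical cost to recreate one hour of lost data
    rpo_hours            -- how much data, measured in time, is lost (the RPO)
    """
    downtime_cost = revenue_per_hour * rto_hours
    data_loss_cost = rework_cost_per_hour * rpo_hours
    return downtime_cost + data_loss_cost


# Hypothetical order-entry system: 4 hours down, last backup 24 hours old.
relaxed = outage_cost(5000, 4, 800, 24)
# Same system with a tighter (and more expensive to provide) 1-hour RTO and RPO.
tight = outage_cost(5000, 1, 800, 1)

print(f"4h RTO / 24h RPO: ${relaxed:,}")
print(f"1h RTO / 1h RPO:  ${tight:,}")
```

Putting a figure like this next to the price of the DR solution gives each application owner something to weigh, rather than defaulting to "zero downtime, zero data loss."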
What are the logical RPO/RTO Categories in today's Cloud Era?
This is where I feel the BIA has become more important than ever. Cloud companies have the ability to spin up servers in hours or even minutes. So do you really need everything you have in production waiting for you in the event of a disaster? A great friend of mine from Louisiana, who had well over 100 self-taught cliches, once told me, "Don't build the church for Easter Sunday." That principle is very true of cloud-based DR. So I think your true goal is to sort your applications into 3 categories. 1) ZERO RTO - These are the applications that you absolutely cannot afford to have down. These need high-availability servers locally, and tools from our partners such as Zerto (hence the name). 2) 1-4 hour RTO - In this category, simply having a minimal cloud instance on standby makes sense. 3) Beyond 4 hour RTO - In this case, buy the capacity when and if you need it.
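As a rough sketch, the three categories above can be expressed as a simple lookup on the RTO that came out of the BIA. The tier labels, the workload names, and their RTO values are hypothetical. The categories as written leave a gap between zero and one hour, so this sketch assumes anything above zero and up to four hours belongs in the standby tier.

```python
def dr_tier(rto_hours):
    """Map an RTO (in hours) from the BIA to one of three DR categories."""
    if rto_hours < 0:
        raise ValueError("RTO cannot be negative")
    if rto_hours == 0:
        return "zero-rto"        # high availability plus replication tooling
    if rto_hours <= 4:
        return "cloud-standby"   # minimal cloud instance kept on standby (assumed to cover 0-4h)
    return "on-demand"           # buy capacity only when a disaster occurs


# Hypothetical BIA results: application name -> agreed RTO in hours.
bia_results = {
    "order-entry": 0,
    "email": 2,
    "reporting": 24,
}

plan = {app: dr_tier(rto) for app, rto in bia_results.items()}

for app, tier in plan.items():
    print(f"{app}: {tier}")
```

Keeping the thresholds in one small function makes it easy to rerun the categorization when an application owner revises an RTO.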
Once you have categorized your workloads, the next step is to build out a Disaster Recovery Plan. We will talk about that in part 2 next week.