Maintaining high availability to corporate data stores managed with SQL Server is an essential element of any data-management strategy.
Without the uninterrupted ability to store and retrieve data, business would grind to a halt. Besides its people, data is increasingly the most important asset in any enterprise. And SQL Server 2008 or SQL Server 2008 R2 are frequently at the core of any data-management strategy. So, if you think about it that way, the developers and DBAs are the ones responsible for keeping the business running.
However, how much definitive guidance filters down from the business unit managers to those responsible for the data tier? Are the business requirements communicated clearly? Are they communicated in such a way that technology professionals can translate them into a productive strategy?
In certain market segments, there are stringent regulatory requirements around infrastructure aspects like security auditing, data encryption and data retention. Failure to meet those requirements can lead to the business being fined or censured, plus losing public credibility and future revenue—often the worst thing that can happen in a highly competitive market.
There are certain business requirements that seem easier or more straightforward for business leaders to communicate, like those around security, reporting, workload management and auditing. Fortunately, those are also easier for you to implement within the framework of SQL Server 2008:
However, there are two major business requirements that are often poorly communicated: system downtime and acceptable data loss. These are known as Recovery Time Objective (RTO) and Recovery Point Objective (RPO), respectively. Unfortunately, it’s quite common for business managers to neglect considering RTO and RPO, only to find out that the data tier isn’t protected to the extent they’d like when a disaster occurs, which leads to downtime or data loss.
Whether you’re a business manager or a DBA, take a minute right now to consider if you know for sure that the data tier is protected to the extent the business requires. If you come to the realization that it isn’t, what’s your plan to resolve the situation?
Neither panic nor complacency is an appropriate reaction. Conducting a fire drill to put a strategy in place by next week is a recipe for disaster in itself. It takes care and diligence to design and implement an appropriate and comprehensive SQL Server high-availability and disaster-recovery (HA/DR) strategy. Ignoring the problem invites calamity and is tantamount to business negligence.
The key to designing a successful strategy is first working out the business requirements. Then you have to balance those against the business limitations. This is where IT and business unit leaders have to meet face to face and see things eye to eye. The strategic requirements have to encapsulate the following factors for each portion of data relevant to business operations:
A good way to think about these is to consider the impact on the business of each portion of data being inaccessible or lost. You may be surprised when you think through and quantify the potential ramifications for your customers, your business image and your regulatory controls.
One of the more-common mistakes when designing and implementing an HA/DR strategy is moving ahead with the technical design without first considering the limiting factors. This means either returning to the drawing board—a waste of time and money—or implementing a sub-standard strategy that doesn’t meet the business requirements.
There are many limitations, both technical and non-technical. The overriding factor is usually the budget. More hardware means more power, which means more heat dissipation, which means more air conditioning, which means even more power, which all means more space required and more budget dollars allocated to that physical infrastructure. You must also consider the cost of the hardware, the extra SQL Server and Windows licenses, network bandwidth, and possibly even more personnel to manage the extra systems and the rest of the datacenter.
Once you’re familiar with all the technical limitations, the trick is to arrive at the most effective compromise. You need a prioritized list of the data that’s most important to the business. Given the limitations under which you’re working, you’ll be evaluating the technologies that help you meet the most important business requirements.
It’s essential that you don’t just try to adapt an incumbent technology to meet new business requirements. Don’t jump into or choose a technology without properly evaluating it against your business priorities. It’s always better to put in the effort early on and go through the process properly. You’ll end up with a better strategy that saves you time and money in the long run.
If you find you can’t meet the business requirements with the technologies you can afford, you’ll need to work with the business unit leaders to change those requirements to reflect the budgetary realities. As a technologist, for example, there’s no point agreeing to a business requirement of zero data loss if there are insufficient funds in the budget for synchronous technology. When a disaster occurs, the business managers’ expectations will not be met and fault for the situation will fall to the IT staff.
One of the hardest things to do when designing your HA/DR strategy is often ensuring that it forms a harmonious component of your organization’s overall IT strategy. For instance, if you’re the DBA in a large corporation, there are likely other teams responsible for the Windows servers, the network, the storage, the building infrastructure and so on. If the business requires a particular database to be available within four hours of any disaster, you may need to involve all these teams to make sure that can happen. This brings inter-team politics and relationships into the mix. The other teams must agree to meet the same service-level agreements as the data-tier team, and the expectations of and promises to the overall business.
If you feel your data tier is not adequately protected, it’s likely that the HA/DR strategy isn’t being properly tested in your business, either. It’s imperative when you go through the process of designing and implementing an HA/DR strategy that you actually test the system so it can respond in the event of a crisis.
This is easier said than done, though. Convincing the business managers to conduct a test that may result in downtime is a challenge. You could present the argument that it’s better to conduct a test when everyone is on-site, expecting a “disaster,” and ready to step in and quickly fix any problems. The alternative may be to discover a flawed design when a disaster occurs at 2 a.m. on a company holiday when there’s only a skeleton staff on hand.
Even though you may discover that your data tier isn’t protected to an acceptable degree from downtime and data loss, there are plenty of options for implementing an HA/DR strategy using SQL Server 2008 or SQL Server 2008 R2. Understand the various technologies and their trade-offs, and examine the architectures that other companies have successfully deployed. Check out the following white papers and blog posts for more information:
Make sure you work toward putting a valid strategy in place. It’s the only way to protect your business and avoid unexpected downtime.