Capability: Data Protection and Recovery

Introduction

Data Protection and Recovery is the fourth Core Infrastructure Optimization capability. The following table lists the high-level challenges, applicable solutions, and benefits of moving to the Standardized level in Data Protection and Recovery.

Challenges

Business Challenges

  • No standard data management policy, which creates isolated islands of data throughout the network on file shares, nonstandard servers, personal profiles, Web sites, and local PCs

  • Poor or nonexistent archiving and backup services make achieving regulatory compliance difficult

  • Lack of a disaster recovery plan could result in loss of data and critical systems

IT Challenges

  • Hardware failure or corruption equates to catastrophic data loss

  • Server administration is expensive

  • IT lacks tools for backup and restore management

Solutions

Projects

  • Implement backup and restore solutions for critical servers

  • Consolidate and migrate file and print servers to simplify backup and restoration

  • Deploy data protection tools for critical servers

Benefits

Business Benefits

  • Effective data management strategy drives stability in the organization and improves productivity

  • Standards for data management enable policy enforcement and define SLAs, improving the business relationship to IT

  • Strategic approach to data management enables better data recovery procedures, supporting the business with a robust platform

  • Organization is closer to implementing regulatory compliance

IT Benefits

  • Mission-critical application data are kept in a safe place outside of the IT location

  • Basic policies have been established to guarantee access to physical media (tapes, optical devices) when necessary

The Standardized Level in the Infrastructure Optimization Model addresses key areas of Data Protection and Recovery, including Defined Backup and Restore Services for Critical Servers. It requires that your organization has procedures and tools in place to manage backup and recovery of data on critical servers.

Requirement: Defined Backup and Restore Services for Critical Servers

Audience

You should read this section if you do not have a backup and restore solution for 80 percent or more of your critical servers.

Overview

Backup and recovery technologies provide a cornerstone of data protection strategies that help organizations meet their requirements for data availability and accessibility. Storing, restoring, and recovering data are key storage management operational activities surrounding one of the most important business assets: corporate data.

Data centers can use redundant components and fault tolerance technologies (such as server clustering, software mirroring, and hardware mirroring) to replicate crucial data to ensure high availability. However, these technologies alone cannot solve issues caused by data corruption or deletion, which can occur due to application bugs, viruses, security breaches, or user errors.

There may also be a requirement for retaining information in an archival form, such as for industry or legal auditing reasons; this requirement may extend to transactional data, documents, and collaborative information such as e-mail. Therefore, it is necessary to have a data protection strategy that includes a comprehensive backup and recovery scheme to protect data from any kind of unplanned outage or disaster, or to meet industry requirements for data retention.

The following guidance is based on the Windows Server System Reference Architecture implementation guides for Backup and Recovery Services.

Phase 1: Assess

The Assess Phase examines the business need for backup and recovery and takes inventory of the current backup and recovery processes in place. Backup activities ensure that data are stored properly and available for both restore and recovery, according to business requirements. The design of backup and recovery solutions needs to take into account business requirements of the organization as well as its operational environment.

Phase 2: Identify

The goal of the Identify Phase of your backup and recovery solution is to identify the targeted data repositories and prioritize the data by how critical it is. Critical data should be defined as data required to keep the business running and to comply with applicable laws or regulations. Any backup and recovery solutions that are deployed must be predictable, reliable, and capable of complying with regulations and processing data as quickly as possible.

Challenges that you must address in managing data include:

  • Managing growth in the volumes of data.

  • Managing storage infrastructure to improve the quality of service (QoS) as defined by service level agreements (SLAs), while reducing complexity and controlling costs.

  • Integrating applications with storage and data management requirements.

  • Operating within short, or nonexistent, data backup windows.

  • Supporting existing IT systems that cannot run the latest technologies.

  • Managing islands of technology that have decentralized administration.

  • Assessing data value so that the most appropriate strategies can be applied to each type of data.

While backing up and restoring all organizational data is important, this topic addresses the backup and restore policies and procedures you must implement for critical services to successfully move from a Basic level to a Standardized level.

Phase 3: Evaluate and Plan

In the Evaluate and Plan Phase, you should take into account several data points to determine the appropriate backup and recovery solution for your organization. These requirements can include:

  • How much data to store.

  • Projected data growth.

  • Backup and restore performance.

  • Database backup and restore needs.

  • E-mail backup requirements.

  • Tables for backups and restores.

  • Data archiving (off-site storage) requirements.

  • Identification of constraints.

  • Selection and acquisition of storage infrastructure components.

  • Storage monitoring and management plan.

  • Testing the backup strategy.

See Microsoft Operations Framework Storage Management for more details.
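The first few planning inputs (data volume, projected growth, and backup performance) can be combined into a rough capacity estimate. The following is a minimal sketch, assuming compound annual growth and a retention scheme that keeps a fixed number of full backups. The function names, the 20 percent growth rate, and the retention count are illustrative assumptions, not values from this guidance.

```python
def projected_size_gb(current_gb: float, annual_growth: float, years: int) -> float:
    """Project data volume, assuming compound annual growth."""
    return current_gb * (1 + annual_growth) ** years


def backup_capacity_gb(data_gb: float, full_copies_retained: int) -> float:
    """Estimate backup storage if every retained copy is a full backup."""
    return data_gb * full_copies_retained


def required_throughput_gb_per_hour(data_gb: float, window_hours: float) -> float:
    """Throughput needed to finish a full backup inside the backup window."""
    if window_hours <= 0:
        raise ValueError("backup window must be positive")
    return data_gb / window_hours


if __name__ == "__main__":
    # Hypothetical inputs: 500 GB today, 20% yearly growth, 3-year horizon.
    size = projected_size_gb(500, 0.20, 3)
    print(f"Projected data volume: {size:.0f} GB")
    print(f"Capacity for 4 full copies: {backup_capacity_gb(size, 4):.0f} GB")
    print(f"Throughput for an 8-hour window: "
          f"{required_throughput_gb_per_hour(size, 8):.0f} GB/h")
```

An estimate like this is only a starting point; incremental or differential backups and compression would reduce the capacity figure.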

Backup Plan

In developing a backup and recovery plan for critical servers you need to consider these factors:

  • Backup mode

  • Backup type

  • Backup topology

  • Service plan

Microsoft’s Data Protection Manager (DPM) is a server software application that enables disk-based data protection and recovery for file servers in your network. The DPM Planning and Deployment Guide contains a wealth of information on setting up a backup and recovery plan.

Backup Modes

The backup mode determines how the backup is carried out in relation to the data that is being backed up. There are two ways in which data backups can take place:

  • Online Backups. Backups are made while data is still accessible to users.

  • Offline Backups. Backups are made of data that is first rendered inaccessible to users.

Backup Types

Various types of backups can be used for online and offline backups. An individual environment’s SLA, backup window, and recovery time requirement determine which method or combination of methods is optimal for that environment.

  • Full Backup. Captures all files on all disks.

  • Incremental Backup. Captures files that have been added or changed since the last full or incremental backup.

  • Differential Backup. Captures files that have been added or changed since the last full backup.
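The difference between the three backup types comes down to which reference point a file's change is measured against. The sketch below illustrates this with modification times. It assumes the common convention that an incremental backup is relative to the most recent backup of any type, and it uses timestamps rather than the archive attribute that some backup tools rely on. The `FileInfo` class and `select_files` function are hypothetical names used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    modified: float  # modification time, in seconds since the epoch


def select_files(files, backup_type, last_full, last_backup):
    """Return the paths a backup of the given type would capture.

    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any type
    """
    if backup_type == "full":
        # A full backup ignores change history and captures everything.
        return [f.path for f in files]
    if backup_type == "differential":
        # Everything changed since the last full backup.
        return [f.path for f in files if f.modified > last_full]
    if backup_type == "incremental":
        # Only what changed since the most recent backup of any type.
        return [f.path for f in files if f.modified > last_backup]
    raise ValueError(f"unknown backup type: {backup_type}")
```

The trade-off follows from the selection rule: incremental backups are the smallest but a restore needs the full backup plus every incremental since, while a differential restore needs only the full backup and the latest differential.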

Backup Topologies

Originally, the only type of storage technology that required backup involved hard disks connected directly to storage adapters on servers. Today, this kind of storage is known as direct-attached storage, or DAS. The backup and recovery landscape has changed markedly with the development of technologies such as Storage Area Network (SAN) and Network Attached Storage (NAS). SAN environments in particular provide a significant opportunity to optimize and simplify the backup and recovery process.

  • Local Server Backup and Recovery (DAS). Each server is connected to its own backup device.

  • LAN-Based Backup and Recovery (NAS). This is a multi-tier architecture in which some backup servers kick off jobs and collect metadata about the backed-up data (also known as control data) while other servers (designated as media servers) perform the actual job of managing the data being backed up.

    Figure: LAN-Based Backup and Recovery

  • SAN-Based Backup and Recovery. In this topology you have the ability to move the actual backup copy operation from the production host to a secondary host system.

    Figure: SAN-Based Backup and Recovery

Service Plan

You have to consider many factors when designing your backup and recovery service. Among the factors to consider are:

  • Fast backup and fast recovery priorities – Recovery Time Objective (RTO).

  • The frequency with which data changes.

  • Time constraints on the backup operation.

  • Storage media.

  • Data retention requirements.

  • Currency of recovered data – Recovery Point Objective (RPO).

For more information on RTO and RPO, go to https://www.microsoft.com/technet/technetmag/issues/2006/10/FailoverClusters/.
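RTO and RPO translate directly into constraints on the backup schedule and on restore speed. The following is a minimal sketch, assuming that the worst-case data loss equals the interval between backups and that restore time scales linearly with data volume. The function names and sample figures are illustrative assumptions.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss is one full interval between backups."""
    return backup_interval <= rpo


def estimated_restore_time(data_gb: float, restore_gb_per_hour: float) -> timedelta:
    """Estimate restore duration, assuming a constant restore rate."""
    if restore_gb_per_hour <= 0:
        raise ValueError("restore rate must be positive")
    return timedelta(hours=data_gb / restore_gb_per_hour)


def meets_rto(restore_time: timedelta, rto: timedelta) -> bool:
    """The service is back within the objective if the restore fits inside it."""
    return restore_time <= rto


if __name__ == "__main__":
    # Hypothetical server: nightly backups, 4-hour RPO, 200 GB restored at 100 GB/h.
    print("RPO met:", meets_rpo(timedelta(hours=24), timedelta(hours=4)))
    restore = estimated_restore_time(200, 100)
    print("RTO met:", meets_rto(restore, timedelta(hours=4)))
```

In this hypothetical case the nightly schedule misses a 4-hour RPO, which would point toward more frequent incremental backups for that server.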

Recovery Plan

Even the best backup plan can be ineffective if you don’t have a recovery plan in place. Following are some of the elements of a good data recovery plan.

Verify Backups

Verifying backups is a critical step in disaster recovery. You can't recover data unless you have a valid backup.
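One simple form of verification is to compare a checksum of the backed-up copy with a checksum of the source. The sketch below assumes a file-level backup and a source file that has not changed since the backup was taken; in practice the digest would usually be recorded at backup time and compared later. The function names are hypothetical, and this does not replace the verification features of a backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source: Path, backup: Path) -> bool:
    """Return True only if the backup exists and matches the source byte for byte."""
    return backup.is_file() and sha256_of(source) == sha256_of(backup)
```

A matching checksum shows the copy is intact, but only a test restore confirms that the data can actually be recovered.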

Back Up Existing Log Files Before Performing Any Restoration

A good safeguard is to back up any existing log files before you restore a server. If data is lost or an older backup set is restored by mistake, the logs help you recover.
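This safeguard can be scripted so that it is never skipped. The following is a minimal sketch, assuming the logs are plain files with a `.log` extension in a single directory; the function name, the glob pattern, and the folder naming scheme are illustrative assumptions.

```python
import shutil
from datetime import datetime
from pathlib import Path


def preserve_logs(log_dir: Path, safe_root: Path) -> Path:
    """Copy every *.log file in log_dir to a new timestamped folder under safe_root."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = safe_root / f"pre-restore-logs-{stamp}"
    # exist_ok=False: refuse to write into a folder from an earlier run.
    target.mkdir(parents=True, exist_ok=False)
    for log_file in sorted(log_dir.glob("*.log")):
        # copy2 also preserves timestamps, which matter when replaying logs.
        shutil.copy2(log_file, target / log_file.name)
    return target
```

The returned path can be recorded in the restore ticket so the preserved logs are easy to find if the restore goes wrong.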

Perform a Periodic Fire Drill

A drill measures your ability to recover from a disaster and certifies your disaster recovery plans. Create a test environment and attempt a complete recovery of data. Be sure to use data from production backups, and to record how long it takes to recover the data. This includes retrieving data from off-site storage.
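Recording how long a drill takes is easier when the timing is built into the drill procedure. The sketch below wraps an arbitrary restore routine and compares the elapsed time against a recovery time target. The `restore` callable stands in for whatever steps your drill actually performs, and the names used here are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class DrillResult:
    elapsed_seconds: float
    rto_seconds: float

    @property
    def within_rto(self) -> bool:
        """True if the drill finished inside the recovery time target."""
        return self.elapsed_seconds <= self.rto_seconds


def run_drill(restore: Callable[[], None], rto_seconds: float) -> DrillResult:
    """Time a restore routine and compare it against the recovery time target."""
    start = time.monotonic()  # monotonic clock is unaffected by clock changes
    restore()
    return DrillResult(time.monotonic() - start, rto_seconds)
```

Keeping the results of successive drills makes it possible to see whether recovery times drift as data volumes grow.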

Create a Disaster Kit

Plan ahead by building a disaster kit that includes an operating system configuration sheet, a hard disk partition configuration sheet, a redundant array of independent disks (RAID) configuration, a hardware configuration sheet, and so forth. This material is easy enough to compile, and it can minimize recovery time—much of which can be spent trying to locate information or disks needed to configure the recovery system.

Phase 4: Deploy

After the appropriate storage infrastructure components are in place and the backup and recovery service plan is defined, your organization can install the storage solution and associated monitoring and management tools into the IT environment.

Operations

Monitoring and managing the storage management resources used for backup and recovery in the production environment are extremely important tasks. Whether the process is centralized or distributed, the technologies and procedures for backup and recovery must be managed. Ultimately, you should be able to easily monitor and analyze the storage management systems' availability, capacity, and performance.

Storage resource management (SRM) is a key storage management activity focused on ensuring that important storage devices, such as disks, are formatted and installed with the appropriate file systems.

Typically, the tools used in the production environment to monitor and manage storage resources consist of functions provided as part of installed operating systems and/or those offered with other solutions, such as Microsoft Data Protection Manager.
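As an illustration of capacity monitoring with operating system functions, the following sketch classifies a volume by how full it is. It uses Python's standard `shutil.disk_usage` call; the 80 and 95 percent thresholds and the function names are illustrative assumptions, not recommendations from this guidance.

```python
import shutil


def percent_used(total_bytes: int, used_bytes: int) -> float:
    """Return the used share of a volume as a percentage."""
    if total_bytes <= 0:
        raise ValueError("total size must be positive")
    return 100.0 * used_bytes / total_bytes


def capacity_status(total_bytes: int, used_bytes: int,
                    warn_at: float = 80.0, critical_at: float = 95.0) -> str:
    """Classify a volume as 'ok', 'warning', or 'critical' by percent used."""
    used = percent_used(total_bytes, used_bytes)
    if used >= critical_at:
        return "critical"
    if used >= warn_at:
        return "warning"
    return "ok"


def volume_status(path: str) -> str:
    """Check the volume that holds the given path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return capacity_status(usage.total, usage.used)


if __name__ == "__main__":
    print("Backup volume status:", volume_status("."))
```

A check like this covers capacity only; availability and performance need their own measurements.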

Using a storage resource management system requires proper training and skills. Operators need to understand the basic concepts behind monitoring and managing storage resources and analyzing the results. In addition, selecting the right tool for the right job increases the operations group’s ability to ensure data and storage resource availability, capacity, and performance.

Checkpoint: Defined Backup and Restore Services for Critical Servers

Requirements

  • Created a data backup plan and a recovery plan for 80 percent or more of your critical servers.

  • Used drills to test your plans.

If you have completed the steps listed above, your organization has met the minimum requirement of the Standardized level for Defined Backup and Restore Services for Critical Servers.
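The checkpoint can be expressed as a simple coverage calculation over your server inventory. The following is a minimal sketch, assuming you keep one list of critical servers and one list of servers that have both a backup plan and a recovery plan. The server names and function names are hypothetical.

```python
def backup_coverage(critical_servers, protected_servers) -> float:
    """Fraction of critical servers that have a backup plan and a recovery plan."""
    critical = set(critical_servers)
    if not critical:
        raise ValueError("no critical servers listed")
    return len(critical & set(protected_servers)) / len(critical)


def meets_standardized_level(critical_servers, protected_servers,
                             drills_done: bool, threshold: float = 0.80) -> bool:
    """Both checkpoint requirements: coverage at or above 80 percent, and tested plans."""
    coverage = backup_coverage(critical_servers, protected_servers)
    return drills_done and coverage >= threshold


if __name__ == "__main__":
    # Hypothetical inventory for illustration only.
    critical = ["dc01", "sql01", "mail01", "file01", "web01"]
    protected = ["dc01", "sql01", "mail01", "file01"]
    print(f"Coverage: {backup_coverage(critical, protected):.0%}")
    print("Checkpoint met:", meets_standardized_level(critical, protected, True))
```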

We recommend that you follow additional best practices addressed in the Backup and Recovery Services Implementation Guides of the Windows Server System Reference Architecture and Microsoft Operations Framework Storage Management.
