Disaster recovery best practices for Project Server 2007

01/10/2017

This Office product will reach end of support on October 10, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.

Topic Last Modified: 2016-11-14

A disaster recovery plan should ensure that all your systems and data can be restored to normal operation quickly in the event of a natural disaster (such as a fire) or a technical disaster (such as a two-disk failure in a RAID-5 array). When you create a disaster recovery plan, you should identify all of the actions that must occur in response to a catastrophic event.

Thoroughly test your backup and recovery plan before deploying Microsoft Office Project Server 2007 in a production environment. When testing, look for vulnerable areas by simulating as many failure scenarios as you can. We recommend that you verify the complete disaster recovery plan by simulating a catastrophic event.

When planning your disaster recovery strategy, consider the following questions:

To what medium will you send the backup (tape or disk)?
Will you do the backups manually or schedule them to be done automatically?
If backups are automated, how will you verify that they successfully occurred?
How will you ensure that the backups are usable?
How long will you save the backups before reusing the medium?
Assuming failure, how much time will it take to restore from the most recent backup? Is that an acceptable amount of downtime?
Where will you store the backups, and do the appropriate people have access to them?
If the responsible system administrator is unavailable, is there someone else who knows the proper passwords and procedures to perform backups and, if necessary, to restore the system?
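
One way to approach the question of verifying automated backups is a scheduled check that the newest backup file exists, is recent, and is not empty. The following Python sketch is illustrative only. The *.bak file pattern, the 26-hour age limit, and the function names are assumptions chosen for the example, not part of Project Server 2007 or SQL Server.

```python
import time
from pathlib import Path


def newest_backup(backup_dir, pattern="*.bak"):
    """Return the most recently modified backup file in backup_dir, or None."""
    files = [p for p in Path(backup_dir).glob(pattern) if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def backup_problems(backup_dir, max_age_hours=26, min_size_bytes=1, now=None):
    """Return a list of problems found; an empty list means the checks passed."""
    now = time.time() if now is None else now
    latest = newest_backup(backup_dir)
    if latest is None:
        return [f"no backup files found in {backup_dir}"]
    problems = []
    info = latest.stat()
    age_hours = (now - info.st_mtime) / 3600
    if age_hours > max_age_hours:
        problems.append(f"{latest.name} is {age_hours:.1f} hours old")
    if info.st_size < min_size_bytes:
        problems.append(f"{latest.name} is only {info.st_size} bytes")
    return problems
```

A check like this confirms only that a backup job produced a file. It does not show that the backup is usable; a periodic test restore is still needed to answer that question.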

As part of any disaster recovery plan, we recommend that you do the following:

Use Microsoft Windows Event Viewer on a daily basis to check both the system log and application log on your production servers for any errors or warnings.
Always maintain an up-to-date Windows Emergency Repair disk or Automated System Recovery (ASR) set for each server in your deployment. See Windows Help for more information.
Ensure that all your servers are protected with adequate antivirus software, and use the automatic update feature of your antivirus application to keep the virus signature files current.

Types of events

System administrators must protect their networks from both data loss and system downtime. This involves both routine procedures performed on an ongoing basis and nonroutine steps taken to prevent or recover from unexpected downtime.

Some of the potential causes of system downtime include:

Hard disk subsystem failure
Power failure
System software failure
Accidental or malicious use of deletion or modification commands
Destructive viruses
Natural disasters
Theft or sabotage

The likelihood of these events varies depending on your organization, but all of them can adversely affect your Office Project Server 2007 deployment. We recommend that you assess your vulnerability to various types of events and take appropriate steps to minimize your organization’s exposure to them.

Hard disk space considerations

You must have enough space on your hard disk to restore both the database and the log files on the computers running SQL Server. You might have a backup that is too large to restore to its original location. For example, a Normal backup performed once per week plus six days of Differential backups might require more disk space during a restore than your server has available.
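
The space check described above comes down to simple arithmetic, which can be sketched in Python. This is a planning aid under stated assumptions: the function names are hypothetical, all sizes are made-up examples, and the sketch assumes that backup files are copied to the same volume before the restore.

```python
import shutil


def restore_space_needed(data_bytes, log_bytes, staged_backup_bytes=()):
    """Bytes needed for database files, log files, and locally staged backup files."""
    return data_bytes + log_bytes + sum(staged_backup_bytes)


def can_restore(path, data_bytes, log_bytes, staged_backup_bytes=(), free_bytes=None):
    """Return True if the volume holding path has room for the restore.

    Pass free_bytes explicitly for planning; otherwise the free space
    is read from the volume with shutil.disk_usage.
    """
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free
    needed = restore_space_needed(data_bytes, log_bytes, staged_backup_bytes)
    return needed <= free_bytes


GB = 1024 ** 3
# Hypothetical example: 40 GB of data, 10 GB of logs, one 40 GB Normal
# backup, and six 5 GB Differential backups staged on the same volume.
needed = restore_space_needed(40 * GB, 10 * GB, [40 * GB] + [5 * GB] * 6)
print(needed // GB)  # 120
```

In this example, a volume with 100 GB free could hold the restored database but not the database together with the staged backup files.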

Also, never let your database drive become more than half full. Although keeping the drive less than half full leaves disk space unused, it can shorten server downtime for the following reasons:

You can restore databases faster than you can when the drive is full (especially if the file system is fragmented).
You can back up a copy of the databases to the same physical disk before you restore them, which enables you to attempt to repair the databases if a problem occurs during the restore process (for example, if the existing backup contains errors).
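
The half-full guideline is easy to monitor automatically. The following minimal Python sketch compares used space against a limit; the function names and the idea of running it against the database drive path are assumptions for illustration.

```python
import shutil


def is_over_limit(used_bytes, total_bytes, limit=0.5):
    """Return True if used space exceeds the given fraction of the volume."""
    return used_bytes > total_bytes * limit


def drive_over_limit(path, limit=0.5):
    """Check the volume that holds path, for example the database drive."""
    usage = shutil.disk_usage(path)
    return is_over_limit(usage.used, usage.total, limit)
```

A scheduled task could call drive_over_limit with the path of the database drive and alert an administrator when it returns True.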

Using hardware standards

Adopt one standard for hardware, and apply it as much as possible. Use the same kinds of components, such as network cards, disk controllers, and graphics cards, on all of your computers. Use this standard computer profile for all applications, even if it is more than you need for some applications. The only modifications that you should make to the hardware are to the amount of memory, the number of CPUs, and the hard disk configurations.

Hardware standards provide the following advantages to your organization:

Having only one platform reduces the amount of testing needed.
When applying driver updates or application software updates, you only need to perform one test before deploying the updates to all of your computers.
Because only one type of system must be supported, support personnel require less training.
You do not need to keep as many spare parts on location, which reduces costs to the organization.

Keep spare and replacement parts on-site, and include spare equipment in any hardware budget. The number of spare parts that you keep on location varies according to the configuration and failure conditions that users and operations personnel can tolerate.

Some parts, such as memory and CPU, are easy to find years after the original parts are acquired. Other parts, such as hard disks, are often difficult to locate after only a few years. For parts that may be hard to find, and where exact matches must be used, plan to buy spares when you buy the equipment. Consider using service companies or contracts with a vendor to delegate the responsibility, or keep one or two of each of the critical components in a central location.

Maintaining hardware records

To limit the amount of time you spend troubleshooting hardware configuration problems during a disaster recovery, maintain current hardware configuration records, including:

A list of all hardware vendor contact information, including phone numbers, e-mail addresses, and Web pages for online support.
A list of the hardware in each server, with firmware update versions and hardware driver versions (this hardware information can be found in Windows Device Manager).
A list of the Basic Input/Output System (BIOS) information, hard disk configuration information, and jumper settings on the hardware for your server.

Important

Maintain a copy of this information off-site in case your facilities are damaged and you need to recover your systems in a new location.

Maintaining software records

To limit the amount of time you spend troubleshooting software-related problems during disaster recovery, maintain current software records, including:

Your software vendor contact information, including phone numbers, e-mail addresses, and Web pages for online support.
A chronological list of all software upgrades (such as service packs) and software patches that are installed on your servers. By keeping this list, you can install the software updates in the same order in which they were installed originally.
A record of the configuration for each server, including:
- Server name.
- Administrative group name to which the server belongs.
- Hard disk configuration information, including a list of each hard disk partition with the volume names and sizes of the partitions and a summary of what is installed on each partition.
- List of any static Internet Protocol (IP) addresses, subnet masks, and default gateways used by the server.
- A record of the cluster configuration information, if your topology includes clusters.
- Any customizations you made to the server, such as Project Web Access customizations.
- Configuration information for any Shared Services Providers, Web applications, sites, or other settings.

Important

Maintain a copy of this information off-site in case your facilities are damaged and you need to recover your systems in a new location.
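
Software records are easier to keep current and to copy off-site when they are stored in a structured form. The following Python sketch shows one possible layout for a per-server record with a chronological update list. The class names, field names, and JSON output are assumptions for illustration and do not correspond to any Project Server 2007 tool or file format.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import date


@dataclass
class SoftwareUpdate:
    installed_on: date
    name: str


@dataclass
class ServerRecord:
    server_name: str
    admin_group: str
    static_ips: list = field(default_factory=list)
    partitions: list = field(default_factory=list)
    customizations: list = field(default_factory=list)
    updates: list = field(default_factory=list)

    def updates_in_install_order(self):
        """Return updates oldest first, the order to reapply them after a rebuild."""
        return sorted(self.updates, key=lambda u: u.installed_on)

    def to_json(self):
        """Serialize the record to JSON text; dates are written as ISO strings."""
        return json.dumps(asdict(self), default=str, indent=2, sort_keys=True)
```

The JSON text returned by to_json can be written to a file and stored with the other off-site records, so that the installation order is available even if the original servers are lost.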

Planning hardware contingencies

To minimize downtime costs, including losses in sales and productivity, keep replacement hardware immediately available for your production servers. Types of replacement hardware to consider having immediately available include alternate backup servers, network adapters, video and hard disk controller cards, routers, cables, hard disks, motherboards, and power supplies.

Providing training and documentation

Ensure that administrators, operators, and support staff within your organization have access to various training opportunities and documentation regarding disaster recovery issues.

If one or more of your servers experiences problems, the subsequent downtime can be costly. However, if you invest in good training courses and up-to-date technical manuals for your server administrators, operators, and support staff, your organization will be prepared, and downtime will decrease.

You can also perform occasional disaster recovery simulations in separate, non-production domains. These simulations familiarize administrators, operators, and support staff with recovery procedures, as well as indicate any deficiencies in your backup and recovery strategies. Update your documentation with any new procedures or practices you develop during these simulations.

Download this book

This topic is included in the following downloadable book for easier reading and printing:

Plan for disaster recovery in Project Server 2007

See the full list of available books at Downloadable content for Project Server 2007.