Anticipating the Effect of DPM Operations on Performance

Published: April 8, 2005 | Updated: August 17, 2005

The effect of DPM on your network and file server performance depends on your network speed, the size of the data being protected, and the rate at which the protected data changes. The following sections provide guidelines for estimating your data change rate and sample performance statistics to help you anticipate how DPM will affect your file servers and network.

Estimating Data Change Rate

The effect of DPM on your network performance depends largely on your data change rate. To get an estimate of your data change rate, you can review an incremental backup for a recent, average day. The percentage of your data included in an incremental backup is usually indicative of your data change rate. For example, if you have a total of 100 GB of data and your incremental backup includes 10 GB, your data change rate is likely to be approximately 10 percent each day.

Note, however, that because the method that DPM uses to record changes to data is different from that of most backup software, incremental backup size is not always a precise indicator of data change rate. To refine your estimate of your data change rate, consider the characteristics of the data you want to protect.

For example, whereas most backup software records data changes at the file level, DPM records changes at the byte level. Depending on the type of data that you want to protect, this can translate to a data change rate that is lower than the incremental backup might suggest.

Similarly, whereas most backup software includes only the cumulative changes to data between incremental backups, DPM includes each change made to protected data. If your protected data includes a substantial number of files that you frequently overwrite, your data change rate may be higher than your incremental backup would indicate.

Depending on the characteristics of your data, it may be prudent to assume a data change rate that is 1.5 to 2 times higher than the rate indicated by your incremental backup.
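The estimate described above can be sketched as a short calculation. This is a minimal illustration, not part of DPM: the function name and parameters are hypothetical, and the 1.5 and 2 multipliers are simply the planning range suggested in the text.

```python
def estimate_change_rate(total_gb, incremental_gb, low_factor=1.5, high_factor=2.0):
    """Return (baseline, low, high) daily change rates as fractions of total data.

    baseline is the rate implied by the incremental backup; low and high
    apply the 1.5x to 2x safety margin suggested for planning.
    """
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    baseline = incremental_gb / total_gb
    return baseline, baseline * low_factor, baseline * high_factor


# The example from the text: 100 GB of data, 10 GB incremental backup.
baseline, low, high = estimate_change_rate(total_gb=100, incremental_gb=10)
print(f"baseline {baseline:.0%}, planning range {low:.0%} to {high:.0%}")
```

For the 100 GB example, the baseline is 10 percent and the planning range is 15 to 20 percent each day.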

Anticipating Performance During Normal Use

The following sections describe the effect of DPM on network and file server performance during periods of normal use.

Network Performance

Throughout the day, DPM periodically performs synchronization on the replicas for protected volumes. Synchronization is the process by which DPM transfers data changes from a protected file server to a DPM server, and then applies the changes to the replicas of the protected data. Depending on the protection schedules that you specify, DPM may synchronize data at intervals from once each hour to once each week.

The amount of network bandwidth that a synchronization job uses depends on the amount of data that has changed since the previous synchronization job.

For example, suppose that you are protecting a file server that stores 500 GB of data, and that 10 percent of the protected data, or about 50 GB, changes over an eight-hour period each day. If synchronization is scheduled to run hourly, then each synchronization job that runs during the eight-hour period transfers approximately 50 GB divided by 8, or 6.25 GB, of data. If the data is transferred over a network with a capacity of 100 Mbps (12.5 MB per second), and you have set network bandwidth usage throttling for the protected data to 50 percent, then approximately 6 MB per second of network bandwidth is available for synchronization. At this rate, synchronization takes approximately 20 minutes to complete.
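The arithmetic in this example can be reproduced with a small sketch. The function and parameter names are hypothetical, 1 GB is treated as 1000 MB for simplicity, and the calculation assumes that changes are spread evenly across the change window and that the throttled bandwidth is fully available to DPM.

```python
def sync_estimate(daily_change_gb, change_window_hours, sync_interval_hours,
                  link_mbps, throttle_fraction):
    """Estimate data per synchronization job, usable bandwidth, and duration.

    Returns (gb_per_job, available_mb_per_s, minutes).
    """
    # Changes are assumed to accumulate evenly over the change window.
    gb_per_job = daily_change_gb * sync_interval_hours / change_window_hours
    # Convert link speed from megabits to megabytes per second (8 bits per byte).
    link_mb_per_s = link_mbps / 8
    available_mb_per_s = link_mb_per_s * throttle_fraction
    seconds = gb_per_job * 1000 / available_mb_per_s  # 1 GB treated as 1000 MB
    return gb_per_job, available_mb_per_s, seconds / 60


per_job, available, minutes = sync_estimate(
    daily_change_gb=50, change_window_hours=8, sync_interval_hours=1,
    link_mbps=100, throttle_fraction=0.5)
print(f"{per_job:.2f} GB per job, {available:.2f} MB/s available, "
      f"about {minutes:.0f} minutes")
```

With the unrounded figures, the result is about 17 minutes per job, which is consistent with the rounded estimate of approximately 20 minutes given above.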

For information about setting network bandwidth usage throttling for protected data, see “Performance Optimization Options” in the “Planning Data Protection” chapter of this guide.

File Server Performance

During normal use of DPM, the DPM File Agent tracks changes to protected data. The file agent incurs an overhead similar to that of antivirus software.

Anticipating Performance During Intensive Use

The DPM operations that typically have the greatest effect on network performance are automatic replica creation and synchronization with consistency check.

To help you anticipate how DPM might affect your network during periods of intensive use, the sections that follow provide estimates for the amount of time it takes DPM to complete these operations under various conditions.

Note

Your experience may vary from the estimates provided here. The time required to complete replica creation or synchronization with consistency check jobs depends on many factors, including the number of files protected, data change patterns, and resource competition from other workloads.

Replica creation

Replica creation is the process by which a complete copy of data selected for protection is transferred to the storage pool. Replicas can be created automatically over the network, or manually using removable media such as tape. Table 2.2 provides estimates for how long DPM takes to create a replica automatically over the network given different protected data sizes and network speeds. The estimates assume that the network is running at full speed and that other workloads are not competing for bandwidth. Times are shown in hours.

Table 2.2   Time Required to Complete Replica Creation Jobs at Different Network Speeds

Size of Protected Data   100 Mbps   32 Mbps   8 Mbps   2 Mbps   512 Kbps
1 GB                     < 1        < 1       < 1      1.5      6
50 GB                    1.5        5         18       71       284
200 GB                   6          18        71       284      1137
500 GB                   15         45        178      711      2844

For replica creation jobs that involve transfer of a large amount of data over a WAN or other slow network, we recommend that you use the manual option for creating the replica.
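For data sizes or network speeds that are not listed in Table 2.2, a rough lower bound on transfer time can be computed from the link speed alone. The sketch below is an idealized calculation, not the method used to produce the table: the function name is hypothetical, decimal units are assumed (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second), and protocol and processing overhead are ignored, so actual replica creation can be expected to take longer.

```python
def ideal_transfer_hours(size_gb, link_mbps):
    """Return the idealized time, in hours, to move size_gb over a link.

    Assumes decimal units and a link used at its full nominal speed with
    no overhead, so the result is a lower bound rather than a forecast.
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    bits = size_gb * 1_000_000_000 * 8
    seconds = bits / (link_mbps * 1_000_000)
    return seconds / 3600


# 512 Kbps is expressed as 0.512 Mbps.
for size_gb, mbps in [(500, 100), (50, 2), (1, 0.512)]:
    print(f"{size_gb} GB at {mbps} Mbps: at least "
          f"{ideal_transfer_hours(size_gb, mbps):.1f} hours")
```

For 500 GB at 100 Mbps, the idealized figure is about 11.1 hours, compared with the 15 hours listed in Table 2.2. When the lower bound alone already spans several days, manual replica creation is likely the more practical choice.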

Note

The data provided in Tables 2.2 and 2.3 is the result of testing without the benefit of on-the-wire compression. To reduce the time required for replica creation and synchronization with consistency check jobs, you can configure on-the-wire compression for protection groups. For information about on-the-wire compression and other performance optimization features, see “Performance Optimization Options” in the “Planning Data Protection” chapter of this guide.

Synchronization with consistency check

Synchronization with consistency check (or simply “consistency check”) is the process by which DPM checks for and corrects inconsistencies between protected volumes and their replicas. The time to complete a consistency check depends on the change rate for the data, the available network bandwidth, and the time that has passed since the previous successful synchronization job.

Table 2.3 provides estimates for how long DPM takes to complete synchronization with consistency check for protection groups based on the amount of data that changes daily and the speed of the network. These estimates assume that the last successful synchronization job was completed 24 hours earlier, that the network is running at full speed, and that other workloads are not competing with DPM operations for bandwidth. Times are shown in hours.

Table 2.3   Time Required to Complete Consistency Checks at Different Network Speeds

Size of Daily Changes   100 Mbps   32 Mbps   8 Mbps   2 Mbps   512 Kbps
100 MB                  < 1        < 1       < 1      < 1      1
5 GB                    < 1        1         2        7        28
20 GB                   1          2         7        29       111
50 GB                   3          5         19       72       279

Had the last successful synchronization job finished fewer than 24 hours earlier, the size of the changed data would likely be smaller, and the time to complete a consistency check would be shorter than the times shown here. Conversely, had the last successful synchronization job finished more than 24 hours earlier, the size of the changed data would probably be larger, and the time to complete a consistency check would be longer. Similarly, were less network bandwidth available, times would be longer.
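For planning scripts, the estimates in Table 2.3 can be encoded as a simple lookup. The sketch below is illustrative only: the names are hypothetical, and rounding a change size up to the next listed row is a conservative assumption of this example, not a rule stated in this guide. The table's 24-hour and full-bandwidth assumptions still apply to every value returned.

```python
# Table 2.3 estimates in hours, keyed by size of daily changes in GB.
# "< 1" is kept as a string, exactly as published.
SPEEDS = ["100 Mbps", "32 Mbps", "8 Mbps", "2 Mbps", "512 Kbps"]
TABLE_2_3 = {
    0.1: ["< 1", "< 1", "< 1", "< 1", "1"],  # 100 MB of daily changes
    5:   ["< 1", "1", "2", "7", "28"],
    20:  ["1", "2", "7", "29", "111"],
    50:  ["3", "5", "19", "72", "279"],
}


def consistency_check_estimate(daily_change_gb, speed):
    """Return the published estimate for the smallest row covering the change size."""
    column = SPEEDS.index(speed)  # raises ValueError for an unlisted speed
    for row_gb in sorted(TABLE_2_3):
        if daily_change_gb <= row_gb:
            return TABLE_2_3[row_gb][column]
    raise ValueError("daily change size exceeds the largest row in Table 2.3")


# 12 GB of daily changes is rounded up to the 20 GB row.
print(consistency_check_estimate(12, "8 Mbps"))
```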

To avoid long consistency check times, ensure that sufficient network bandwidth is available and that consistency checks are performed as soon as possible after a replica becomes inconsistent. To help you monitor the state of your replicas and other aspects of DPM operations, you can subscribe to notifications. For instructions, see “Subscribing to Notifications” in the “Configuring DPM” chapter of this guide.

Considerations for protecting data over a WAN

DPM includes a number of features that can help optimize network performance. These features can be advantageous in any DPM deployment, but they are particularly important in deployments that transfer data over a WAN. They include:

  • Manual creation of replicas using removable media

  • On-the-wire compression

  • Network bandwidth usage throttling

  • Options for allocating space for synchronization logs

Each of these features is discussed in the “Planning Data Protection” chapter of this guide.