Performance efficiency can be defined as having a disk subsystem that can handle the IO requirements of the application that stores its files on it. In our case, that application is Exchange Server and the IO requirements will change based on the number of users who have their mailboxes on Exchange Server databases that we put on those disks, plus the IO "profile" of those users.
The best way to determine maximum throughput is to measure it. The JetStress tool is an excellent way to measure the maximum throughput of your disks. The documentation explains how to do this, so we'll skip that detail here. However, to use JetStress, you have to test your disks in a lab (not in a production environment). If you already have a server in production and suspect you have exceeded the maximum throughput, the best thing that you can do is make an estimate.
To make estimates, use the following rules of thumb (there are many tricks, but these are fairly simple):
- Most disks can do between 130 and 180 IOPS.
- Exchange Server typically has a Read-to-Write (R:W) ratio of 3:1 or 2:1.
- We recommend that you plan to use less than 80 percent of the disk's maximum throughput at peak load.
- RAID 0 (striping) has the same cost as no RAID. Reads and writes occur one time.
- RAID 0+1 requires two disk IOs for every write (the mirrored data is written two times).
- RAID 5 requires four disk IOs for every write (two reads and two writes to calculate and write parity).
For corporate servers that have a large number of users (500 users or more), the R:W ratios are usually 3:1 or 2:1. However, servers that have fewer than 500 users will have lower R:W ratios (approaching 0:1 as the number of users and the data in the database decreases). This is because for servers that have few users, much of the user’s data will be in the database cache. Therefore, some of the read actions will be satisfied by data in memory. This reduces the number of read operations. Of course, all the write operations will still have to be written to disk. Therefore, the net effect of having a smaller number of users on the server is that the ratio of R:W decreases.
To measure your R:W ratio, examine the ratio of LogicalDisk\Disk Reads/sec to LogicalDisk\Disk Writes/sec for the database drives.
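As a rough illustration of that ratio, the sketch below averages a set of counter samples and divides reads by writes. The function name and the sample values are hypothetical; it assumes you have already exported the two LogicalDisk counters as lists of numbers.

```python
from statistics import mean


def read_write_ratio(reads_per_sec, writes_per_sec):
    """Return the average number of reads per write for a database drive.

    Both arguments are lists of samples taken over the same period from
    LogicalDisk\\Disk Reads/sec and LogicalDisk\\Disk Writes/sec.
    """
    avg_reads = mean(reads_per_sec)
    avg_writes = mean(writes_per_sec)
    if avg_writes == 0:
        raise ValueError("no write activity was recorded in the samples")
    return avg_reads / avg_writes


# Hypothetical samples: about three reads for every write, that is, 3:1.
ratio = read_write_ratio([300, 330, 270], [100, 110, 90])
print(f"R:W ratio is roughly {ratio:.1f}:1")
```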
The following tables can be used to look up the recommended maximum disk throughput per disk.
Estimated maximum disk throughput for no RAID or RAID 0

| R:W ratio \ Disk speed | 130 IOs per second | 180 IOs per second |
|---|---|---|
| 3:1 | 104 IOPS | 144 IOPS |
| 2:1 | 104 IOPS | 144 IOPS |
| 0:1 | 104 IOPS | 144 IOPS |

Estimated maximum disk throughput for RAID 0+1 (or RAID 10)

| R:W ratio \ Disk speed | 130 IOs per second | 180 IOs per second |
|---|---|---|
| 3:1 | 83 IOPS | 115 IOPS |
| 2:1 | 78 IOPS | 108 IOPS |
| 0:1 | 52 IOPS | 72 IOPS |

Estimated maximum disk throughput for RAID 5

| R:W ratio \ Disk speed | 130 IOs per second | 180 IOs per second |
|---|---|---|
| 3:1 | 59 IOPS | 82 IOPS |
| 2:1 | 52 IOPS | 72 IOPS |
| 0:1 | 26 IOPS | 36 IOPS |

You can safely assume a throughput of 80 IOs per second per disk for most disks in a RAID 0+1 configuration (RAID 0+1 is generally recommended for most database drives).
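The table values appear to follow directly from the rules of thumb listed earlier: take the raw disk speed, apply the 80 percent peak-load limit, and scale by the ratio of application IOs to the disk IOs they cost once the RAID write penalty is counted. The sketch below is one way to express that arithmetic. The function and dictionary names are hypothetical, and the formula is inferred from the tables rather than quoted from them.

```python
# Disk IOs needed per application write, from the rules of thumb above.
RAID_WRITE_PENALTY = {"none": 1, "raid0": 1, "raid0+1": 2, "raid5": 4}


def max_disk_throughput(raw_iops, reads, writes, raid, peak_utilization=0.8):
    """Estimate usable application IOPS for one disk.

    raw_iops          rated speed of the disk (for example 130 or 180)
    reads, writes     the two sides of the R:W ratio (3:1 -> reads=3, writes=1)
    raid              a key of RAID_WRITE_PENALTY
    peak_utilization  fraction of the raw speed to plan for (80 percent)
    """
    penalty = RAID_WRITE_PENALTY[raid]
    app_ios = reads + writes             # IOs as seen by Exchange Server
    disk_ios = reads + writes * penalty  # IOs the disks actually perform
    return raw_iops * peak_utilization * app_ios / disk_ios


# Print the same grid as the tables above, rounded to whole IOPS.
for raid in ("raid0", "raid0+1", "raid5"):
    for reads, writes in ((3, 1), (2, 1), (0, 1)):
        slow = round(max_disk_throughput(130, reads, writes, raid))
        fast = round(max_disk_throughput(180, reads, writes, raid))
        print(f"{raid:8} {reads}:{writes}  {slow:4} IOPS  {fast:4} IOPS")
```

Rounding each result to a whole number gives the figures in the three tables, which is a useful check when you substitute a measured disk speed or a different R:W ratio.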
How much IO does one Exchange Server client create in an environment?
User IO requirements are measured in IOs per second, also referred to as IOPS. To measure your current per-user IOPS, monitor the performance of the disk subsystem.
Measure PhysicalDisk\Disk Transfers/sec for all database disks for between 20 minutes and 2 hours during your most active time. Over the same period, measure the number of active users (MSExchangeIS\Active User Count). Take the average of each counter. Sum the average Disk Transfers/sec across all databases, and then divide that sum by the average Active User Count. The result is the number of IOPS per user.
Note: The Active User Count number is not a perfect representation of the number of users. From experience, it is very close to reality and is a good starting point when you make calculations.
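The per-user calculation can be sketched as follows. The function name, the database names, and the sample values are all hypothetical; the sketch assumes the counter samples have already been collected over the same peak period.

```python
from statistics import mean


def iops_per_user(transfers_per_database, active_user_samples):
    """Estimate IOPS per user from performance counter samples.

    transfers_per_database  dict mapping a database disk to its list of
                            PhysicalDisk\\Disk Transfers/sec samples
    active_user_samples     list of MSExchangeIS\\Active User Count samples
    """
    # Average each database's counter, then sum across databases.
    total_iops = sum(mean(samples) for samples in transfers_per_database.values())
    avg_users = mean(active_user_samples)
    if avg_users == 0:
        raise ValueError("no active users were recorded in the samples")
    return total_iops / avg_users


# Hypothetical samples for two database disks and 1,000 active users.
samples = {"db1": [200, 220, 180], "db2": [190, 210, 200]}
print(iops_per_user(samples, [1000, 1000, 1000]))
```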
Be aware that the number of IOPS per user is determined by how active the clients are. You may find that this differs from server to server and database to database. These numbers are used as guidelines, but accurate numbers are not always necessary as long as you build in a bit of overhead when planning and populating your servers. However, you can use these numbers to help decide when you want to move users from a busy server to another server.
Note: As a general practice, it is a good idea to always measure the server when it is at peak load. When sizing servers, always plan for maximum usage and then leave some buffer overhead for those days when all the users return from a holiday break.
Now that we know how to measure (through JetStress) or estimate (from earlier) maximum disk throughput, and know the IOPS per user, it is a simple task to plan for how many disks we need for a new server.
Assuming the new users have a similar e-mail usage profile (they use the same clients, have the same percentage of e-mail plugins, and send about the same amount of mail), here is how it is done:
Calculate the throughput needed (multiply the number of users on the new server by the number of IOPS per user).
Divide the throughput by the maximum throughput of the disks used. Use the numbers from the earlier tables, or the result from JetStress multiplied by 0.8. The numbers in the earlier tables already include the 80 percent maximum usage to build in some overhead.
Round up. This gives us the minimum number of disks needed for the server. Next, divide by the number of databases, and round up. This gives us the number of disks needed per database (or repeat with storage groups if the databases share the same physical drive).
Example: Suppose we are hiring 5,000 people, and we want to figure out how to size our server. Current users generate 0.4 IOPS per user, and we expect the new users to be as hard working as our current employees. We will need a total of 2,000 IOPS (5,000 users × 0.4 IOPS per user).
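The planning steps can be sketched in a few lines using the numbers from this example. The per-disk figure of 80 IOPS is the RAID 0+1 rule of thumb given earlier. The count of four databases is an assumption made only for illustration, and the function name is hypothetical.

```python
import math


def disks_needed(users, iops_per_user, per_disk_iops, databases=1):
    """Return (required IOPS, minimum disks, disks per database).

    per_disk_iops should already include the 80 percent peak-load limit
    (a value from the tables, or a JetStress result multiplied by 0.8).
    """
    required_iops = users * iops_per_user
    # Round to 9 decimals before taking the ceiling so floating-point
    # noise in the division cannot add a spurious extra disk.
    total_disks = math.ceil(round(required_iops / per_disk_iops, 9))
    disks_per_database = math.ceil(total_disks / databases)
    return required_iops, total_disks, disks_per_database


# 5,000 users at 0.4 IOPS each, 80 IOPS per disk, 4 databases (assumed).
required, total, per_db = disks_needed(5000, 0.4, 80, databases=4)
print(f"{required:.0f} IOPS -> {total} disks, {per_db} per database")
```

Under these assumptions the server needs at least 25 disks, or 7 per database after rounding up. A different RAID level or a measured JetStress figure changes only the `per_disk_iops` argument.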