Monitoring Commerce Site Availability

11/12/2009

Create an availability checklist to monitor the availability of your site. The availability checklist should contain the items in this section.

Monitor Bandwidth Usage: Per Day, Week, and Month

Monitor Network Availability

Monitor System Availability

Monitor HTTP Availability

Monitor Performance Metrics

Monitor Bandwidth Usage: Per Day, Week, and Month

Monitor the following:

Bandwidth. How bandwidth is being used (peak and idle).
Usage. How usage changes over time (whether it increases, when it increases, and for how long).

You can use this information to project how much bandwidth you will need in the future. This will enable you to plan for the peak bandwidth you need for a holiday shopping season.

You can get bandwidth usage data from managed routers and Internet Information Services (IIS) 5.0 log analysis (using the Commerce Server Data Warehouse).
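As a rough illustration of projecting future bandwidth needs from usage history, the following Python sketch summarizes peak and idle usage and extends a straight-line trend. The function names and sample figures are hypothetical, and a linear trend is only an assumption: seasonal peaks such as a holiday shopping season usually need extra headroom beyond what a simple trend suggests.

```python
from typing import Sequence, Tuple


def peak_and_idle(samples_mbps: Sequence[float]) -> Tuple[float, float]:
    """Return (peak, idle) bandwidth from a series of usage samples."""
    if not samples_mbps:
        raise ValueError("samples_mbps must not be empty")
    return max(samples_mbps), min(samples_mbps)


def project_linear(history: Sequence[float], periods_ahead: int) -> float:
    """Extend a least-squares straight line through `history`.

    `history` holds one value per period (for example, weekly peak Mbps).
    This is a naive trend estimate, not a capacity-planning model.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two periods of history")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + periods_ahead)


# Hypothetical weekly peak usage in Mbps.
weekly_peaks = [10.0, 12.0, 14.0, 16.0]
peak, idle = peak_and_idle(weekly_peaks)
print(f"peak={peak} Mbps, lowest weekly peak={idle} Mbps")
print(f"projected peak in 2 weeks: {project_linear(weekly_peaks, 2):.1f} Mbps")
```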

Monitor Network Availability

Use Internet Control Message Protocol (ICMP) echo requests (pings), which are available in most network monitoring software.

Compare your network availability to the level agreed to in your service level agreement (SLA) with your Internet Service Provider (ISP)/data center provider. Request improvement if network availability falls below the level agreed to in the SLA.

The formula for measuring network availability is as follows:

(Number of successful ping returns/number of total pings issued) x 100%
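The formula above can be sketched in Python as follows. The function name, the sample ping counts, and the 99.9% SLA target are illustrative assumptions, not values from any particular monitoring product or agreement.

```python
def network_availability(successful_pings: int, total_pings: int) -> float:
    """Return network availability as a percentage of pings that returned."""
    if total_pings <= 0:
        raise ValueError("total_pings must be positive")
    if not 0 <= successful_pings <= total_pings:
        raise ValueError("successful_pings must be between 0 and total_pings")
    return successful_pings / total_pings * 100.0


# Hypothetical day of monitoring: one ping every 30 seconds (2,880 pings),
# of which 2,874 returned.
availability = network_availability(2874, 2880)
sla_target = 99.9  # assumed SLA level, in percent

print(f"{availability:.2f}%")  # 99.79%
if availability < sla_target:
    print("Below the SLA level; request improvement from the provider.")
```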

Monitor System Availability

Monitor the following systems:

Operating system. Monitor normal and abnormal shutdowns of the operating system.
SQL Server. Monitor normal operation and failover events of Microsoft SQL Server.
Internet Information Services (IIS). Monitor normal and abnormal shutdowns in IIS.

The formula for measuring system availability is as follows:

((Period of measurement - downtime)/period of measurement) x 100%
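A minimal Python sketch of the system availability calculation follows. The function name and the sample figures (a 30-day period with 43 minutes of downtime) are assumptions chosen for illustration.

```python
from datetime import timedelta


def system_availability(period: timedelta, downtime: timedelta) -> float:
    """Return the percentage of the measurement period the system was up."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    if not timedelta(0) <= downtime <= period:
        raise ValueError("downtime must be between zero and the period")
    # Dividing one timedelta by another yields a plain float ratio.
    return (period - downtime) / period * 100.0


# Hypothetical month: 30 days measured, 43 minutes of total downtime.
availability = system_availability(timedelta(days=30), timedelta(minutes=43))
print(f"{availability:.2f}%")  # 99.90%
```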

Monitor HTTP Availability

Monitor the following HTTP requests:

HTTP requests (internal). Monitor HTTP requests issued internally.
HTTP requests (per ISP). Monitor HTTP requests issued from ISP networks to track whether or not users of the monitored ISP networks can access your site.
HTTP requests (per geographic location). Monitor HTTP requests issued from different geographic locations (New York, San Francisco, London, Paris, Munich, Tokyo, Singapore, and so on) to track whether or not users from respective areas of the world can access your site.

Downtime occurs when the site fails to return a page or returns a page with an incorrect response. The formula for measuring HTTP availability is as follows:

(Number of successful HTTP requests/number of total HTTP requests issued) x 100%
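The following Python sketch applies this formula to a set of probe results grouped by where each request was issued from (internal, per ISP, or per geographic location). The ProbeResult type, its field names, and the sample data are hypothetical. Treating "status 200 with the expected content" as success is one possible reading of the downtime definition above, not the only one.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class ProbeResult:
    source: str            # e.g. "internal", an ISP name, or a city
    status: Optional[int]  # HTTP status code, or None if no page was returned
    content_ok: bool       # True if the page held the expected response


def is_success(result: ProbeResult) -> bool:
    """A request succeeds if a page came back and its response was correct."""
    return result.status == 200 and result.content_ok


def http_availability_by_source(results: Iterable[ProbeResult]) -> Dict[str, float]:
    """Return HTTP availability (percent) for each probe source."""
    totals: Dict[str, int] = defaultdict(int)
    successes: Dict[str, int] = defaultdict(int)
    for result in results:
        totals[result.source] += 1
        if is_success(result):
            successes[result.source] += 1
    return {src: successes[src] / totals[src] * 100.0 for src in totals}


# Hypothetical probe results.
sample = [
    ProbeResult("London", 200, True),
    ProbeResult("London", 200, True),
    ProbeResult("London", 200, False),  # page returned, incorrect response
    ProbeResult("London", 200, True),
    ProbeResult("Tokyo", 200, True),
    ProbeResult("Tokyo", None, False),  # no page returned
]
for source, percent in http_availability_by_source(sample).items():
    print(f"{source}: {percent:.1f}%")
```

Keeping the results separate per source makes it possible to tell a site-wide outage from a problem that affects only one ISP or region.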

Monitor Performance Metrics

Monitoring performance is not strictly part of monitoring availability. However, monitoring performance can sometimes provide advance warning about potential problems that can affect availability if you do not address them.

Monitor the following performance metrics:

Number of visits (per day/week/month). Monitor site traffic information to assess the level of site activity. This data is available from the Data Warehouse.
Latency of requests for sets of operations and page groups (per day/week/month). Compare these metrics to your transaction cost analysis (TCA) test results to see how site performance compares to TCA predictions and to identify system bottlenecks.
CPU utilization (per day/week/month). Monitor use on Microsoft Windows servers, SQL Server servers, IIS/Commerce Server servers, middleware, and so on. Group servers by function to make it easier to track and plan site capacity.
Disk storage. Group servers by function and monitor disk capacity (total disk capacity and free space). Review weekly and monthly history, so you can spot trends and plan for expansion.
Disk I/O. Group servers by function and monitor disk input/output (I/O) throughput. Compare weekly and monthly history with the disk I/O rating provided by the manufacturer. If the observed I/O nears the rated disk I/O, consider adding more spindles (adding more drives to the drive stripe set) or redistributing disk I/O across multiple disk controllers.
Fiber channel controller/switch bandwidth. Monitor system area network (SAN) fiber channel controller bandwidth. (A SAN is typically used to interconnect nodes within a distributed computer system, such as a cluster. These systems are members of a common administrative domain and are usually in close physical proximity. A SAN is physically secure.) If the observed bandwidth nears the throughput rating provided by the manufacturer, consider adding more controllers and switches to redistribute traffic and get more aggregate bandwidth.
Memory. Make sure that the amount of available memory is greater than 4 MB. If the system nears this level during peak usage, add more memory to the server.
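Several of the checks above compare an observed value with a rating or a floor. The Python sketch below shows one way to express them. The function names are hypothetical, and the 80% threshold for "nears the rating" is an assumption, since the guidance above does not define a specific figure. The 4 MB memory floor is taken from the Memory item.

```python
MIN_AVAILABLE_MEMORY_MB = 4.0  # floor stated in the Memory item above


def nears_rating(observed: float, rated: float, threshold: float = 0.8) -> bool:
    """Return True if observed throughput is at or above `threshold` of the rating.

    The default threshold of 0.8 is an assumed value, not a documented one.
    """
    if rated <= 0:
        raise ValueError("rated must be positive")
    return observed / rated >= threshold


def memory_is_low(available_mb: float) -> bool:
    """Return True if available memory is not greater than the 4 MB floor."""
    return available_mb <= MIN_AVAILABLE_MEMORY_MB


# Hypothetical readings taken during peak usage.
if nears_rating(observed=170.0, rated=200.0):
    print("Disk I/O is nearing the manufacturer's rating.")
if memory_is_low(available_mb=3.5):
    print("Available memory is at or below 4 MB; add memory to the server.")
```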
