Appendix A: The Effect of a Site Failure

Published: September 1, 2004

Before you start to evaluate recovery options for your site and make decisions about allocating resources for backup and recovery, you must understand the impact of a site failure.

An SMS site consists of various site systems and clients. Each site has one site system that is the site server. Site servers are the most important servers in SMS sites. The site server monitors and manages the site. It is used to initiate many administrative tasks such as distributing new software to clients, troubleshooting problems on client computers, and metering software on the clients.

Site servers are also significant because you use them to create new site systems for the site. If the site server is not functioning when a remote site system fails, you cannot set up a new site system until you recover the site server. You can back up site systems and restore them if needed, but doing so requires a large investment and is generally not a recommended strategy.

In the event of a site server failure, your ability to manage the site’s clients is extremely limited. Even if you do not need the site server to create new site systems, the other management tasks that are no longer available make it almost impossible to manage the site.

Each site has one site system that is the SMS site database server, at least one site system that is a client access point (CAP), and possibly one or more site systems that are management points. CAPs and management points are the site systems that allow the site server and the site’s clients to communicate.

In addition to these essential site systems, a site can have any number of additional site systems that perform different roles in the site. For example, to distribute software, a site must have at least one distribution point. To support Advanced Clients, a site must have at least one management point. SMS supports site systems that are set up either on the same computer as the site server or on a separate computer.
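The role requirements named above can be expressed as a small lookup. The following is only an illustrative sketch: the dictionary and function names are hypothetical and are not part of SMS, and only the two requirements stated in the preceding paragraph are encoded.

```python
# Role each feature depends on, as stated in the paragraph above.
# The names used here are hypothetical and exist only for illustration.
REQUIRED_ROLE = {
    "software distribution": "distribution point",
    "Advanced Client support": "management point",
}


def supported_features(site_roles):
    """Return the features whose required role is present in the site."""
    return sorted(
        feature
        for feature, role in REQUIRED_ROLE.items()
        if role in site_roles
    )


if __name__ == "__main__":
    # A site with a distribution point but no management point.
    print(supported_features({"site server", "distribution point"}))
```

A site that gains a management point would then also report Advanced Client support.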

The site server and any of the other site systems can fail and stop providing the services they regularly provide. If multiple site systems are installed on the same computer and that computer fails, all services regularly provided by those site systems become unavailable.
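To make the effect of hosting several roles on one computer concrete, the following sketch models a site as a mapping from computers to the roles they host. The computer names and the layout are hypothetical, and the model is a planning aid under those assumptions, not a description of how SMS tracks site systems.

```python
# Hypothetical layout: computer name -> site system roles hosted on it.
SITE_LAYOUT = {
    "SRV01": {"site server", "SMS site database server"},
    "SRV02": {"client access point", "distribution point"},
    "SRV03": {"client access point", "management point"},
}


def roles_lost(layout, failed_computer):
    """Return every role that the failed computer was hosting."""
    return set(layout.get(failed_computer, set()))


def roles_without_alternative(layout, failed_computer):
    """Return roles on the failed computer that no other computer hosts."""
    surviving = set()
    for computer, roles in layout.items():
        if computer != failed_computer:
            surviving |= roles
    return roles_lost(layout, failed_computer) - surviving


if __name__ == "__main__":
    for name in sorted(SITE_LAYOUT):
        print(name, "loses", sorted(roles_lost(SITE_LAYOUT, name)),
              "with no alternative for",
              sorted(roles_without_alternative(SITE_LAYOUT, name)))
```

In this assumed layout, a failure of SRV02 removes a CAP and a distribution point at the same time, but a second CAP remains on SRV03, so only the distribution point role has no alternative.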

Because each site system in the site provides different functionality, the impact of a failure on the site is different, depending on the role of the site system that failed. The following table contains information that can assist you in evaluating the impact of failure in your site.

Table A.1  Impact of Site System Failure

SMS site system

Impact of failure

Site server

The failed site server cannot receive or process any data from the site’s clients, from child sites, or from parent sites. Therefore, data such as hardware and software inventory and status messages from clients accumulates at the site systems.

You cannot perform site management tasks from the failed site server, such as:

  • Remotely controlling and troubleshooting clients.

  • Managing or creating software distribution packages, programs, or advertisements.

  • Managing or creating software metering rules.

  • Managing or creating collections.

  • Managing or creating reports or dashboards.

  • Managing or creating site systems.

SMS site database server

The site server cannot access the site database to store or to retrieve data. You cannot perform any administrative task from the site server.

Distribution point

BITS-enabled distribution point

You cannot store software distribution packages on the failing distribution point. Therefore, clients must use alternative distribution points or they will not be able to run advertised programs.

You cannot perform Software Distribution tasks such as:

  • Distributing or managing software distribution packages using the failing distribution point.

  • Enabling or disabling BITS support on the failing distribution point.

Clients cannot use the failing distribution point to perform tasks such as:

  • Running advertised programs.

Client access point

No data propagates between the site server and the site’s Legacy Clients through the failing CAP. Software and hardware inventory data and status messages from Legacy Clients cannot reach the site server. Configuration data from the site server cannot reach Legacy Clients. Legacy Clients must use an alternative CAP if one exists.

The failing CAP cannot perform tasks such as:

  • Discovering or installing new Legacy Clients.

  • Updating configuration data of client agents for Legacy Clients.

  • Managing advertisements and advertising new programs to Legacy Clients.

Legacy Clients cannot use the failing CAP to perform tasks such as:

  • Receiving information about new advertisements.

  • Updating site-wide client configuration settings.

  • Updating the Software Metering rule base.

Management point

No data propagates between the site server and the site’s Advanced Clients through the failing management point. Software and hardware inventory data and status messages from Advanced Clients cannot reach the site server. Configuration data from the site server cannot reach Advanced Clients.

The failing management point cannot perform tasks such as:

  • Discovering new Advanced Clients.

  • Processing data from Advanced Clients, such as software and hardware inventory.

  • Processing and propagating policies from the site server to Advanced Clients.

Advanced Clients cannot use the failing management point to perform tasks such as:

  • Receiving information about new advertisements.

  • Updating the Software Metering rule base.

There is no failover from a resident management point to another management point.

If both the assigned management point and the resident management point are failing, all client functions are affected.

Server locator point

Clients cannot be installed using the Logon Script-initiated Client Installation method (capinst.exe).

Reporting point

You cannot use the failing reporting point to view reports.

If any of the site systems listed in Table A.1 remain in a failed state for a long period of time, clients can be severely impacted. However, the impact on clients is minimal if you restore functionality to the failing site system quickly.

For a short period of time, there is minimal impact on the following client operations:

  • Advertised programs that were scheduled to run before a CAP failed will run at their scheduled time, even if the CAP is not functioning. However, depending on which site system is failing, status from the clients might not propagate to the site server, and new programs cannot be advertised to the clients.

  • Software continues to be metered on clients, even if the site server is not functioning.

  • Software and hardware continue to be inventoried on clients, regardless of which site system has failed. However, the inventory data might not propagate to the site server, depending on which site system has failed.

Also, if any component server has failed, it might affect communication with other sites, but otherwise, the site server continues to function with minimal interruption. However, you will receive numerous error and warning status messages to inform you of the problem.

When a site system is not functioning, data starts to accumulate on clients or on other site systems. As soon as the failing site system regains functionality, the accumulated data propagates to the repaired site system, where it is processed in the regular manner.
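The accumulate-then-propagate behavior described above follows a general store-and-forward pattern. The sketch below is a toy model of that pattern only: the `SiteSystem` and `Sender` classes are hypothetical and do not represent the actual SMS components, file formats, or retry logic.

```python
from collections import deque


class SiteSystem:
    """Toy stand-in for a site system that can be up or down."""

    def __init__(self):
        self.available = True
        self.processed = []

    def receive(self, record):
        if not self.available:
            raise ConnectionError("site system is not functioning")
        self.processed.append(record)


class Sender:
    """Toy stand-in for a client or upstream site system that buffers data."""

    def __init__(self, target):
        self.target = target
        self.backlog = deque()

    def submit(self, record):
        # Always queue first, then try to deliver everything queued so far.
        self.backlog.append(record)
        self.flush()

    def flush(self):
        # Deliver oldest records first; stop at the first failure and
        # keep the undelivered records for a later attempt.
        while self.backlog:
            try:
                self.target.receive(self.backlog[0])
            except ConnectionError:
                return
            self.backlog.popleft()


if __name__ == "__main__":
    cap = SiteSystem()
    client = Sender(cap)
    client.submit("inventory-1")      # delivered immediately
    cap.available = False
    client.submit("inventory-2")      # accumulates while the target is down
    cap.available = True
    client.flush()                    # backlog drains after recovery
    print(cap.processed, list(client.backlog))
```

Records are removed from the backlog only after the target accepts them, so nothing queued during the outage is lost and the original order is preserved when the target recovers.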