Chapter 2 - Directories

Article
02/20/2014

Archived content. No warranty is made as to technical accuracy. Content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

From the book Designing Distributed Applications With XML, ASP, IE5, LDAP and MSMQ by Stephen Mohr. (ISBN: 1861002270). Copyright ©1999 by Wrox Press, Inc. Reprinted by permission from the publisher. For more information, go to https://www.wrox.com.

In order for the loosely coupled network applications we have introduced to work as we envision, behaving cooperatively, we will need help finding the data and services. The old methods of pointing to a specific server or initializing an application from a file will not suffice in a network undergoing continuous change. Decentralized clients need to get a centralized view of the network from a globally accessible source. Fortunately, a technology which will be new to many PC programmers, and which has actually been some time in arriving, will help us. This technology is directory services.

In this chapter we will learn the basics of directories: what they are, how they function, and how we can reach them. In particular, we will learn a bit about the Lightweight Directory Access Protocol (LDAP) and how to locate a networked object's information in a directory. We will look at the particulars of the Active Directory, Microsoft's implementation of directory services for Windows 2000, as an example of how to use a directory service to locate services as part of our distributed application development strategy. We will give specific examples using a particular component library, the Active Directory Service Interfaces (ADSI) for accessing the Active Directory from script. Finally, we will see how an existing item in the Active Directory schema can be put to use helping our applications find the sources of data they will require.

Although we will gain a certain amount of background information regarding directories and LDAP in this chapter, our focus is entirely on locating one of our ASP-based services when we have only provided a specific class or data. Once found, we want to know how to connect to the service, what language it speaks, and how we pose queries to it.

Windows 2000 is in beta as we go to press. If you do not have access to this software, rest assured you can still make use of the scheme we present. This chapter will provide valuable background information on directories, as well as presenting the technique we shall use throughout this book. The software that accompanies this book, which may be downloaded from https://www.wrox.com and https://webdev.wrox.co.uk/books/2270, includes a stub version of our directory component. This stub can be used with the rest of the software in the book in the absence of Active Directory software. Even if you do not have Active Directory or some other LDAP-compliant directory service, it is important to plan for this and build applications that are directory ready. We will shortly see why this is so important and why it deserves to be one of the five principles.

Directory Basics

A directory service is much like a telephone directory. Finding a directory is easy; we usually have a copy at home and a copy at work to help us find local telephone numbers, and when we want a number that is outside our area we can call a service provided by the telephone company. We simply provide some information that helps us find the number we are looking for, and when we find the entry it gives us everything needed to reach the party represented by the entry.

Continuing the analogy, a telephone directory contains information about many types of objects. We think all the entries are alike, but we have entries for offices, stores, individual people, even other services that will yield still more information. In addition to telephone numbers, most printed directories provide addresses, maps, and some textual information. A telephone directory has a simple organizational scheme but is able to store information about many things and let us retrieve it.

In computing terms, a directory is a specialized repository of information that describes the nature and network location of devices, users, and services on the network, together with some sort of programmatic interface to that repository. One of the key things that differentiates directory services from other types of data store is that they are highly optimized for frequent read access. Although there is obviously a mechanism for writing entries, the information is queried far more frequently than it changes. In addition, we typically do not make ad hoc queries of a directory. We tend to know what we are looking for and either navigate to it in a hierarchical fashion or perform a search on certain specific attributes. It is worth noting that directories are not object brokers, such as are found in CORBA, although object brokers share some functions with directories. Finally, directories do not generally connect you to particular resources, but rather tell you where to find them and how you should connect.

It is fair to say that directories aren't restricted to a single structure. In practice, however, most directories do adopt a tree structure. This tree structure is very similar to how the file system on your computer is organized. For example, if you open Windows Explorer you see the various drives on your computer. In your hard drive you will find a list of folders, within which there may be other folders, and possibly some files. You can keep drilling down until you just get to a folder containing the files you are interested in.

The tree structure makes it very easy to name things and navigate to them. For example, I could have two files, both about this chapter, on my C drive called Chapter2.doc, but using their pathnames I can distinguish the document:

C:\My Documents\original\Chapter2.doc

where I made notes for what the chapter would contain, and the final version you are reading now:

C:\Books\DDA\Chapter2\Final\Chapter2.doc

These pathnames show the exact location of the file in the tree structure. We can break a pathname down into its components. In this case the components are separated by backslashes, although other types of tree could use forward slashes, commas, and so on. The folders in the tree's structure can be considered container objects: each one branches out, taking you to a more specific part of the tree, while the files are the leaves on the tree. If the tree is well designed, the containers tell us something about the type of files they contain.
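As a minimal sketch of this breakdown, the following Python fragment splits the final pathname above into its containers and its leaf. The function name containers_and_leaf is our own invention for illustration; it relies only on the standard pathlib module.

```python
from pathlib import PureWindowsPath


def containers_and_leaf(pathname):
    """Split a Windows pathname into its container names and its leaf name."""
    path = PureWindowsPath(pathname)
    # parts begins with the drive root, continues with one entry per folder,
    # and ends with the file itself.
    return list(path.parts[:-1]), path.name


containers, leaf = containers_and_leaf(r"C:\Books\DDA\Chapter2\Final\Chapter2.doc")
print(containers)  # the drive root followed by each folder on the way down
print(leaf)        # the file at the end of the path
```

PureWindowsPath is used so that the backslash-separated form is understood on any operating system, not only on Windows.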

For us, both the leaves and branches on the tree are objects of interest in the network. Unlike a file system, both the leaves and containers of a directory can contain information. When we are talking about a network, we often use the term nodes instead of folders. These nodes within the tree represent objects that contain other objects. Some of the containers are domains, while others are particular computers. For example, I could find my computer on a network using a path like this (which is rather similar to the way I could identify a file on my computer by giving the pathname):

WinNT://DOMAINNAME/COMPUTERNAME

The IETF (Internet Engineering Task Force) added the idea of domain naming so those directories written to an early standard, X.500, could be rooted within the domain name system of the Internet. Thus, an object in a directory housed on the fictional MegaWidgets.com server is said to reside within the MegaWidgets domain, which is in itself a part of the com domain of the Internet. Within a domain, other containers may be used to compose a structure that is useful to the owning organization and descriptive of the network. Here is a conceptual view of a generic directory:

Namespaces

The ability to offer a name and receive information about the named object is central to directory services. In fact, we say that a directory is a namespace — a bounded entity in which names are uniquely mapped to information. Since networks can be large and contain many items of interest, we need a formal method of naming the items in our directory. LDAP provides that through the idea of distinguished names (DN), which we meet shortly.

LDAP roots its namespace in the domain name system of the Internet. The top level containers are the top level domain names of the Internet. We can readily make our directory servers accessible to our external partners via the Internet using the proper distinguished name when our directory is so rooted.

We navigate through the namespace by specifying a path through the namespace. Consequently, any object can be identified by a combination of containers and a local name which, taken together, define the path through the directory needed to locate the object within the namespace. Some directories, notably Microsoft's Active Directory, also include the concept of domains. Each domain holds objects and containers for objects.

Directory Servers

Information in a network ultimately comes from a server. A directory service consists of one or more servers providing directory look-ups. These servers must be accessible to all devices and users on the network to be useful. If we require all our clients to know the names of our directory servers, however, we will be repeating the mistake of hard-coding machine names. Everything can change in a network, even the number, nature, and location of directory servers. So, directory service APIs need a mechanism allowing clients to discover the directory servers available to them. For example, Active Directory provides a mechanism, as a part of the operating system, so that any machine can find the nearest directory server. Once found, we can use the directory schema to gain a sense of the network and expand our search to other servers. This is similar to how telephone directory assistance functions in the United States. You can call a single number from any telephone and reach the nearest directory. From there, you can obtain additional information — the area code — that allows you to compose the telephone number for the directory service of some remote area. You do not know where that service is located or how it is implemented, only how to find it, and from it, any other directory anywhere in the world. In this book we will be paying particular attention to Microsoft's Active Directory.

Active Directory

Let's just clear up one frequent cause of confusion: Active Directory and ADSI might have very similar names, but they are not the same thing. Active Directory is a directory store, and is just one of the things that you can access using ADSI. ADSI is available now, while Active Directory will be introduced with Windows 2000 Server (although it's actually already available with the Windows 2000 beta).

Active Directory forms part of the Windows 2000 Server operating system. It is intended to replace the NT domains of NT4 and is likely to be a very important directory on your network. Why? Because it's not just a directory, but it is the directory of network resources in a Windows network — which computers are in which domain, what printers, scanners etc. are located where, and who has security rights to do what. In other words, it's a central directory of everything you need to control your network. In fact (at least on the current beta version of NT5), if you decide to promote a server to be a domain controller you do so by installing Active Directory. So running Active Directory is synonymous with being a domain controller.

Netscape Directory Server

We won't actually be using Netscape Directory Server in this book, but it deserves a mention because it is a well regarded, general-purpose LDAP directory that was available at the time of writing. It traces its lineage to the original LDAP implementation at the University of Michigan. Netscape Directory Server can be used to store whatever you want. It is generally regarded as being highly scalable and reliable, and conforms to LDAP version 3, an open standard that describes how the data in the server can be accessed. It's possible to access data stored in Netscape Directory Server using ADSI. We'll explain more about LDAP later in this chapter.

Having had an overview of Directory Services, let's take a look at how they are organized, using the concept of Schemas.

Schemas

The schema defines the kind of information that can be stored in a particular directory. In an LDAP directory the schema defines this in terms of the object classes and attribute types that the directory can contain.

Each directory server in a particular network uses a common schema. The schema is made up of two types of rule, which together organize the information in the directory:

structure rules
content rules

We've talked a bit about general structure of a directory service, now let's take a closer look at a schema using an example. Here we have a fictional directory for Wrox with details of Schema, Staff, and Books — everything. The diagram below shows a part of this tree:

Structure Rules

At the top level there's an object called Wrox, which is the company's domain. Below it in the tree structure is a container called Staff, which contains details of all the employees in the company. Obviously, we don't want to store information about our books in the same container as the staff, so we need some rules about what is allowed to go where. These rules will be implemented by the directory service, so that, if a client application tries to make a change to the directory that goes against these rules, the directory service will refuse to make the change. These types of rules are known as structure rules. Quite simply, they are the rules that define the structure of the directory.

Content Rules

But we will also want to add detail to the objects. For example, we want to store each member of staff's position, e-mail address, department, etc. with the employee's record. These bits of information are known as attributes or properties. But, if someone new joins, who doesn't understand our system, he might accidentally try to add a release date or ISBN number to a member of staff. To prevent this happening (and any offence being taken due to such a mistake) we clearly need some rules that determine what information is allowed to be stored with each entry. Such rules are known as content rules.

Every entry in the tree is of some object class. It is the entry's object class that determines, through the content rules, which attributes the entry may contain. Because the containers and leaves have attributes, information contained in the directory tree is not all held at one level; rather it is spread throughout the nodes in the tree.

So, it is these structure rules and content rules that are collectively known as the schema. In Active Directory each domain's schema is represented by entries stored in the directory itself.
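To make the two kinds of rule concrete, here is a toy sketch in Python of how a directory service might check a proposed entry for the fictional Wrox directory. The class names, attribute names, and the check_entry function are all hypothetical; a real directory service stores and enforces its schema internally and far more thoroughly.

```python
# Structure rules (hypothetical): which object classes may sit directly
# beneath a container of a given class.
STRUCTURE_RULES = {
    "domain": {"staffContainer", "bookContainer"},
    "staffContainer": {"employee"},
    "bookContainer": {"book"},
}

# Content rules (hypothetical): which attributes an entry of a given
# object class may carry.
CONTENT_RULES = {
    "employee": {"name", "position", "email", "department"},
    "book": {"title", "isbn", "releaseDate"},
}


def check_entry(parent_class, object_class, attributes):
    """Return a list of rule violations; an empty list means the entry is allowed."""
    problems = []
    if object_class not in STRUCTURE_RULES.get(parent_class, set()):
        problems.append(f"structure: {object_class} not allowed under {parent_class}")
    allowed = CONTENT_RULES.get(object_class, set())
    for name in sorted(set(attributes) - allowed):
        problems.append(f"content: {name} not allowed on {object_class}")
    return problems


# A member of staff with an ISBN breaks a content rule.
print(check_entry("staffContainer", "employee", {"name": "A. Editor", "isbn": "1861002270"}))
# A book placed in the Staff container breaks a structure rule.
print(check_entry("staffContainer", "book", {"title": "Implementing LDAP"}))
```

The point of the sketch is only that the two checks are independent: one looks at where the entry sits in the tree, the other at what the entry carries.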

The directory service uses the schema in satisfying namespace queries. Although most schemas will have common entries, such as those for users, computers, and protocols, there is no need for every directory that is connected to the Internet to share the same schema.

This is where we meet LDAP. The purpose of LDAP is to provide a common language, which servers can use to name things. Using LDAP, you can discover information about the schema supported by a particular directory.

Active Directory has an extensible schema. While it supports a rich collection of several hundreds of object classes and attribute types out of the box, with care, network administrators and third parties can define new classes and attributes. This is sometimes done to aid in managing unique resources or applications.

LDAP in General

Given the different implementations of directories across the Internet, a common access protocol and programmatic interface is desirable. That is the role filled by LDAP. LDAP provides a means of talking to a directory service and attempting to obtain data about some object contained therein. Since most directories are hierarchical in nature, LDAP can use a standard format for path names.

Distinguished Names

At the root of LDAP's syntax is the notion of distinguished names. As we have seen, a distinguished name (DN) uniquely identifies an object if it describes a path to that object from a known starting point. A DN is a series of comma delimited name-value pairs.

Domains are designated using the LDAP attribute for a Domain Component, DC, with one DC pair for each label of the domain's DNS name. The two kinds of high-level containers are Organization (O) and Organizational Unit (OU). Below this, all objects are named using a Common Name (CN). Thus, a server belonging to the Customer Service organizational unit of the fictional MegaWidgets company might have the distinguished name:

CN=PrintSrv, OU=Customer Service, DC=MegaWidgets, DC=COM

We can compose a DN for a container rather than an object if we wish to locate a container so as to enumerate all the objects contained within it. There is also the notion of a relative distinguished name (RDN). The RDN for an object or container is that part of the DN which is specific to the item itself. The RDN for our server in the example above is CN=PrintSrv.
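The relationship between a DN, its RDN, and the DN of its container can be sketched in a few lines of Python. This is a simplified illustration only: the function names are ours, and the parsing assumes plain name-value pairs with no escaped commas or multi-valued RDNs, both of which the full LDAP syntax permits.

```python
def split_dn(dn):
    """Split a simple DN into (attribute, value) pairs, most specific first.

    Assumes no escaped commas and no multi-valued RDNs.
    """
    pairs = []
    for part in dn.split(","):
        attribute, _, value = part.strip().partition("=")
        pairs.append((attribute.upper(), value.strip()))
    return pairs


def rdn(dn):
    """Return the part of the DN that is specific to the item itself."""
    attribute, value = split_dn(dn)[0]
    return f"{attribute}={value}"


def parent_dn(dn):
    """Return the DN of the container that holds the item."""
    return ", ".join(f"{a}={v}" for a, v in split_dn(dn)[1:])


server = "CN=PrintSrv, OU=Customer Service, DC=MegaWidgets, DC=COM"
print(rdn(server))        # CN=PrintSrv
print(parent_dn(server))  # OU=Customer Service, DC=MegaWidgets, DC=COM
```

Note that the parent DN names a container, which is exactly the kind of DN we would compose in order to enumerate everything in the Customer Service organizational unit.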

We will use DNs throughout this chapter. They are used as part of queries for objects and to bind to a specific object in order to obtain the attributes that describe it.

Directory Access: The LDAP API and COM Components

The LDAP specification includes a C language API that all LDAP compliant directory services must support. We will not delve into this API in this chapter.

If you are interested in LDAP pick up a copy of Implementing LDAP, ISBN 1-861002-211, from Wrox Press.

Our development philosophy places strong emphasis on component software construction. Writing an entire component or plug-in to afford access to the Active Directory would undoubtedly be time consuming, and is not really necessary. Fortunately, Microsoft provides a library of components, the Active Directory Service Interfaces (ADSI), which performs this task for us. As we shall see, we will use ADSI for binding, although we will also need to use another component library, ADO (ActiveX Data Objects), for queries due to current limitations on the use of ADSI from Web scripts. We will retain the LDAP syntax, however, so that readers who wish to use the low-level LDAP API will have a point from which to start.

Microsoft Active Directory

Microsoft Windows NT 5.0, to be known as Windows 2000 Server when released, introduces an LDAP-compliant directory service as part of the operating system, known as the Active Directory. Active Directory uses DNS, the Domain Name System, as its naming and location service.

The Domain Name System (DNS) is familiar to us all, and is defined in RFCs 1034 and 1035. It is designed to provide a human-friendly name for the location of computers on the Internet. DNS names are formed from labels separated by dots, for example www.wrox.com. NT4 domains didn't necessarily have any connection with Internet domain names. This is important, because with Windows 2000 it will be possible to form domain trees and give the domains DNS-style names, allowing Active Directory to integrate well with IP-based inter- and intranets.
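Because each label of a DNS-style domain name corresponds to one DC component of a distinguished name, converting between the two forms is mechanical. The following Python sketch shows the idea; the function names are hypothetical and the DN parsing makes the same simplifying assumption as before (no escaped commas).

```python
def dns_to_dn(dns_name):
    """Turn a DNS-style domain name into the DC components of a DN."""
    labels = [label for label in dns_name.strip(".").split(".") if label]
    return ", ".join(f"DC={label}" for label in labels)


def dn_to_dns(dn):
    """Recover the DNS-style domain name from the DC components of a DN."""
    labels = []
    for part in dn.split(","):
        attribute, _, value = part.strip().partition("=")
        if attribute.upper() == "DC":
            labels.append(value.strip())
    return ".".join(labels)


print(dns_to_dn("MegaWidgets.com"))
print(dn_to_dns("CN=PrintSrv, OU=Customer Service, DC=MegaWidgets, DC=COM"))
```

Components other than DC, such as OU and CN, describe structure inside the domain and so play no part in the DNS name.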

Active Directory supports the native LDAP API as well as ADSI and ADO queries. The Active Directory store may be replicated among a number of servers for fault tolerance or partitioned across multiple servers for scalability. It inherits the operating system's native security, which defaults to the Kerberos distributed security protocol in this version of Windows 2000. Other protocols, such as SSL and the native security system from earlier versions of Windows NT, can also be used, thereby affording maximum flexibility to system administrators.

Structure

Active Directory scales by joining small-scale structures of computers into larger groupings. This mirrors the hierarchical nature of LDAP directories. This structure begins at the level of domains and continues through domain trees to forests.

Domain

Windows NT security and administration has revolved around the notion of domains since the operating system was introduced. Although the default security model has changed, domains remain a building block for administering and securing Windows NT networks. A domain is simply a group of computers that share a common primary domain controller and use the same security schema.

Typically, domains reflect reliable, high-speed subnets. Domains may be configured to participate in trust relationships, so that a machine in one domain may access an object in another without having security credentials in the second domain. This becomes important as we join domains to create our organizational directory. The Active Directory Tree Manager automatically establishes the appropriate trust relationship when two domains are joined.

In Windows NT 3.x and Windows NT 4, one domain controller was set as primary (PDC) and Backup Domain Controllers (BDCs) were used to support the PDC. Active Directory has a multi-master model, in which the authoritative DCs accept change requests and then propagate them directly down the tree. This multi-master model, combined with transitive trust relationships, makes Active Directory a perfect platform for inter-site replication.

Domain controllers execute the Active Directory service. Depending on how we organize our directory, we may replicate the directory store between domain controllers for fault tolerance or partition the store so that the complete Active Directory contains many more objects than could be accommodated by a single domain controller. It is common for the domain structure of a network to reflect the organizational structure of the business operating the network.

Large domains may be divided into collections of organizational units (OU) to impose a more detailed structure on the directory. OUs may also be used to fine tune access and administration rights. OUs do not, however, affect the hosting of the Active Directory service.

Domains are important in another way. One of the advantages of having a directory is that it allows us to get away from having to remember machine names when we want to find something on the network. If we had to remember the names of our Active Directory servers, we would lose almost all of this advantage. We would be able to discover every machine except the directory servers. LDAP and the Active Directory address this through a technique known as serverless binding, in which the client omits the server name and the operating system locates a domain controller for the client's domain. From this starting point, we can walk up and down the directory to discover other domains and domain trees within the directory.

Domain Trees

Just as a company joins groups into divisions, Active Directory allows us to join domains together to reflect a larger organizational entity. User and administrative rights may be granted for all domains or a select list of domains. Not only does this control access by users, it simplifies administration. Responsibility for administering the Active Directory can be delegated to individual domain administrators.

Domains are joined in a hierarchy to create a domain tree. Domains can contain other domains. For example, suppose Sales and Customer Service are domains that have been joined as children of the External Relations domain.

Now suppose an application in the Customer Service domain is searching for an object within its domain. The appropriate DN would include:

LDAP://DC=Customer Service

If it needs to search for an object in Sales, the DN would include:

LDAP://DC=Sales, DC=External Relations

The application would either know the name of the domain it wishes to search, or would query the directory to obtain the names of all the domains.

So, when do we create a domain tree? All domains within a domain tree form a single DNS namespace. If we are joining networks that have separate DNS namespaces, two domain trees are called for. This has implications for directory searches, as well. A search using the global catalog (which is like an index for the directory) searches the entire directory. If we need to search based on an attribute that is not stored in the global catalog, however, we will only search objects in the current domain tree.

Forest

Obviously, if we can have multiple domain trees in an organization's directory, there must be some overall grouping. This grouping is the forest. While two domain trees in the same forest do not share a DNS namespace, they must share the same schema and global catalog. When domain trees are joined, a trust relationship is established between the root domains of each tree in much the same way as occurred when we joined individual domains into domain trees.

The Active Directory Schema

We have alluded to objects and their attributes. The collection of types of objects — classes — and their attributes is known as the schema for the directory. The schema applies to all levels of the directory. Active Directory contains a rich schema by default. An administrator may extend the schema by creating new classes of objects and new attributes for those classes, thereby extending his directory to become a general-purpose repository for information about networked resources. Classes in an LDAP-compliant schema may inherit from other classes. This allows us to progressively refine the information we publish in the directory by specifying only those attributes that are new to a newly created class. The new class possesses all the attributes of the class from which it was derived.
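Class inheritance in a schema can be pictured with a small Python sketch. The schema below is a toy: the employee class and the particular attribute lists are chosen purely for illustration and are not taken from the Active Directory schema. Each class records only the attributes it introduces, and the full set is found by walking up to the parent classes.

```python
# Toy schema: each class lists its parent and only the attributes it adds.
SCHEMA = {
    "top": {"parent": None, "attributes": {"objectClass"}},
    "person": {"parent": "top", "attributes": {"cn", "sn"}},
    "employee": {"parent": "person", "attributes": {"department", "mail"}},
}


def effective_attributes(class_name, schema=SCHEMA):
    """Collect the attributes of a class together with all inherited ones."""
    attributes = set()
    while class_name is not None:
        definition = schema[class_name]
        attributes |= definition["attributes"]
        class_name = definition["parent"]
    return attributes


print(sorted(effective_attributes("employee")))
```

Defining employee required naming only two new attributes, yet an employee entry may carry everything a person entry can.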

Global Catalog

Occasionally, we will want to be able to find an object in the directory without knowing its DN. This might occur because we do not know what domain contains the object. Other times, we might wish to find all objects fitting a particular description. Provided we know the value of some key attributes, we can perform a search of the global catalog. The global catalog contains a partial copy of every object in the directory and as such acts like an index to the contents of the directory. For each class of objects, the Active Directory schema specifies a base set of attributes, which is stored in the global catalog. Administrators may add to these sets, and can specify which attributes are stored in the global catalog when creating a new class. While we can add attributes to the default list copied in the global catalog, adding attributes has an adverse effect on storage requirements and indexing time. At the extreme, if we stored all attributes we would have two copies of our directory! If we need to search outside the domain tree of the searching user, for an attribute not copied to the global catalog, we will need to repeat our search in each domain tree. The global catalog is maintained automatically by the directory replication system.
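The trade-off described here can be illustrated with a short Python sketch in which the global catalog is modeled as a partial copy of every entry. All entries, attribute names, and function names below are hypothetical, and a real global catalog is built and indexed by the replication system rather than by application code.

```python
# Hypothetical full directory entries, keyed by DN.
DIRECTORY = {
    "CN=PrintSrv, OU=Customer Service, DC=MegaWidgets, DC=COM": {
        "cn": "PrintSrv", "objectClass": "computer", "location": "Building 2",
    },
    "CN=OrderDesk, OU=Sales, DC=MegaWidgets, DC=COM": {
        "cn": "OrderDesk", "objectClass": "computer", "location": "Building 5",
    },
}

# Only these attributes are copied into the catalog (an assumption for the sketch).
CATALOG_ATTRIBUTES = {"cn", "objectClass"}


def build_global_catalog(directory, catalog_attributes):
    """Keep a partial copy of every entry, holding only the catalog attributes."""
    return {
        dn: {name: value for name, value in entry.items() if name in catalog_attributes}
        for dn, entry in directory.items()
    }


def search_catalog(catalog, attribute, value):
    """Return the DNs of all entries whose cataloged attribute matches."""
    return sorted(dn for dn, partial in catalog.items() if partial.get(attribute) == value)


catalog = build_global_catalog(DIRECTORY, CATALOG_ATTRIBUTES)
print(search_catalog(catalog, "cn", "PrintSrv"))          # found without knowing the DN
print(search_catalog(catalog, "location", "Building 2"))  # not cataloged, so nothing found
```

The second search comes back empty even though a matching entry exists, which is the situation in which we would have to fall back to searching each domain tree in turn.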

Structure

Any particular object belongs to a single domain. An application executing on a particular machine traverses the directory structure through the use of distinguished names. If the desired object is in another domain, the application discovers the domain root and composes the appropriate DN from that root to the object. Subsets of the attributes of the objects in the directory are also stored in the global catalog. This permits rapid searching of the entire directory without walking the entire tree. Deciding what attributes to store in the global catalog is an important design issue if the catalog is to be kept to a manageable size.

Active Directory also has the concept of sites. These should not be confused with Web sites. Rather, they are subnets, usually for a local facility, like a LAN at a particular operating location. This is not, however, an LDAP structural element so we will not spend time on it here.

Hosting and Replication

Large networks need multiple directory servers for several reasons:

The loss of a single server should not result in the loss of the directory
Loss of a portion of the network topology should not isolate a user from the directory simply because the directory server is no longer accessible from the isolated subnet
Even under normal operation, it is simply not efficient to have all directory accesses traversing the entire network to reach a central directory server

For these reasons, Active Directory allows multiple domain controllers to host the directory service. Keeping the information stored on multiple servers coherent is a task for the replication service. Permitting multiple domain controllers to act as directory servers while resolving replication collisions with other servers is a concept known as multi-master replication.

Domain Controllers

Each domain in Windows 2000 has one or more domain controllers. Each domain controller stores a copy of the domain's directory namespace. Since all computers must connect to the network at boot-up to obtain network resources, they will connect with a domain controller, and subsequently will learn the location of all domain controllers in the domain. Therefore the directory is automatically visible to any computer entering the domain.

Replication

Generally speaking, replication is automatic and should never explicitly concern the application programmer. It is the job of the network administrator to establish policies governing the frequency of replication. Changes made to the directory must, however, be reflected in the directory store at all other directory servers within the domain. If these changes affect the global catalog, they must be passed along to all domain controllers in the network.

To be realistic, it is not practical to replicate changes immediately to all servers in a large forest, particularly while clients are referencing the directory (although the changes must be replicated eventually). Moreover, as we shall see in a moment, any of the directory servers can make changes to their particular copy at any time. Thus, there is no guarantee of absolute consistency among replicas of the directory at any given instant. Over time, however, the replicas are reconciled and the directory becomes coherent. Active Directory uses sophisticated mechanisms to ensure replication updates between servers are dampened and do not propagate endlessly across loops in the directory topology. Similarly, great care is taken to resolve collisions — the situation that occurs when two replicas contain differing updates of the same object.

Multi-master Replication

There is no single master domain controller in the Active Directory architecture. Nor is there a single server controlling the replication of changed directory objects between servers in the forest. Instead, all domain controllers hosting the directory service are seen as peers. We say that the Active Directory uses multi-master replication. This affords a high degree of availability as no single server failure results in loss of the directory service. Additionally, clients making changes to directory objects are using a server close to them in network terms. Because there is no need for all clients to use a single server to make changes, the processing load from directory changes becomes distributed across all the servers hosting the directory.

When a domain controller writes a change to a directory object, it also writes an Update Sequence Number as part of a single, atomic transaction. This is simply a sequential series of integers numbering changes to the directory. These sequence numbers are broadcast between servers periodically, allowing servers to request copies of updates they have not yet seen. The sequence numbers are the primary source for determining which version of an object is current. This eliminates the need to enforce time synchronization across a large, distributed network. Timestamps are used, however, as tiebreakers when the sequence numbers are not sufficient to resolve a collision.
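The following Python sketch is a deliberately simplified model of this scheme, not a description of how Active Directory is implemented. Each controller keeps its own sequence counter and remembers how far it has read each partner's changes. The per-object version counter used to compare competing updates is our own assumption, introduced only to make the comparison concrete; timestamps break ties as described above.

```python
class DomainController:
    """Toy peer in a multi-master replication scheme."""

    def __init__(self, name):
        self.name = name
        self.usn = 0       # local Update Sequence Number
        self.objects = {}  # dn -> (value, version, timestamp)
        self.log = []      # (usn, dn) pairs in local write order
        self.seen = {}     # partner name -> highest partner USN already read

    def _store(self, dn, value, version, timestamp):
        # The change and its sequence number are recorded together.
        self.usn += 1
        self.objects[dn] = (value, version, timestamp)
        self.log.append((self.usn, dn))

    def write(self, dn, value, timestamp):
        """Accept an originating change from a client."""
        _, version, _ = self.objects.get(dn, (None, 0, 0))
        self._store(dn, value, version + 1, timestamp)

    def pull_from(self, partner):
        """Request the partner's changes that we have not yet seen."""
        mark = self.seen.get(partner.name, 0)
        for usn, dn in partner.log:
            if usn <= mark:
                continue
            value, version, timestamp = partner.objects[dn]
            _, local_version, local_timestamp = self.objects.get(dn, (None, 0, 0))
            # Higher version wins; the timestamp is only a tiebreaker.
            if (version, timestamp) > (local_version, local_timestamp):
                self._store(dn, value, version, timestamp)
        self.seen[partner.name] = partner.usn


a, b = DomainController("A"), DomainController("B")
a.write("CN=PrintSrv", "Building 2", timestamp=100)
b.pull_from(a)

# Two clients change the same object on different controllers before replication.
a.write("CN=PrintSrv", "Building 3", timestamp=200)
b.write("CN=PrintSrv", "Building 4", timestamp=205)
a.pull_from(b)
b.pull_from(a)
print(a.objects == b.objects)
```

When B pulls from A at the end, it finds that A's copy is identical to its own and stores nothing, so the update does not bounce back and forth between the two peers.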

Since replication isn't instantaneous, application programmers should respect the possibility of changes to data and obtain all related pieces of directory information at the same time, and directory information should never be cached in the application itself.

Client Access

Hopefully, you are now excited about using directories for maintaining configuration information. Having a robust store managed by the network eliminates many concerns formerly borne by the application programmer. We don't have to worry about configuration files or coordinating registry entries. We also don't have to worry about getting configuration changes on a server application out to all possible clients. How, then, can we get information from the Active Directory?

As we have already seen, the Active Directory is LDAP-compliant. We could therefore use the LDAP API for our purposes. This is a low-level API intended for C programmers. Web development, however, is inherently a high-level programming environment, so there would be loud sighs of relief if we could avoid writing bits and pieces of C code just to access the directory. Fortunately, because Active Directory is LDAP-compliant, and ADSI includes an LDAP provider, we can use the ADSI object model to access Active Directory from Web page scripts.

Although we will be using ADSI with the Active Directory, you are not limited to this directory service. You can already use ADSI as your interface to any LDAP-compliant directory with NT4.

Active Directory Service Interfaces

ADSI can be seen essentially as a set of definitions intended to unify directory access, in other words, as a set of common function calls. We shall be using the ADSI LDAP provider, which can be used to access any LDAP-compliant directory.

How this works in practice is shown in the diagram above. The client is the application that needs to make use of the data in the directory. The directory service is the directory, together (in some cases) with some software that makes that directory available to the outside world. In between is the ADSI provider. The ADSI provider is the component that is able both to communicate with a particular directory service, using that service's API, and to talk to clients using the standard ADSI methods. You can almost think of it as a language translator that allows clients that only understand ADSI to talk to directories.

The ADSI provider is actually a dynamic linked library (DLL), which means that communication between the client and the provider is extremely efficient. By contrast, the directory service will normally be in a different process, possibly even on a different machine, so talking to it is much slower. This means that it's useful to keep the number of calls to the directory service itself to a minimum. The ADSI methods handle this by caching a lot of information in the client's process, in something called a property cache.
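The idea behind a property cache can be sketched in a few lines of plain JavaScript. This is an illustration of the caching pattern only, not ADSI's implementation: CachedObject and fetchAll are hypothetical names, and fetchAll stands in for one slow round trip to the directory service.

```javascript
// Hypothetical sketch of a client-side property cache.
// fetchAll represents a single (slow) call to the directory service
// that returns all of an object's properties at once.
function CachedObject(fetchAll) {
  this.fetchAll = fetchAll;
  this.cache = null; // filled on first access
}

// Reload every property in one call, replacing anything cached.
CachedObject.prototype.refresh = function () {
  this.cache = this.fetchAll();
};

// Read from the cache, going to the directory only if the cache is empty.
CachedObject.prototype.get = function (name) {
  if (this.cache === null) {
    this.refresh();
  }
  return this.cache[name];
};
```

However many properties the client reads, the directory is contacted once. The price is that cached values can go stale, which is why an explicit refresh step is part of the sketch.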

By wrapping each directory object in a COM object, ADSI allows the client to appear as if it's talking to the object in the directory itself, even though it's actually talking to a COM object sitting in its own process space, created by the ADSI DLL. The COM object exposes the same properties as the directory object, and it also exposes methods to carry out useful operations, such as enumerating all its children if it's a container object.

We will be concerned with the LDAP provider here, although a WinNT provider is included with the basic ADSI distribution, which exposes a more limited set of objects from Windows NT 4.0 and Windows 2000 Server domain controllers. We bind to an object in the directory to access its attributes. Binding is the process of providing authentication to the directory in order to obtain access to the information contained within. When we do this via ADSI, the object in the directory is presented as an automation object (a COM object that exposes a dual interface and supports the late binding required by scripting languages). This makes it easy to manipulate directory objects from script in Web pages.

We can use one of two methods to bind to an object from script: the operating system's GetObject() method and the OpenDSObject() method of the IADsOpenDSObject interface. GetObject() uses the current user ID and security credentials when binding to an object. This is usually what we want to do, particularly if we are doing this from the client side of an intranet. When using OpenDSObject() we are required to provide a specific user ID, which we need for server-side access. OpenDSObject() also provides encryption to protect the data as it is transferred between the directory and the calling application.

Binding to directory objects from script is easy, but ad hoc directory searches are a problem for Web developers. ADSI provides the IDirectorySearch interface to help with this, but it is not accessible from script code because it is not a dual interface. Only early-binding languages like C++ can make use of this interface. Fortunately, a database provider component is available within the ActiveX Data Objects library that will let us search the directory using either LDAP or SQL queries from our Web pages.

ActiveX Data Objects and LDAP

ActiveX Data Objects (ADO) are a collection of COM components for database access under Microsoft Windows. An interesting feature of ADO is that simply creating a COM component that exposes certain well-defined interfaces will provide us with access to non-relational sources. This enables us to query the directory as if it were a relational database simply by using the ADsDSOObject provider (DSO stands for Data Source Object). This provider (which comes with the ADSI 2.5 beta at the time of writing) is currently read-only, but that will be sufficient for our purposes. If you require write access, however, you can always use ADSI to bind to any object you find. Unfortunately the ADsDSOObject provider is only available with Windows 2000 beta 2 or later.

A thorough treatment of ADO can be found in ADO 2.0 Programmer's Reference from Wrox Press (ISBN 1-861001-83-5), but the basic query access we require can be accomplished with a small subset of the ADO object model. As you might imagine in dealing with databases, we will need to create a Connection object specifying the source and user information. If you have dealt with SQL database programming, you will know that data is returned in a Recordset. We may also use an interface that might not be familiar to you, that of the Command object. This component interprets the command language we will be using. The ADsDSOObject provider is unusual in that it will accept either SQL or LDAP syntax. We'll use LDAP in the rest of this chapter to promote familiarity with that syntax, but we'll also give a SQL example in a little while for the benefit of those readers who may be well versed in SQL.

ADSI Basics

Let's dive in and learn about ADSI programming. The basic tasks of ADSI access are:

Locating a server
Binding to a particular object
Obtaining the values of that object's attributes

We also need to be able to enumerate all the objects in a particular container.

We'll work through enough ADSI to satisfy the requirements of our five principles. ADSI and the Active Directory are too rich and complex to fully explore in one chapter. A full examination of ADSI is found in Professional ADSI CDO (ISBN 1-861001-90-8) and ADSI ASP Programmer's Reference (ISBN 1-861001-69-X), both from Wrox Press.

Serverless Binding

As we noted in our discussion of domains, Active Directory supports serverless binding through the LDAP name rootDSE (which refers to the root of the directory tree on a particular directory server). An anonymous user can therefore connect to the domain controller and retrieve attributes of the server with the following code in an ASP:

var rootDSE = GetObject("LDAP://rootDSE");

// Write each attribute on its own line of the page
Response.Write("Domain: " + rootDSE.Get("defaultNamingContext") + "<br>");
Response.Write("Current Server Time: " + rootDSE.Get("CurrentTime") + "<br>");
Response.Write("DNS Host: " + rootDSE.Get("DnsHostName") + "<br>");
Response.Write("Server Name: " + rootDSE.Get("ServerName") + "<br>");

If the HTTP server is in a different domain from that in which the user logged in, we could use the same approach on the client to retrieve the client's domain. Alternatively, if we want to explicitly connect to a known domain controller, the MegaWidgets domain for example, we should use:

var rootDSE = GetObject("LDAP://MegaWidgets/rootDSE");

The first line of this ASP example is the most important, because this is the line which creates the object that implements the interface. We are telling the operating system to connect to a running directory object, creating a new COM object as a wrapper in response to the GetObject() call using our LDAP name as the name of the object. The domain controller presents the Active Directory service as such an object. Consequently, rootDSE is an object representing the domain's directory subtree.

Readers with some experience in COM programming may be interested to note that binding with all of the ADSI providers is implemented in the Active Directory using COM monikers.

On my domain controller, named Vandenberg in the widgets domain, the result looks like this:

Domain: DC=widgets

Current Server Time: 19981106162743.0Z

DNS Host: VANDENBERG.widgets

Server Name: CN=VANDENBERG,CN=Servers,CN=Philadelphia,CN=Sites,CN=Configuration,DC=widgets

The rootDSE object supports a number of useful properties. The attribute defaultNamingContext is perhaps the most important as it allows us to find our domain; from there we can navigate through domain trees and even the entire directory forest. Note the ServerName attribute. My server, Vandenberg, belongs to the container Servers within the container Philadelphia, which is a site (hence is within the Sites container). A site is part of the network configuration, so Sites is contained in the Configuration container on the widgets domain. (Distinguished names can sometimes tell you more than you ever wanted to know.) One use for retrieving the time on the domain controller could be for reconciling times in specialized applications, or alternatively it could be to impress your friends. The rootDSE object supports the following attributes:

Note that some attributes can hold multiple values concurrently; we call such an attribute a multi-valued attribute.

Attribute	Description
CurrentTime	Time set on the directory server.
SubschemaSubentry	DN for an object exposing the supported classes and attributes of the schema.
DsServiceName	DN for an object exposing the settings in the directory server.
NamingContexts	Multi-valued, DNs for all naming contexts in the server. Default values are Schema, Configuration, and the domain to which the server belongs.
DefaultNamingContext	DN for the domain to which the server belongs.
SchemaNamingContext	DN for the Schema container.
ConfigurationNamingContext	DN for the Configuration container.
RootDomainNamingContext	DN for the root domain of the tree containing this server.
SupportedControl	Unique Object IDs for extension controls supported by the server.
SupportedLDAPVersion	Multi-valued, major version numbers of the LDAP versions supported by this server.
HighestCommittedUSN	Serial number of the latest change notification to the directory.
SupportedSASLMechanisms	Security mechanisms supported for SASL negotiation.
DnsHostName	DNS address of the server.
LdapServiceName	Service Principal Name for the LDAP server.
ServerName	DN for the server object for this server in the Configuration container.

In the table above, SASL refers to the Simple Authentication and Security Layer, a secure means of binding in which the client and the directory agree on an authentication protocol, like Kerberos, instead of passing cleartext passwords.
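Several of these attributes return distinguished names, and we read the ServerName value earlier by walking its containers from left to right. That reading can be sketched as a small plain-JavaScript helper. The function name splitDN is hypothetical, and the sketch assumes a simple name with no escaped commas, which a production parser would have to handle.

```javascript
// Split a distinguished name into { type, value } pairs, most specific first.
// Assumption: the name contains no escaped commas or multi-valued components.
function splitDN(dn) {
  var parts = dn.split(",");
  var result = [];
  for (var i = 0; i < parts.length; i++) {
    var eq = parts[i].indexOf("=");
    result.push({
      type: parts[i].substring(0, eq).replace(/^\s+|\s+$/g, ""),
      value: parts[i].substring(eq + 1).replace(/^\s+|\s+$/g, "")
    });
  }
  return result;
}

// The ServerName value returned by the example domain controller
var serverName =
  "CN=VANDENBERG,CN=Servers,CN=Philadelphia,CN=Sites,CN=Configuration,DC=widgets";
var path = splitDN(serverName);
// path[0] describes the server object itself;
// path[path.length - 1] describes the domain component.
```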

Binding to a Directory Object

The technique we used to bind to the rootDSE object works for any object in the directory provided we know its distinguished name. Let's suppose I've forgotten my e-mail URL and wish to recover it the hard way. I can bind to my user object like so:

var rootDSE = GetObject("LDAP://rootDSE");
var me = GetObject("LDAP://CN=Stephen Mohr,CN=Users," +
rootDSE.Get("defaultNamingContext"));

You'll quickly recognize the first line as we are recovering the domain name using serverless binding. In the next line, I bind to my user object. Starting at the lowest level, I know my common name (CN=Stephen Mohr), and I know all user objects are children of the Users container. The rootDSE object gives me the domain name to complete the distinguished name.
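Since every binding string follows the same pattern, the composition step can be factored out. The following plain-JavaScript sketch shows one way to do so; buildAdsPath is a hypothetical helper of our own, not part of ADSI, and it simply concatenates the pieces described above.

```javascript
// Compose a binding string from a provider name, an optional server,
// the relative names from most to least specific, and the naming context.
// buildAdsPath is a hypothetical helper, not an ADSI function.
function buildAdsPath(provider, server, relativeNames, namingContext) {
  var dn = relativeNames.concat([namingContext]).join(",");
  var prefix = provider + "://";
  if (server) {
    prefix += server + "/"; // explicit server; omit for serverless binding
  }
  return prefix + dn;
}

// Serverless form, as in the example above
var mePath = buildAdsPath("LDAP", "", ["CN=Stephen Mohr", "CN=Users"], "DC=widgets");
```

In an ASP page the last argument would come from rootDSE.Get("defaultNamingContext") rather than a literal, and the result would be handed to GetObject().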

We can also bind to the global catalog at one of three levels: domain, domain tree, or forest. In each scope, the keyword for global catalog, GC, replaces LDAP. We simply need to get the right information to compose the distinguished name for each scope. For the domain we can use the following:

var root = GetObject("LDAP://RootDSE");
var domaincat = GetObject("GC://" + root.Get("defaultNamingContext"));

For the domain tree:

var root = GetObject("GC://RootDSE");
var treecat = GetObject("GC://" + root.Get("rootDomainNamingContext"));

The forest is slightly different. It is the root of the entire tree, so we don't have to specify what context to use. The GC container holds a single object to which we should bind. In VBScript we write:

Set root = GetObject("GC:")
For each child in root
Set bindobj = child
Next

Accessing Attributes

Now that we know how to bind to any object at any level of the Active Directory, we can turn to attributes. Returning to our email example, how do I get my URL once I've bound to my user object? The simplest means of querying for an attribute is one we've already seen. Once we have the object, we simply call the Get() method with the name of the desired attribute. Continuing with the code fragment we saw earlier:

var rootDSE = GetObject("LDAP://rootDSE");
var me = GetObject("LDAP://CN=Stephen Mohr,CN=Users," +
rootDSE.Get("defaultNamingContext"));
if (me != null)
alert(me.Get("mail"));
else
alert("Failed to find me in this domain.");

Once we have bound to my user object, we simply use that object's Get() method to obtain my mail attribute.

Enumerating Objects

Sometimes, of course, we will need to enumerate a number of objects in a container. This will apply not only to containers, but also to multi-valued attributes such as lists. Here's some VBScript code to enumerate all the users in the Users container:

sub OnUsers()
On Error Resume Next
set ou = GetObject("LDAP://CN=Users,DC=widgets")
ou.Filter = Array("user")
for each aUsr in ou
MsgBox(aUsr.Get("samAccountName"))
next
end sub

Here, ou is a variable mapped to the Users container. We apply a filter so that we see only those objects in the container that belong to the user class. Objects of this class possess an attribute samAccountName that is the human-friendly name of the user. We use the enumeration feature of VBScript collections to define a variable, aUsr, which successively takes on the value of each user object in the collection.

As of beta 2 of Windows 2000 and the beta release of ADSI 2.5, there is some question of the ability of JavaScript to bind properly to collections. Consequently, much of our script in this chapter will be written in VBScript to work around this difficulty.

Something similar is needed for multi-valued attributes. Since each multi-valued attribute returned is a collection, we have to enumerate each value of the attribute in turn. This subroutine in VBScript retrieves all the descriptions applied to my user account:

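Sub OnGetEx()
    Dim administrator
    Dim descList
    Dim desc

    ' Bind directly to the user object on the vandenberg domain controller
    Set administrator = GetObject("LDAP://vandenberg/CN=Stephen Mohr," & _
        "CN=Users,DC=widgets")

    ' GetEx returns every value of the attribute as an array
    descList = administrator.GetEx("description")

    For Each desc In descList
        MsgBox(desc)
    Next
End Sub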

GetEx() always returns entries as a VBScript array, which is helpful when you do not know whether an attribute is multi-valued or not.
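The same "always an array" convention is easy to apply to values obtained by other means. The sketch below is plain JavaScript with a hypothetical helper name, toValueList, and it assumes the value has already been converted to an ordinary JavaScript value; it does not deal with the VBScript array type itself.

```javascript
// Normalize an attribute value so callers can always loop over it.
// Assumption: the value is already an ordinary JavaScript value.
function toValueList(value) {
  if (value === null || value === undefined) {
    return [];                 // attribute not set
  }
  if (Object.prototype.toString.call(value) === "[object Array]") {
    return value;              // already multi-valued
  }
  return [value];              // single value wrapped in a list
}
```

With this in place, code that displays descriptions does not need one branch for a single value and another for several.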

Directory Queries

So far we've seen how to bind to servers and known objects. In the next few chapters we'll see that we won't generally know which specific server we want. Instead, we'd like to be able to search based on some particular attribute. For that, we need directory queries.

LDAP Queries

We prefer to work with LDAP rather than the proprietary native APIs of different directory services because it is supported by many directory services. Consequently, readers who may not be using the Active Directory can take away some information they can use with their directory service. Moreover, this is an open networking protocol and we are building network applications. LDAP is therefore the 'natural' syntax to use with directory services.

Syntax

The LDAP string for a directory query consists of the distinguished name for the root object of the search and several optional parameters:

<base DN>[;(filter)][;attributeList][;scope][;preferences]

Base Distinguished Name

The distinguished name for the base of our search is what we've become accustomed to in terms of specifying some container or server. For example, if we wished to query the widgets domain on the server named vandenberg, the start of the LDAP string would be:

<LDAP://vandenberg/DC=widgets>;

Filters

The filter allows us to set search criteria. We specify the name of an attribute (cn for the common name or objectClass for the schema of the object for example) along with an operator and a value, which may include literals and the wildcard character *. So, to restrict our search to user objects, we would use a filter of the form:

(objectClass=user)

We could use an asterisk in the place of user, as a wildcard for specifying any value, so long as a value exists. The operators supported are:

Operator Meaning	Operator Symbol
and	&
or	|
not	!
equal	=
approximately	~=
greater than or equal to	>=
less than or equal to	<=

You can combine these to form complex criteria. Filters are built using a prefix notation in which the operator always precedes the arguments on which it operates. For example, if we wish to find all user objects with a surname of Smith, we would use:

(&(objectClass=user)(sn=Smith))
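Prefix notation is easy to get wrong by hand once filters nest, so it can help to build them from small functions. The following plain-JavaScript sketch does this; all of the function names are our own. The escaping step is an addition not covered in the text: it follows the LDAP string-filter convention of writing special characters as a backslash and two hexadecimal digits, so that a literal asterisk or parenthesis in a value is not mistaken for filter syntax.

```javascript
// Escape characters that have special meaning inside a filter value:
// backslash, asterisk, parentheses and NUL become \5c, \2a, \28, \29, \00.
function escapeValue(value) {
  return String(value).replace(/[\\*()\0]/g, function (ch) {
    var hex = ch.charCodeAt(0).toString(16);
    return "\\" + (hex.length < 2 ? "0" + hex : hex);
  });
}

// Simple comparisons
function equalTo(attr, value) { return "(" + attr + "=" + escapeValue(value) + ")"; }
function startsWith(attr, prefix) { return "(" + attr + "=" + escapeValue(prefix) + "*)"; }
function present(attr) { return "(" + attr + "=*)"; }

// Prefix operators: the operator comes first, then its arguments
function allOf() { return "(&" + Array.prototype.join.call(arguments, "") + ")"; }
function anyOf() { return "(|" + Array.prototype.join.call(arguments, "") + ")"; }
function noneOf(filter) { return "(!" + filter + ")"; }

// All user objects with a surname of Smith
var smiths = allOf(equalTo("objectClass", "user"), equalTo("sn", "Smith"));
```

Because each function returns a complete parenthesized filter, the results can be nested freely, for example allOf(equalTo("objectClass", "user"), noneOf(present("mail"))) for users with no mail attribute.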

Attribute List

This is simply a comma-delimited list of the object attributes we wish to see in our search. If no attribute list is provided, all the attributes of the objects matching the search criteria will be retrieved. Specifying an explicit attribute list allows the Active Directory data provider to ignore other attributes. This improves performance and reduces the amount of data that is returned. For example, if I wanted to see the common name and Active Directory path for some object my attribute list would be:

cn, ADsPath;

Scope

The portion of the directory we are searching could become quite large depending on the base distinguished name we've specified. As with all good search techniques, it is important to restrict the scope of our search in the Active Directory. Three levels of scope are defined. These are summarized below:

Scope Identifier	Meaning
Base	Searches only the base DN provided; can return only zero or one object
OneLevel	Searches the immediate children of the base DN (excludes the object named by the base DN)
SubTree	Searches the entire subtree of the directory whose root is the object named by the base DN (includes the object named by the base DN)

The SubTree scope is most useful if we are searching the home domain of the current user. The size of the subtree should be relatively small, and we can search without concerning ourselves about what organizational units or containers may have been created. This level of search is warranted because we expect the user to find resources closest to home. When he goes outside his domain in search of resources, we would expect him to be less likely to find services tucked away in odd corners. Instead, we would expect to search the immediate children of the domain. This presupposes that directory objects for services of use to a wide community are placed in the top level of a particular domain to make it easy for outsiders to find them (which is a good general rule to follow).
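With the base DN, filter, attribute list and scope all described, the complete query string can be assembled mechanically. Here is a minimal plain-JavaScript sketch; buildQuery is a hypothetical helper of our own, and the check on the scope name is simply a guard against typing errors.

```javascript
// Assemble the query string <base>;filter;attributes;scope.
// buildQuery is a hypothetical helper, not part of ADSI or ADO.
function buildQuery(base, filter, attributes, scope) {
  var scopes = ["Base", "OneLevel", "SubTree"];
  var known = false;
  for (var i = 0; i < scopes.length; i++) {
    if (scopes[i] === scope) {
      known = true;
    }
  }
  if (!known) {
    throw new Error("Unknown scope: " + scope);
  }
  return "<" + base + ">;" + filter + ";" + attributes.join(",") + ";" + scope;
}

// Common name and path of every user object in the widgets domain
var userQuery = buildQuery("LDAP://vandenberg/DC=widgets",
                           "(objectClass=user)",
                           ["cn", "ADsPath"],
                           "SubTree");
```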

Preferences

Preferences serve to configure the operation of the search. In most cases, the default values are fine. If, however, we are expecting a very large number of possible objects in response to a query, we may wish to set some limits. The ADSI documentation enumerates a number of possible preferences. Here are the preferences supported by the Active Directory:

Preference Name	Data Type	Meaning	Default value
Asynchronous	true/false	true if searches should be asynchronous	false
Deref Aliases	true/false	Resolve object aliases when found	false
Size Limit	integer value	maximum number of returned objects	no limit (0)
Time Limit	integer value	maximum time in seconds to search before returning	no limit (0)
Column Names Only	true/false	return only the names of attributes	false
SearchScope	0 (base), 1 (OneLevel), 2 (SubTree)	directory scope of the search	2
Timeout	integer value	timeout period for the binding	none (0)
Page Size	integer value	size of ADO database pages for results	none (0)
Chase Referrals	true/false	send a referral message to the client when naming contexts are crossed, e.g., the search crosses a domain boundary	false
Cache Results	true/false	Retain search results in a client-side cache	true

The information given for the Deref Aliases preference is for searches using ADO. ADSI searches use an enumerated value for this preference, which can be any of the following:

never (0): do not dereference aliases when searching
searching (1): dereference aliases when searching subordinate entries of the specified base, but not when locating the base object
finding (2): dereference aliases when locating the base object, but not when searching subordinates
always (3): dereference aliases both when locating the base object and when searching subordinates

An Example in ADO

Let's pretend I'm still searching for my e-mail URL. We'll do an ADO search against the directory with the Active Directory provider. For clarity we'll go directly to my domain controller rather than finding it through a serverless binding. That way, we avoid having to make another search to locate the domain controller.

Sub OnQuery()
    Dim dbConn, dbRecordSet
    Set dbConn = CreateObject("ADODB.Connection")

    dbConn.Provider = "ADsDSOObject"
    ' Arguments: provider name, user DN, password
    dbConn.Open "Active Directory Provider", _
        "CN=SearchUser,CN=Users,DC=widgets", _
        "pwd"

    ' Query string: base DN; filter; attribute list; scope
    Set dbRecordSet = dbConn.Execute("<LDAP://vandenberg/DC=widgets>;" & _
        "(&(objectClass=User)(mail=s*));cn,mail;SubTree")

    While Not dbRecordSet.EOF
        MsgBox(dbRecordSet.Fields(0).Value + " " + _
            dbRecordSet.Fields(1).Value)
        dbRecordSet.MoveNext
    Wend
End Sub

We use the standard CreateObject() call for creating an ADO Connection object. We have to tell the connection object which database provider to use, which is ADsDSOObject for the Active Directory provider. Next, we have to open a connection to this provider specifying user credentials. Active Directory Provider is the system name for the Active Directory provider. The next two parameters are the user credentials. This code assumes you have created a user named SearchUser with the password pwd who has access to the directory. Note that we name the user with an LDAP DN.

Once we have an open connection, we obtain a recordset object containing data by executing our LDAP syntax command. We specify a base DN of the widgets domain on the vandenberg domain controller. For a filter, we're looking for any object that is a User object and that has a mail attribute beginning with 's'. We tell the provider that we are interested in the common name and mail attributes so as to reduce the amount of data that is returned. Since I'm not sure where my User object is (remember, I'm having trouble remembering my e-mail URL), we'll search the entire subtree, which in this case will be the entire domain. All this is in the line:

Set dbRecordSet = dbConn.Execute("<LDAP://vandenberg/DC=widgets>;" & _
    "(&(objectClass=User)(mail=s*));cn,mail;SubTree")

Finally, we iterate through the recordset by seeing if the recordset has reached the end of file (EOF). If it hasn't, we display the attributes of one row and direct the recordset to move to the next row of data returned.

While Not dbRecordSet.EOF
MsgBox(dbRecordSet.Fields(0).Value + " " + dbRecordSet.Fields(1).Value)
dbRecordSet.MoveNext
Wend

SQL Queries

We can also use SQL as our query language if that suits us. The syntax for a SELECT statement is:

SELECT [ALL] attributelist FROM base_DN WHERE criterialist

The criteria are written with normal SQL syntax, not LDAP filter syntax such as we used in our sample above. Everything in our LDAP example stays the same except the command line:

Set dbRecordSet = dbConn.Execute("SELECT cn, mail FROM 'LDAP://vandenberg/DC=widgets' " & _
    "WHERE objectClass='User' AND mail='s*'")

Note we are no longer using angle brackets around the base DN. Note also that the SQL string has no place for preferences. If we wish to alter the defaults, we need to explicitly create a command object. Since the default for SQL searches using the Active Directory provider is SubTree, let's rewrite our example to set the scope to OneLevel:

Sub OnQuery()
    Dim dbConn, dbRecordSet, dbCommand

    Set dbConn = CreateObject("ADODB.Connection")
    dbConn.Provider = "ADsDSOObject"
    dbConn.Open "Active Directory Provider", _
        "CN=SearchUser,CN=Users,DC=widgets", _
        "pwd"

    ' A Command object lets us set search preferences before executing
    Set dbCommand = CreateObject("ADODB.Command")
    Set dbCommand.ActiveConnection = dbConn
    dbCommand.CommandText = "SELECT cn, mail FROM " & _
        "'LDAP://vandenberg/CN=Users,DC=widgets' WHERE " & _
        "objectClass='User' AND mail='s*'"
    dbCommand.Properties("SearchScope") = 1 ' OneLevel: immediate children only
    Set dbRecordSet = dbCommand.Execute

    ' The recordset is already positioned on the first row, if there is one
    While Not dbRecordSet.EOF
        MsgBox(dbRecordSet.Fields(0).Value + " " + _
            dbRecordSet.Fields(1).Value)
        dbRecordSet.MoveNext
    Wend
End Sub

Of course, since we restricted the scope to the immediate children of the base DN object we had to be more particular in terms of our base DN. Fortunately, all this searching refreshed our memory.

Network Applications and Directories

We now have a working knowledge of the Active Directory, LDAP, and how to manipulate and search the directory through script code. Remember from the last chapter that the second Principle of Cooperative Application Development is:

Services will be discovered by querying directories.

Remember, this principle is intended to protect our applications not only from changes to the location of resources such as services, but also to shield us from having to know what services deal with what kinds of business problems. You want to tell the directory what problem you want to solve and have it tell you what service will help you and where to find it.

Here we are using our definition of a 'service' — a programming module bigger than a component, usually implemented as a server-side ASP or CGI script. You should not confuse this with the operating system concept of services.

Generalizing, there are three points involved in making this principle work:

We can describe our services by using some class of objects in the directory
With an appropriate search, we can find objects representing the services that supply the data we need
Once we have these objects, we can look at their attributes and find out how to access them

To make this work in practice we will need to establish some simple conventions.

Service Binding

We will frequently talk about binding to a service in the course of discussing networked applications. What does this mean? Generally speaking, this means that a client application locates the server hosting the service, establishes a connection, and provides some specific parameters to establish a conversation.

In our expansion upon The 5 Principles of Developing Cooperative Network Applications, we will particularly mean:

Finding an HTTP server and some page on it
Finding out enough information to send a query to that page

As you can see, these correspond to the second and third points we just raised as being important when discovering services by querying directories. The result of this is that some desired information is retrieved in a format we are prepared to accept.

Discovering Services

Over the course of the next few chapters, we will develop a method of talking about objects through the use of the Extensible Markup Language (XML). But don't let that throw you. For now, it is enough to say that we want to offer services implemented as Web pages, usually Active Server Pages (ASP) generating the data we need. Our client-side pages will request the pages that provide a service through HTML forms or other mechanisms, and we will use XML to reconstitute the objects for which we asked.

Each such service page will be said to speak one or more 'vocabularies'. Our clients will query the Active Directory for the vocabulary they need. Sometimes, our clients will not be able to find an exact match, but our technique will permit them to use a vocabulary that overlaps with the one they want. As you can imagine, this really strengthens the sense of cooperation and flexibility that we are aiming for. To achieve this, each service will list all the vocabularies it generates in the Active Directory. Clients will query first for the desired vocabulary, then, if that vocabulary is not found, go back to the directory and query it for a vocabulary that is good enough.
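The two-step lookup can be sketched without a directory at all. In the plain-JavaScript fragment below, an in-memory array stands in for the directory, and each entry lists the vocabularies a service generates. Everything here is hypothetical: the function names, the shape of the entries, and in particular the list of fallback vocabularies, since how a client decides which vocabularies are "good enough" has not been described at this point.

```javascript
// Return the services that list the given vocabulary.
// services is an array of { name, vocabularies } entries (hypothetical shape).
function findByVocabulary(services, vocabulary) {
  var matches = [];
  for (var i = 0; i < services.length; i++) {
    var vocabs = services[i].vocabularies;
    for (var j = 0; j < vocabs.length; j++) {
      if (vocabs[j] === vocabulary) {
        matches.push(services[i]);
        break;
      }
    }
  }
  return matches;
}

// Try the desired vocabulary first, then each acceptable substitute in order.
// Returns the vocabulary that matched and the services offering it, or null.
function discover(services, wanted, fallbacks) {
  var candidates = [wanted].concat(fallbacks);
  for (var i = 0; i < candidates.length; i++) {
    var found = findByVocabulary(services, candidates[i]);
    if (found.length > 0) {
      return { vocabulary: candidates[i], services: found };
    }
  }
  return null;
}
```

Returning the vocabulary that actually matched matters: the client must know which language the service speaks before it can interpret the reply.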

Vocabularies

Each vocabulary has a name. XML, as we shall see, is a language for defining other languages. Each such language is called a vocabulary. Thus, our queries will look for the name of the vocabulary as an attribute of some class in the Active Directory.

Existing Classes

The default Active Directory schema contains literally hundreds of object classes and attribute types, including a class called Connection-Point. This is the basis for all objects and classes to which some other object can connect. It is an abstract class, which means we cannot actually create any directory objects from it. There is also a derived class, Service-Connection-Point, which is described as holding binding information that allows one to connect to a service in order to make use of its services. Service-Connection-Point might seem to be ideal; however, it is actually intended for use by major services like LDAP itself. Also, servers hosting network services like HTTP and LDAP publish records to Domain Name Servers that help DNS map service names, like www.somesite.com, to an IP address. So, using this class properly would mean tinkering with the enterprise's DNS server. We'd like to avoid this. Should we, then, consider extending the default Active Directory schema?

The first thing you run into when extending the schema is the need for an object identifier (OID), which is a globally unique number that identifies the class to the directory. Using the OID properly requires applying to an international standards organization for a block of OIDs. This is not strictly necessary if we won't be exposing our directory service to outside clients, and our applications distinguish between internal and external searches. Providing that this is the case we can get away with using any OID not in use on the network. Such an approach, however, is definitely working around the rules. Clearly, extending the directory schema by the rules is a rare and difficult process. Extending the schema also means creating a class that will become a permanent part of your organization's directory. This may entail more authority than you can muster. It also has ramifications for storage space and search performance. For these reasons, schema extension is often discouraged. We need not give up hope, however.

Our Usage of Service-Connection-Point

Service-Connection-Point has a most interesting attribute called Service-Binding-Information. This is implicitly defined information that describes how a client connects to the service. Since applications are assumed to have shared knowledge regarding the use of a service, it is not much of a stretch to assume our clients have knowledge that our services are not well-known services and, therefore, will not bother the DNS server for service records. We can thus get a free ride on Service-Connection-Point — it is always found in the Active Directory and is (in spirit at least) intended for the sort of thing we are doing. While purists may question our usage of this class, it is an effective ad hoc solution.

What are the things we need to know about our object servers? There are three things we definitely need:

The URL of the page serving the information
A page where users can get more information about the service, including how to query it
A list of the vocabularies our service generates

The first two are accommodated by attributes of Service-Connection-Point. We shall use WWW-Home-Page for the URL of the page serving the information and WWW-Page-Other as the page where users can get more information about the service, including how to query it. For the last item we shall use Service-Binding-Information, a multi-valued attribute. We shall also use the Service-Class-Name attribute of the Active Directory schema to provide a single value we can use as a filter. All attributes are strings.

Adding a new Service Object to the Directory

We can use a utility called the Active Directory Browser (adsvw.exe) to add new objects to the Active Directory. Upon starting this utility, you must specify in the first dialog box that you want a new object, not a search. Let's illustrate this process by adding a hypothetical service that generates documents representing customers maintained in a database by the Customer Service department of some hypothetical company. We simply navigate to the Customer Service OU in our directory and click on Add Item on the Edit menu. This gives us the following dialog box:

Note that we've given the service a CN and specified the class of our object. Also note that serviceConnectionPoint is the LDAP display name format of the class we know as Service-Connection-Point. Once we have clicked OK, Active Directory adds this object to the Customer Service OU. We can then modify the properties of the object to reflect where the services are found. Suppose we generate the customer documents with the page customers.asp in the custsrv site on our server called vandenberg. We explain the usage of this syntax to programmers on the page CustomerService.html. We select url (the LDAP display name for the WWW-Home-Page attribute), provide the URL for the customers.asp page, and click Change.

Note the CN=customerPersonService object in the tree structure in the left-hand pane of the figure above. We do the same to assign the value vandenberg/custsrv/CustomerService.html to the WWWHomePage (or WWW-Page-Other in the descriptive form) attribute. Now assume that customers.asp generates data in two syntaxes: a basic customer vocabulary and a more detailed custsrvcustomer vocabulary that is specialized from the basic vocabulary. The nature of these vocabularies is not important for the moment. Their meaning will become clear over the next two chapters, but assume for now that they are merely labels denoting a data format. Select the serviceBindingInformation attribute, provide the value customer, and click Append. Now change the value to custsrvcustomer and click Append once more. The attribute serviceBindingInformation now contains a list of the two vocabularies. Finally, select serviceClassName and provide the name xmlCustomerServicePerson. We have now specified the minimal set of information for our new service.
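To make the convention concrete, the entry just created can be pictured as a plain record. The following is only an illustrative sketch in JavaScript, not a structure that ADSI returns: the object shape, the infoPage property name, and the helper offersVocabulary are assumptions, while the values are the ones entered in the walkthrough.

```javascript
// Illustrative record for the directory entry created in the walkthrough.
// The object shape is an assumption; it is not an ADSI structure.
var customerPersonService = {
    cn: "customerPersonService",
    objectClass: "serviceConnectionPoint",
    serviceClassName: "xmlCustomerServicePerson",
    url: "vandenberg/custsrv/customers.asp",
    infoPage: "vandenberg/custsrv/CustomerService.html", // hypothetical name for the documentation page
    serviceBindingInformation: ["customer", "custsrvcustomer"]
};

// Hypothetical helper: does the entry list the named vocabulary?
// The comparison is case sensitive.
function offersVocabulary(entry, vocabulary) {
    var list = entry.serviceBindingInformation;
    for (var i = 0; i < list.length; i++) {
        if (list[i] === vocabulary) {
            return true;
        }
    }
    return false;
}
```

Calling offersVocabulary(customerPersonService, "custsrvcustomer") answers whether this service can produce the detailed vocabulary.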

We will be using the Extensible Markup Language (XML) as our data representation format. This will be described in the next chapter. For now, let us simply agree to begin all service class names with the prefix 'xml'. This will give us an easy filter criterion (serviceClassName=xml*, for example) to differentiate our services from any other service entries.
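The effect of the prefix convention can be sketched outside the directory. The helper names below are hypothetical, and the check is a simplification: it compares case sensitively, whereas the directory itself may match the filter without regard to case.

```javascript
// Hypothetical helper mirroring the filter (serviceClassName=xml*):
// true when the class name begins with the agreed prefix.
function followsXmlConvention(serviceClassName) {
    return typeof serviceClassName === "string" &&
        serviceClassName.indexOf("xml") === 0;
}

// Keep only the entries whose class name follows the convention.
function selectXmlServices(entries) {
    var selected = [];
    for (var i = 0; i < entries.length; i++) {
        if (followsXmlConvention(entries[i].serviceClassName)) {
            selected.push(entries[i]);
        }
    }
    return selected;
}
```

Given a mixed list of service entries, selectXmlServices returns only those that follow our naming convention, which is the job the filter performs inside the directory search.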

Service Location Scriptlet

Now that we know how to locate information in the Active Directory, it is time to address the problem of locating servers that speak a vocabulary appropriate to a particular task. Ideally, we'd like to build a reusable component to do this for us. We could write a COM object in C++ or Java, but fortunately we won't need to resort to such extreme measures. Internet Explorer 4.0 and later supports a technique for exposing script as components. We will use this technique, which is called DHTML Scriptlets, throughout this book to build a toolkit of components that will help us implement our philosophy.

What are DHTML Scriptlets?

Scriptlets are HTML pages containing DHTML script code. We can embed them in other pages much as we would a COM component. As a matter of fact, Internet Explorer will load a second instance of its browser COM component in order to load our scriptlet page. This host browser exposes the script functions as component methods. As a result, once we write some useful code, we can reuse it as a component simply by embedding it in a new page and scripting the embedded scriptlet page as if it were a COM component — which, from the containing browser's point of view, it is.

The treatment of DHTML scriptlets presented here is sufficient for our purposes. A complete treatment of the subject may be found in Instant DHTML Scriptlets from Wrox Press (ISBN 1-861001-38-X).

Exposing an Interface

Suppose we have a page full of script functions that we wish to expose as a scriptlet object. Some of the functions and variables will be the methods and properties of the object we wish to offer to other scripts. Others will be private functions and variables necessary to the implementation of the public interface. Scriptlets use four simple rules for naming functions and variables in order to expose an interface:

A variable with page scope (one that is declared outside any functions) and the prefix public_ is exposed as a read/write property
A function with the prefix public_get_ is exposed as a read-access property
A function with the prefix public_put_ is exposed as a write-access property
A function with the prefix public_ is exposed as a method

Consequently, from VBScript, we might have the following examples:

Rem a read/write property
dim public_size
public_size = "large"

Rem private variable holding the value of the color property
dim privateColor

Rem read accessor function for color property
function public_get_color()
    public_get_color = privateColor
end function

Rem write accessor function for color property
function public_put_color(newValue)
    Rem insert range and validity checking here
    privateColor = newValue
end function

Rem ColorBook method
function public_ColorBook()
    Rem implementation code here
end function
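The four naming rules can also be expressed as a small classifier. This is a sketch for illustration only, not how Internet Explorer implements the rules; the function classifyMember and the shape of its result are assumptions. Note that the longer prefixes have to be tested first, because public_get_ and public_put_ also begin with public_.

```javascript
// Hypothetical classifier for the four scriptlet naming rules.
// name: the identifier as written in the script
// isFunction: true for a function, false for a page-scope variable
function classifyMember(name, isFunction) {
    if (isFunction && name.indexOf("public_get_") === 0) {
        return { kind: "read property", exposedAs: name.substring(11) };
    }
    if (isFunction && name.indexOf("public_put_") === 0) {
        return { kind: "write property", exposedAs: name.substring(11) };
    }
    if (name.indexOf("public_") === 0) {
        return {
            kind: isFunction ? "method" : "read/write property",
            exposedAs: name.substring(7)
        };
    }
    // Anything without the prefix stays private to the scriptlet page.
    return { kind: "private", exposedAs: null };
}
```

Applied to the names in the VBScript listing, public_size is classified as a read/write property named size, public_get_color and public_put_color as the two halves of the color property, and public_ColorBook as a method.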

JavaScript Objects

JavaScript offers a simpler approach than VBScript. JavaScript actually embraces the notion of objects, although not as elegantly as Java or C++. We declare an object, then set its methods to some functions we've written. This sounds more confusing than it really is. Suppose we want to have a class named Monkey, with methods named Eat, Sleep, and Drink. First, we write functions implementing the behavior we want for each method. To keep things clear, we'll use the same names as the methods:

function Eat(food)
{
    // Some implementation code here
}

function Drink(beverage)
{
    // ...
}

function Sleep()
{
    // ...
}

Now we write a constructor function with the same name as the class, which in this case is Monkey. Within the constructor, we use the keyword this to refer to this object. JavaScript allows us to declare methods by using them in an assignment, but from where does the implementing code come? If we assign the functions we've written to the newly declared methods, the functions will be called whenever some user of a Monkey class instance calls the method, say, with the line mySimian.Sleep().

function Monkey()
{
    this.Eat = Eat;
    this.Drink = Drink;
    this.Sleep = Sleep;
}

There's one more thing we need to do to make this class accessible as a scriptlet in JavaScript. We must declare a variable named public_description as an instance of our class. This must be a global variable, so be sure to declare it outside the scope of any function. This will limit us to one class per scriptlet page because public_description is a keyword variable:

public_description = new Monkey;

Properties can be exposed using either of two approaches. If you want to expose an internal variable directly, you can do so with the following line in the constructor function:

this.property = variable;

Alternatively, if you want to expose a variable with some validity checking, or if you wish to expose a calculated value as a property, you can supply accessor functions. These take the form of the public property name preceded by the word get_ or put_. Thus:

this.get_propertyname = readfunctionname; // read access
this.put_propertyname = writefunctionname; // write access

Therefore, our Monkey could expose a preference for food with the following:

var myfood = "banana";

function Monkey()
{
    this.Eat = Eat;
    this.Drink = Drink;
    this.Sleep = Sleep;
    this.favoriteFood = myfood;
}

We could even have a very fickle Monkey that randomly states a preference:

function Monkey()
{
    this.Eat = Eat;
    this.Drink = Drink;
    this.Sleep = Sleep;
    this.get_favoriteFood = randomFoodSelection;
}

function randomFoodSelection()
{
    var fRandNum = Math.random();
    if (fRandNum >= 0.5)
        return "banana";
    else
        return "beluga caviar";
}

Note that our last example creates a read-only property because we have provided no put_favoriteFood function.
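The read-only behavior can be tried outside the browser with a small stand-in for the host. The functions readProperty and writeProperty below are hypothetical: they only imitate, under our own assumptions, how a container might route property access to the get_ and put_ accessors, and they are not Internet Explorer's implementation.

```javascript
function randomFoodSelection() {
    return Math.random() >= 0.5 ? "banana" : "beluga caviar";
}

function Monkey() {
    this.get_favoriteFood = randomFoodSelection; // read accessor only
}

// Hypothetical stand-in for the host reading a property.
function readProperty(obj, name) {
    if (typeof obj["get_" + name] === "function") {
        return obj["get_" + name]();
    }
    return obj[name];
}

// Hypothetical stand-in for the host writing a property.
// Returns false when the property is accessor-based and has no put_ function.
function writeProperty(obj, name, value) {
    if (typeof obj["put_" + name] === "function") {
        obj["put_" + name](value);
        return true;
    }
    if (typeof obj["get_" + name] === "function") {
        return false; // read-only property
    }
    obj[name] = value;
    return true;
}

var public_description = new Monkey();
```

Reading favoriteFood through readProperty yields one of the two foods, while an attempt to write it is refused because no put_favoriteFood function exists.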

Embedding a Scriptlet in a Web Page

Using a scriptlet in another page is very similar to embedding a typical COM component. Again we use an <OBJECT> tag, but there is a particular MIME type to specify, where we have type=text/x-scriptlet:

We give it an ID so we can refer to it in script. Since this particular scriptlet has no visible interface, we give it dimensions of zero. Most importantly, we list the type attribute:

type=text/x-scriptlet

This attribute tells Internet Explorer that it is dealing with a scriptlet. Now it needs to know what HTML page to load to implement the scriptlet, so we add a <PARAM> tag:

When we want to use the scriptlet in some script code, we refer to it by its id attribute:

function OnSomeButtonPress()
{
    // ...
    mySimian.Eat();
}

Directory Walker Example

Now we know how to manipulate the Active Directory, build DHTML scriptlets, and create objects in the Active Directory to represent the services in our development philosophy. If all our Web developers had to know this in order to gain the advantages of using the Active Directory, they'd be less likely to use it. To get around this we can build and test a DHTML scriptlet that hides most of this. So long as they are aware of the basic concepts of the Active Directory, our component will give other developers the ability to obtain a URL for a service providing the vocabulary they specify.

The Interface

We want the user of our component to be able to search for a service given a vocabulary. The component should be able to look in the user's home domain as well as his domain tree. We could give him the ability to look at the entire forest, as well, but that seems dangerous. If everyone searched there first, the directory service might suffer an unacceptable performance burden. We'll provide the user with the ability to root his search in some arbitrary container, so a developer with better knowledge of ADSI and directory concepts would be able to search the forest.

We also need to provide some utility properties. Security is an issue, so we provide read-write access to a user DN and password. You can create a stock user with read-only access or allow the user to provide this. We also expose the DN for the user's home domain as well as the DN for the root of the user's domain tree. This is useful for troubleshooting, and it permits a programmer with knowledge of ADSI to navigate through the directory.

So let's take a closer look at the interface. First we will list the properties and methods, then look at how to implement them. Here are the properties of the Directory Walker:

Property	Description
User	user distinguished name
Pwd	user password
HomeDomain	DN for the current user's home domain
TreeRootDomain	DN for the domain that roots the directory tree containing the current user's home domain

And here are the public methods of our component:

SearchHome()

Searches the current user's home domain for a service providing a named vocabulary.

Parameters	Description	Returns
sVocabulary	Case sensitive name of the vocabulary for which to search	URL for the service providing the vocabulary, or an empty string

SearchTree()

Searches the domain tree of the current user for a service providing a named vocabulary.

Parameters	Description	Returns
sVocabulary	Case sensitive name of the vocabulary for which to search	URL for the service or an empty string

SearchDirectory()

Searches the container specified by the user for a service providing the specified vocabulary.

Parameters	Description	Returns
sBase	DN for the container rooting the search	URL for the service or an empty string
sVocabulary	Case sensitive name of the vocabulary for which to search	URL for the service or an empty string

Implementing the Properties

Having seen what functions and properties we make available, how do we actually implement the functions providing access to our properties?

User and Pwd

Remember that User is the user's DN and that Pwd is the user's password:

dim sUser
sUser = "CN=Administrator,CN=Users,DC=widgets"
dim sPwd
sPwd = ""

function public_get_User()
    public_get_User = sUser
end function

function public_put_User(sNewValue)
    sUser = sNewValue
end function

function public_get_Pwd()
    public_get_Pwd = sPwd
end function

function public_put_Pwd(sNewValue)
    sPwd = sNewValue
end function

Note that we've declared some private variables and given them default values. The Administrator user is common to all Windows NT and Windows 2000 installations by default, although hopefully you've changed the default, empty password on yours. We wrap these with generic accessor functions; we might wish to do some specialized error checking or attempt to bind to a given user DN to ensure it exists. For clarity, though, we'll simply accept what the client gives us.
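As an illustration of the kind of checking a write accessor might perform, here is a minimal sketch in JavaScript. It is not part of the component built in this chapter: the names looksLikeDN and DirectoryWalkerSketch are hypothetical, the check is purely syntactic (each comma-separated part must look like name=value, escaped commas are not handled), and it does not confirm that the user exists in the directory.

```javascript
var sUser = "CN=Administrator,CN=Users,DC=widgets";

// Hypothetical check: every comma-separated component looks like name=value.
// It does not handle escaped commas and does not bind to the directory.
function looksLikeDN(value) {
    if (typeof value !== "string" || value.length === 0) {
        return false;
    }
    var parts = value.split(",");
    for (var i = 0; i < parts.length; i++) {
        if (!/^\s*[A-Za-z][A-Za-z0-9-]*=.+$/.test(parts[i])) {
            return false;
        }
    }
    return true;
}

// Sketch of a scriptlet object whose User property validates on write.
function DirectoryWalkerSketch() {
    this.get_User = function () { return sUser; };
    this.put_User = function (newValue) {
        if (looksLikeDN(newValue)) {
            sUser = newValue; // accept only plausible DNs; ignore anything else
        }
    };
}
```

With this in place, a malformed value leaves the previous user DN untouched instead of causing a failed bind later in the search.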

HomeDomain

Now consider the property for the user's home domain. This is actually determined when a client of this scriptlet tries to read the property:

dim sHomeDomain
sHomeDomain = ""

function public_get_HomeDomain()
    dim rootDSE
    sHomeDomain = ""
    On Error Resume Next

    Set rootDSE = GetObject("LDAP://rootDSE")
    sHomeDomain = rootDSE.Get("defaultNamingContext")
    public_get_HomeDomain = sHomeDomain
end function

This should be familiar to you from the example of serverless binding, which we saw earlier. We bind to the user's domain controller using serverless binding and retrieve the defaultNamingContext property.

TreeRootDomain

We do something similar for the root of the user's domain tree as we did for the HomeDomain:

dim sTreeRoot
sTreeRoot = ""

function public_get_TreeRootDomain()
    dim rootDSE
    sTreeRoot = ""
    On Error Resume Next

    Set rootDSE = GetObject("LDAP://rootDSE")
    sTreeRoot = rootDSE.Get("rootDomainNamingContext")
    public_get_TreeRootDomain = sTreeRoot
end function

Once again, we bind to the domain controller, but this time we ask for the rootDomainNamingContext property. We do not, however, provide public_put_ functions; these are read-only properties.

Implementing the Methods

Having seen how to implement the properties, let's go on to look at the methods.

SearchHome()

Here's the code which implements the SearchHome() method:

function public_SearchHome(sVocabulary)
    dim sBase

    sBase = "LDAP://" & public_get_HomeDomain()
    public_SearchHome = DoQuery(sBase, "SubTree", sVocabulary)
end function

We compose the DN for our search by using our accessor function to retrieve the current user's domain. With that in hand, we call a utility function, not exposed through our interface, to do the actual search. We'll specify SubTree scope so the search will automatically recurse through all containers below the DN. This may or may not be an effective solution depending on the specific layout of your directory.

SearchTree() and SearchDirectory()

We do something similar for the other two methods, SearchTree() and SearchDirectory():

function public_SearchTree(sVocabulary)
    dim sBase

    sBase = "LDAP://" & public_get_TreeRootDomain()
    public_SearchTree = DoQuery(sBase, "SubTree", sVocabulary)
end function

function public_SearchDirectory(sBase, sVocabulary)
    public_SearchDirectory = DoQuery("LDAP://" & sBase, "SubTree", sVocabulary)
end function

DoQuery()

Finally, let's see how DoQuery() works:

function DoQuery(sBaseDN, sScope, sVocab)
    dim dbConn, dbRecordSet, sCmd
    DoQuery = ""
    On Error Resume Next

    Rem Connect to the ADSI service provider for ADO
    Rem User specified must have appropriate access rights
    Set dbConn = CreateObject("ADODB.Connection")
    dbConn.Provider = "ADsDSOObject"
    dbConn.Open "ActiveDirectoryProvider", sUser, sPwd

    Rem Filter specifies a service under the convention set forth in the book
    Rem (Chap. 2)
    sCmd = "<" & sBaseDN & ">;(&(objectClass=serviceConnectionPoint)" & _
           "(serviceClassName=xml*));url,serviceBindingInformation;" & sScope
    Set dbRecordSet = dbConn.Execute(sCmd)

    Rem Iterate through the vocabularies of each service found
    Rem and see if it matches the vocabulary we are looking for
    While Not dbRecordSet.EOF
        For Each vocabulary In dbRecordSet.Fields(1).Value
            If sVocab = vocabulary Then
                Rem Value is always a collection, hence this hack
                For Each child In dbRecordSet.Fields(0).Value
                    DoQuery = child
                Next
            End If
        Next
        dbRecordSet.MoveNext
    Wend
end function

The line On Error Resume Next allows us to continue processing if VBScript throws a runtime exception. Otherwise, processing would grind to a halt. We're familiar with the mechanics of opening an ADO database connection by now, but look at the composition of our command string:

sCmd = "<" & sBaseDN & ">;(&(objectClass=serviceConnectionPoint)" & _
       "(serviceClassName=xml*));url,serviceBindingInformation;" & sScope

We have to wrap the DN for the root container of our search in angle brackets. Next, we provide a filter based on our convention: serviceConnectionPoint is the object class and the serviceClassName must start with the xml prefix. We are interested in the url and serviceBindingInformation attributes of any objects we find. Finally, we tack on the scope passed in from the public method. The four parts of the command are separated by semicolons.
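The composition of the command string is easy to verify in isolation. The following sketch rebuilds the same four-part string in JavaScript; the function name buildServiceQuery is hypothetical, and the base path used in the usage note is only an example value.

```javascript
// Hypothetical helper that assembles the four-part command string:
// <base>;filter;attributes;scope
function buildServiceQuery(baseAdsPath, scope) {
    // Filter from our convention: the class and the xml prefix
    var filter = "(&(objectClass=serviceConnectionPoint)(serviceClassName=xml*))";
    // Attributes to return, in the order the loop expects them
    var attributes = "url,serviceBindingInformation";
    return "<" + baseAdsPath + ">;" + filter + ";" + attributes + ";" + scope;
}
```

For example, buildServiceQuery("LDAP://DC=widgets", "SubTree") produces a string that begins with the bracketed base, continues with the filter and the attribute list, and ends with the scope.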

Things become a bit more complicated once we have a record set. The basic iterative loop should be familiar:

While Not dbRecordSet.EOF
    For Each vocabulary In dbRecordSet.Fields(1).Value
        If sVocab = vocabulary Then
            Rem ... some action here
        End If
    Next
    dbRecordSet.MoveNext
Wend

Our fields are URL and serviceBindingInformation (zero and one, respectively, in the Fields collection). Since the serviceBindingInformation attribute is multi-valued, we need to iterate through each attribute value found and compare it to the vocabulary name for which we are searching. After evaluating a row in the record set, we move to the next one.

There is, however, a small wrinkle. Each URL attribute is returned as a collection, although this is a single-valued attribute in the schema. Consequently, we have to get at the child value and return it as the value of the function. If a match is not found for the vocabulary, the default value (an empty string which we set at the outset) is returned.

If sVocab = vocabulary Then
    For Each child In dbRecordSet.Fields(0).Value
        DoQuery = child
    Next
End If

Remember, the DoQuery assignment only works because we know there is only a single value for this attribute. With that, we have covered the entire implementation of our DHTML scriptlet.
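The matching logic can be exercised without a directory by substituting plain arrays for the record set. This is a sketch under that assumption: findServiceUrl and sampleRows are hypothetical names, each row holds the url collection at index zero and the serviceBindingInformation collection at index one, and the second sample row is invented purely for the test.

```javascript
// Rows stand in for the record set: field 0 is url, field 1 is
// serviceBindingInformation; both arrive as collections.
function findServiceUrl(rows, vocabulary) {
    var result = ""; // default when no service offers the vocabulary
    for (var r = 0; r < rows.length; r++) {
        var urls = rows[r][0];
        var vocabularies = rows[r][1];
        for (var v = 0; v < vocabularies.length; v++) {
            if (vocabularies[v] === vocabulary) { // case-sensitive match
                for (var u = 0; u < urls.length; u++) {
                    result = urls[u]; // last value wins, as in DoQuery
                }
            }
        }
    }
    return result;
}

var sampleRows = [
    [["vandenberg/custsrv/customers.asp"], ["customer", "custsrvcustomer"]],
    [["vandenberg/orders/orders.asp"], ["order"]] // hypothetical second service
];
```

As in DoQuery(), the loop does not stop at the first match, so when several services offer the same vocabulary the URL of the last one found is returned.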

Using and Testing the Directory Walker

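We create a test page using the Directory Walker DHTML scriptlet by making an HTML page with the following <OBJECT> tag:

<OBJECT id="walker" type="text/x-scriptlet" width="0" height="0">
    <PARAM NAME="URL" VALUE="DirectoryWalker.html">
</OBJECT>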

The type attribute is the standard MIME type for DHTML scriptlets, text/x-scriptlet. It has no visible interface, so we give it zero dimensions all around. The URL attribute is set to DirectoryWalker.html, which is the name of the page containing our scriptlet code.

Here's some JavaScript code that searches for the customer vocabulary in the user's home domain:

function OnHome()
{
    walker.User = "CN=Administrator,CN=Users,DC=widgets";
    walker.Pwd = "";
    alert(walker.SearchHome("customer"));
}

We set the User and Pwd properties of the scriptlet, then call SearchHome() with the name of the vocabulary. When we defined the service object in the directory, we used the url attribute to record a URL for our service and the serviceBindingInformation attribute to record the XML vocabularies the service can generate, so we get a response of vandenberg/custsrv/customers.asp. A client page can then retrieve some customer information by formatting a query according to a known convention, which we will see in the next few chapters, and sending it as part of a request for customers.asp.
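A client that wants to widen its search only when necessary might wrap the two methods as follows. This is a sketch rather than part of the component: locateService is a hypothetical wrapper, and stubWalker merely stands in for the embedded scriptlet, with an invented second service, so that the logic can be run outside the browser.

```javascript
// Hypothetical wrapper: try the home domain first, then the domain tree.
function locateService(walker, vocabulary) {
    var url = walker.SearchHome(vocabulary);
    if (url === "") {
        url = walker.SearchTree(vocabulary); // fall back to the wider search
    }
    return url;
}

// Stub standing in for the embedded scriptlet so the sketch runs anywhere.
var stubWalker = {
    SearchHome: function (v) {
        return v === "customer" ? "vandenberg/custsrv/customers.asp" : "";
    },
    SearchTree: function (v) {
        return v === "order" ? "vandenberg/orders/orders.asp" : ""; // invented service
    }
};
```

In a real page the first argument would be the walker object itself. Searching the home domain first keeps most lookups local and reserves the more expensive tree search for vocabularies that are not served nearby.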

Summary

Directory services are important to networked applications. Use of an LDAP compliant directory simplifies our programming task and helps us reuse existing services in our loosely structured network by giving us a mechanism for locating services and examining their capabilities. In particular, we learned the following:

The nature and purpose of directory services
The rudiments of LDAP
Basic capabilities of the Active Directory in Windows 2000 and some of its schema
Basic ADSI binding and directory searches using the Active Directory provider
Conventions for representing our Web services in the Active Directory

We are demanding a commitment to directory usage from our organization as a part of our Web development philosophy. In return, we learned about DHTML scriptlets and built a utility component that hides much of the ADSI-related code and our conventions from other Web developers. Using this component, Web development teams should be able to insinuate their applications into the fabric of the organization's computing environment.

We've purposely skirted the topic of how our services will deliver data across the network. In passing, we said we would use XML. Now it is time to take up that topic and build more tools.

We at Microsoft Corporation hope that the information in this work is valuable to you. Your use of the information contained in this work, however, is at your sole risk. All information in this work is provided "as is", without any warranty, whether express or implied, of its accuracy, completeness, fitness for a particular purpose, title or non-infringement, and none of the third-party products or information mentioned in the work are authored, recommended, supported or guaranteed by Microsoft Corporation. Microsoft Corporation shall not be liable for any damages you may sustain by using this information, whether direct, indirect, special, incidental or consequential, even if it has been advised of the possibility of such damages. All prices for products mentioned in this document are subject to change without notice.