Chapter 2 - Available, Scalable, and Secure
|Archived content. No warranty is made as to technical accuracy. Content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.|
From the book Enterprise Application Architecture With VB, ASP and MTS by Joseph Moniz. (ISBN: 1861002580).Copyright ©1999 by Wrox Press, Inc. Reprinted by permission from the publisher.
For more information, go to http://www.wrox.com.
A good friend of mine cautioned me that the word enterprise was becoming a bit overused – he likened it to other marketing-speak terms like industrial-strength, new-and-improved, etc. In many ways, I have to agree with him on that point, so during this chapter, we will work to develop a common understanding of the concept of an Enterprise Caliber System by defining some of the things we should expect from such a system. We will lay down a few, very broad, design strokes that we will revisit again and again throughout the remainder of this book. Those strokes are available, scalable, and secure.
As we work through this chapter and the next, we will consider some of the design options available to us for developing an enterprise infrastructure. As we do this, we will strive to create an environment that enables us to distribute the processing load of the system across as many machines as possible. Remember that distributed architecture means that we need to distribute the processing – not that we distribute the data. The consummate system would enable true parallel processing to take place across the entire system. (If we were to use only serial processing, then each process would have to wait for the one before it to complete before it could begin – even if it could perform most of the task without referring to that previous process.) Of course, this is an ideal. But, each time we inch a little closer towards the ideal of true parallel processing, we will have taken a great stride towards designing a system that is more available, more scalable, and in many ways more secure.
As with any journey, we will begin this one with a single step. We will learn what things we need to do to make a single machine robust enough to handle its share of the enterprise load. First, we will pass through the perhaps foreign worlds of hardware components and network connections. We will take the time to understand how the basic elements of a single machine can be made to work together to provide fault tolerance and maybe even some slight improvements in scalability. As we do this, we may learn a thing or two about the underlying hardware, but our goal here is really to draw some parallels between the capabilities of the hardware and the demands our software will place upon it. As we come to know those capabilities, we will be better able to take advantage of all that the infrastructure has to offer.
Once we have developed a kinship with the single enterprise caliber machine, we will take another step towards the ideal of true parallel processing. We will begin a long and deliberate dissection of the monoliths that we currently refer to as applications. In order for an application to distribute its processing, that application must be designed in a way that makes this possible. We will learn to see applications as many complementary pieces that work together rather than as a solitary monument with a singular purpose. We will learn that there are many natural processing divisions and that we only need to find these divisions to take advantage of the parallel processing opportunities they present. As we come to understand that applications, systems, and enterprises are really a harmony of processes rather than the monotony of a monolith, we will be able to take on other enterprise-wide challenges with this knowledge.
We may even begin to view something as substantial as enterprise security as a problem that is better solved in pieces. Once we have pierced through the security industry's subterfuge to find this view, we will discover that we are not alone with our vision. We will learn that Windows NT achieved its C2 security classification (from the US Department of Defense) partly because it handles security in a similar fine-grained manner. Rather than attempting to control the whole of the enterprise using a monolithic approach, Windows NT implements security at the object level. We will endeavor to emulate this world-class model by building upon and using its strengths from within. This inner strength approach will give us a system that becomes more and more difficult to breach as an intruder gets closer and closer to the data. So, rather than taking the approach of designing a thin veil over our enterprise, we will learn to think of security as something that is ingrained into every aspect of the enterprise.
I understand that it may be difficult to think about security in particular and distributed architecture in general as something that permeates a system, an enterprise or an organization. But it shouldn't be. In the last chapter we talked, at a very general level, about techniques we can use to ensure that each tiny portion of our system is capable of managing its own affairs. We talked about building data objects that could handle their own security from within. We considered how we might deploy an army of these highly capable data objects throughout our system. We realized that these objects could each take on a portion of the overall processing load that our system demanded. Through this discussion, we discovered that we could distribute processing without necessarily having to distribute data. The last chapter was a discussion. In this chapter, we will begin to assemble the foundation for our distributed processing system. One of the cornerstones of that foundation has to be fault tolerance. The slickest parallel processing system in the world won't do an organization a bit of good if it isn't available for service.
An enterprise's resources must be available. Do you remember the last time the electricity went out? I'm sure that, at first, it might have seemed kind of exciting. The last time the electricity at my home went out, I spent a couple of wonderful minutes watching our dog Dakota chasing the business end of my laser pointer around the living room. This little diversion was interesting for a while, but before too long, even the sight of the deranged dog chasing that little red dot 4 feet up the wall lost its appeal. It was then that I realized how little I think about the electricity in my home or office. In fact, I really don't think about it at all; I just expect it to be there. When I pick up the handset of my telephone, I am never surprised to hear a dial tone – I have come to depend on it.
Most organizations have come to expect this same level of availability from their mainframe systems and enterprise-level systems (read enterprise-level in this context as the old guard of enterprise-level systems: Unix, AS/400, etc.). The information they manage is often mission-critical in nature, and great care has been taken to ensure that the resources these systems provide are available without failure. Considering the cost of the systems and the importance of the information they contain, it is not surprising that availability is such a huge concern.
It is surprising, however, that most of the organizations I have worked with do not seem to expect the same level of availability from their PC-based systems. Perhaps this is because in the past, most of the applications developed for the PC platform have not been considered real mission-critical applications. Maybe at that time, it wasn't economically prudent to consider things like RAID, hot swappable power supplies, and entirely redundant servers, etc.
That time has passed.
This book is about the new enterprise systems and although their ancestry links their hardware with the 8088 processors of old, the systems – both the hardware and the software – we will consider in this book are not mere PCs. The applications these systems are called upon to host are mission critical applications. This means that, as professionals, we have a responsibility to ensure that these systems are available; we must consider things like fault tolerance, disaster recovery plans, and more.
Although the first section of this book places special emphasis on the hardware/operating system (OS) portion of the enterprise, I must caution you that it is not possible to create an enterprise caliber system with hardware and operating system upgrades alone. Even the best hardware/OS available is not 100% immune to lightning strikes, tornadoes, and other natural or man-made disasters. We need to add several software elements into our system to ensure a reasonably high degree of availability.
In this chapter, we will begin to look at some basic techniques you can use to ensure that your enterprise's resources will be available. It makes me very proud that my teams compare our system to Federal Express – "when it absolutely positively has to be there". This type of high-availability system is really not that difficult to design these days. Microsoft, Compaq, Dell, and a host of other companies are working round the clock to develop the pieces of the infrastructure that make this possible. All we really need to do is to learn how to put those pieces together. I suppose that we could approach that task in a rather dry manner. I am certain that we could develop some cost-to-benefit ratio equations, but I have always been the kind of engineer that deals best with practical matters. I have learned to depend upon something my father used to call common sense. In other words, we are about to take a common sense approach to the problem of fault tolerance – the expensive term for availability.
The most certain way to ensure that a resource is continuously available is to have more than the required amount of resources available at all times. It is a safe bet that you have a spare tire for your car. This spare tire gives your car a level of fault tolerance. If one of the tires goes flat, you can replace it and be on your way in a matter of minutes. When it comes to tires, fault tolerance makes sense, but most of us don't have spare engines, fenders, or bumpers for our cars, do we? No, of course not.
Fault tolerance, like everything else in this world has a price. It makes sound economic sense for us to have a spare tire for our car. It costs less (in most cases) to have a spare tire than it would cost to get our car towed to a place where we could get the flat tire repaired or replaced. This is especially true if we amortize the cost of the spare over the life expectancy of the set of tires. We cannot say the same thing (in most cases) about an automobile's engine. The benefit of having a spare engine probably doesn't justify the cost of the spare engine. If you think about it, there is something else to consider in addition to the cost. One difference between tires and engines is that (in most cases) tires will become flat more often than engines will become inoperable.
There are a couple of other things for us to ponder when we consider fault tolerant tires vs. fault tolerant engines. The first is the ease of installation. Can you imagine the difficulties of carrying a spare engine around with you? How about performing an engine replacement on the side of a busy throughway? The second thing may be a little less obvious. One of the things we do with tires is to rotate them. When we rotate our tires, we remove the spare from the trunk and replace one of the currently active tires with the spare. Then we move the tire that has become inactive into the trunk. This practice lowers the amount of wear and tear on all 5 tires by distributing the total amount of wear and tear, more or less, equally between all 5 tires including the spare. It is difficult to imagine a comparable scenario for an engine.
Yeah, I know – what the heck do tires and automobile engines have to do with designing an Enterprise Caliber System? Well, let's think about the ways we might make an enterprise system fault tolerant the same way we looked at the tires and the engine. We can do that by examining each of the pieces of the system and thinking about whether or not it makes sense to have a spare. First let's make a sample list of pieces to consider. A partial list might look something like the following (I am not trying to handle all of the things you might need to consider here. We will just go over some of the things we might consider in order to develop a common sense way to answer these types of questions in the future.):
Data stored in the system
Software (OS and applications) loaded onto the system
Disks (storage areas)
Power source (think of an outlet in the wall)
System chips/parts (CPUs, memory, motherboard, etc)
The first item on the list was data. You may not consider backing up data files a form of fault tolerance, but if you think about it, that is exactly what it is. We don't really need to do a cost/benefit analysis on this one. Not backing up the data in the system is like driving cross-country with a flat spare tire. We can almost take it for granted that the cost of performing regular backups of the data in the system is less than the cost of the down-time/data re-entry that we would have to perform otherwise. I have included this category just to bring the concept of fault tolerance down to earth where it belongs. One of the best things we can do to gain a level of fault tolerance is regular back-ups of all of our data.
Ditto the data remarks. But, let me add that there are several different techniques that we can use to back up data, application, and system software. Some of these techniques are so inexpensive that it may make sense to provide several levels of back-ups. Let me give you an example: On our production systems, we perform a primary backup to a disk-array on another machine. Then we back-up that disk to two different tapes. The last step we take is to remove one of the tapes completely off the site. The second disk gives us immediate access to the last backup. The on-site tape is inexpensive insurance for the on-site back-up disk. We can restore from the on-site tape almost as fast as we can restore from the on-line disk. The final level of protection we afford our back-ups is the tape that we store 40 miles off-site. We do this to ensure continuity in case of catastrophic occurrences like fire, tornadoes, nuclear winter, etc.
This entire backup strategy is very inexpensive, and as long as both sites are not completely destroyed we have ensured nearly continuous availability of the data, application, and system software.
Disk – System Data Storage
The disk is the first real hardware we have looked at in terms of fault tolerance. The disk represents a danger because it offers a single point of failure for the entire system. If a system relies upon a single disk and that disk fails, the entire system fails. Whenever we are considering hardware-level fault tolerance issues, the single point of failure test is almost always an appropriate first step.
As I hinted in the last section, there are many ways to back up data. Our first and best defense is to use a Redundant Array of Independent Disks, RAID. Although this may not be a true technical backup, it gives us immediate access to, in essence, a second copy of our files (read files in this context to include data, system, and application files). Most RAID systems also offer the ability to hot swap (exchange disks without powering down the machine). We will take a closer look at this hardware in Chapter 3, but let me give you a quick overview of the two different levels of RAID that we will consider.
I know that this is not a programming problem, but please take the time to work through this section. The most important thing you will learn about programming in this book concerns the distribution of processes across many machines in an enterprise. The way we will accomplish this distribution of processing across machines is very much the same as the way that RAID distributes data across several disks. In other words, most of the programming concepts we will cover in this book are modeled after some of the same basic concepts found in a RAID system.
RAID Level 1 – Mirrored Disks
Level 1 RAID requires pairs of disks – 2 disks or 4 or 8 etc. The total storage for all of the disks in the system is only equal to the total storage on half of the disks. This means that if we have two 18 Gig disks, we have 18 Gig of usable storage space available. The other half of the disk space is really used to provide a mirror (hence the name) of the space we actually get to use. The disk controller(s) use hardware to synchronize the data across disks. Due to this hardware intervention, these two disks perform almost the same as if we were writing to a single disk. There is little, if any, degradation in overall performance. (RAID 1 does not offer any speed or throughput improvements over a single SCSI disk.) RAID 1 is designed for fault tolerance. If one disk experiences a failure, the other one can take over immediately without missing a beat. In other words, if RAID were a spare tire, it wouldn't just sit in your trunk. It would be able to sense when a tire went flat and then it would install itself in place of its deflated associate.
In a RAID 1 system, the act of mirroring the data from the primary to the redundant disk is a hardware function that we can essentially consider a black box for the purposes of this discussion.
If we wanted to write the word DOG to a RAID Level 1 disk-array, we would only have access to a single data-transfer stream, which each letter would have to pass through in a serial fashion. That means that we can visualize the process of writing the word DOG as four distinct steps:
Preparation. The data has been passed to the threshold of the disk array.
Write the first data block. In this example, the first block of data is the letter "D". The hardware writes a "D" to the mirror disk.
Write the second data block. In this example, the second block of data is the letter "O". The hardware writes an "O" to the mirror disk.
Write the third data block. In this example, the third block of data is the letter "G". The hardware writes a "G" to the mirror disk.
Of course in real life, these steps would use another method of data block measurement based upon an algorithm programmed into the hardware. A disk array doesn't really work with the data at the level of particular characters. To the array, all the data is simply a series of 1s and 0s. However, I am not a disk array, so you are going to have to bear with me.
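We can restate the serial nature of this mirrored write in a small, purely illustrative Python sketch. The function name and the one-letter blocks are my own invention – a real controller does all of this in hardware, on binary blocks – and the sketch counts only the block-write steps, one per letter:

```python
# Illustrative sketch of a RAID 1 (mirrored) write – not real disk I/O.
# Every block passes through a single transfer stream in series, and the
# hardware echoes each block to the mirror disk as it is written.
def raid1_write(data, block_size=1):
    primary, mirror = [], []
    steps = 0
    # Split the data into blocks; here, one character per block for clarity.
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for block in blocks:
        primary.append(block)   # write the block to the primary disk...
        mirror.append(block)    # ...and the hardware mirrors it in lockstep
        steps += 1              # one serial step per block
    return primary, mirror, steps

primary, mirror, steps = raid1_write("DOG")
```

Writing DOG takes three serial write steps (four, counting the preparation step above), and both disks end up holding identical copies of the data – which is the essence of mirroring.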
RAID Level 5 – Striped Disks
Level 5 RAID requires a minimum of three disks. The total storage for all disks is equal to the sum of the total storage of each disk minus the total storage of a single disk. Oops, I said there wouldn't be any math, but trust me it is a lot easier to get this one if we take a look at some numbers.
For example, if we designed a system with six 8 Gigabyte disks, we would end up with a total storage of 40 Gigabytes (6 x 8 - 8 = 40).
It is kind of a good thing that we took the little detour down the math road. Did you notice that when we use RAID 5, we end up with more disk space than if we used RAID 1? If we had six 8 Gigabyte disks in a RAID 1 system, we would only have three of those disks available at any one time, and 3 x 8 = 24. We got 16 more Gigabytes with the RAID 5 system than we would have gotten if we used a RAID 1 system. And that is not even the coolest thing about RAID 5. The coolest thing is that the data is written across all of the disks simultaneously. RAID 5 uses something called parity to make this work. I am not going to talk about parity bits in detail – even the thought makes me dizzy – but what it all boils down to is that, somehow, the disks can use this parity information to reconstruct data in case one of the disks fails. At first blush, this system may not seem to offer any advantages over the simpler Level 1 RAID besides the extra disk space, but it does.
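The capacity arithmetic above boils down to two one-line formulas. Here they are as a quick Python sketch (the function names are my own, purely for illustration), using the six 8 Gigabyte disks from the example:

```python
# Usable capacity under the two RAID levels discussed in this chapter.
def raid1_capacity(disks, size_gb):
    # Mirroring: only half of the disks hold unique data.
    return (disks // 2) * size_gb

def raid5_capacity(disks, size_gb):
    # Striping with parity: one disk's worth of space goes to parity.
    return (disks - 1) * size_gb

print(raid1_capacity(6, 8))   # 24 Gigabytes usable
print(raid5_capacity(6, 8))   # 40 Gigabytes usable
```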
The act of striping data across disks is really an example of a parallel process. When we copy identical data to a mirrored disk like in Level 1 RAID, we only have access to the maximum throughput of a single disk. When we stripe data across several disks, we have access to the maximum throughput of all of the disks – minus the one we need for parity. Let's look at an example:
In a RAID 5 system, the disk controllers perform the act of determining the value of the parity block. We can essentially ignore this function for the purposes of this discussion.
Now, if we wanted to write the word DOG to a RAID Level 5 disk-array with four disks, we would actually have access to four separate data transfer streams – one for each disk. That means that we can visualize the process of writing the word DOG using only two distinct steps:
Preparation. The data has been passed to the threshold of the disk array. The hardware calculates the parity block and separates the data into four more or less equal blocks of data.
Write the data. Each block of data is written simultaneously to one of the four disks.
Of course, this is an intentional oversimplification of the process. In real life, the parity blocks would not all sit on a single disk; RAID 5 rotates the parity across all of the disks, so each disk would contain something like one quarter of the word DOG along with its share of the parity blocks.
Compare this with the RAID 1 system we looked at earlier. In that case, we needed four steps to write the word DOG to the disks. With RAID 5, we only needed two. This is an example of the power of parallel processing. Actually, I guess that this is a case of parallel disk writing, but you get the idea. If we can take a big task and break it down into smaller tasks that can be executed in parallel, we will get finished sooner. And I bet you thought this RAID stuff was going to be hard. I am not sure exactly how RAID 5 measures up in terms of spare tires and engines, but I think we might be able to compare it to rotating the tires. Remember? When we rotated our tires, we took the spare out of the trunk and replaced one of the active tires with the spare. This made all of the tires, including the spare, wear more evenly.
I guess you could say that I've balanced the load that the tires had to bear across all 5 tires rather than letting the 4 active tires wear out sooner. It is kind of like that with RAID 5. Rather than writing all of the data to one disk, we are sharing the load across multiple disks.
Before we go on, let me give you a visual concerning RAID 5 disk failures. I don't know if it helps you, but a lot of times it is easier for me to get a handle on something if I can picture it. With RAID 5, if one of the disks fails, the remaining disks use their parity data to reconstruct the information that was stored on the failed disk. In the example above, if the disk containing the letter "G" failed, the controller would use the parity data stored on each of the remaining disks to deduce that the letter "G" was actually stored on the failed disk. Then it would store that information using the same technique across each of the remaining disks. The results of a failure might look like the following:
Once again these are illustrative examples, the RAID controller does not treat the data as anything like recognizable characters – just binary data.
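For the curious, the parity trick the controller performs is essentially an exclusive-or (XOR) across the blocks in a stripe. Here is an illustrative Python sketch (my own simplified model – a real array works on large binary blocks and rotates the parity across disks) showing both the striping and the reconstruction of a failed disk:

```python
from functools import reduce

# Illustrative model of RAID 5 parity: XOR the data blocks to get the
# parity block, and XOR the survivors to rebuild a lost block.
def stripe_with_parity(blocks):
    """Return the stripe: the data blocks plus one XOR parity block."""
    parity = reduce(lambda a, b: a ^ b, blocks)
    return blocks + [parity]

def reconstruct(stripe, failed_index):
    """XOR of all surviving blocks yields the missing block."""
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    return reduce(lambda a, b: a ^ b, survivors)

# Write the blocks D, O, G (as byte values) plus parity across four disks.
stripe = stripe_with_parity([ord("D"), ord("O"), ord("G")])
# If the disk holding "G" fails, the controller can rebuild it.
assert chr(reconstruct(stripe, 2)) == "G"
```

The same reconstruction works no matter which disk fails – including the parity disk itself – because XOR is its own inverse.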
Perhaps the most important thing we can learn from these two RAID examples is that it is possible to design systems that can improve both fault tolerance and performance. Just remember that if we use RAID 1, we can expect improvements in fault tolerance. What we have done is to provide the system with essentially real-time data backup/restoration for system failures involving a system disk. The cost for this protection is that we must make one of our resources (the redundant disk) unavailable for storage space in anticipation of failure. With RAID 1 only half of the functioning resources are available to us at any time.
RAID 5 offers fundamentally the same level of fault tolerance with respect to disk integrity. But it also improves overall system performance. It does this by sharing the total system load across all the available resources – both primary and redundant. With RAID 5, all of the functioning resources are on-line all of the time.
The basic difference between the two configurations is that RAID 1 is essentially serial in nature while RAID 5 operates in a more parallel fashion.
Generally, I would say that next to a careful backup plan, the best step to take towards system fault tolerance is the addition of a RAID storage area. RAID has the effect of providing real-time data backup/restoration in the case of disk failures. As an added bonus when RAID Level 5 is employed, our efforts to increase fault tolerance also return improvements in overall system performance as well.
When I think about hardware and fault tolerance at the same time, I find that I only really need to ask one question. If this piece of hardware fails, will the system go down? In other words, in this case we need to ask the question of whether or not the power source is a single point of failure for the system. Let me think now, what happens when I pull out the plug?
Of course, the addition of a fault tolerant power source is another one of those no-brainers. Even most home computers are equipped with uninterruptible power supplies (UPS), which allow the user to continue working for a short time in the event of a power outage. It has been my experience that the best possible use for these inexpensive home devices is to allow the user to power down the computer in a controlled fashion. Most of them do not have the ability to sustain any real use for even a short period of time. Enterprise Caliber UPSs are another story. They should at minimum allow for a controlled power down, but it is not unreasonable for these devices to be used as power sources that ensure continuous availability of service. In most true enterprise installations, the UPS devices are used as bridge power sources. They ensure that the system remains available during the amount of time required for the backup generators to light up and take the place of the conventional power source.
The power supply for most computers is a single transformer unit. By now, when you hear the word "single" with respect to hardware, you should be wondering what will happen if the thing breaks. If I only have one power supply for a computer and it breaks will the computer continue to work? No! That means that the power supply (transformer unit) is a single point of failure for the system.
This is different from a power source. A power supply is the physical transformer unit that modifies, splits, and transfers the power from the power source (electrical outlet) to the chips in the computer. If a power supply fails, the computer will shut down in an uncontrolled fashion, regardless of whether or not a UPS is present. An Enterprise Caliber Server should have a continuously available, redundant power supply with automatic failover. If the first power supply fails, then the second one can immediately take over for the first. Ideally, these power supplies should also be hot swappable. In other words, we should be able to treat the power supplies about the same way we treat the disks in a RAID system. If one breaks, we should be able to rip it out and replace it without shutting the machine down.
The network connection is a single point of failure for a single computer. I know that if we followed the same reasoning we used earlier, we might come to the conclusion that we needed a redundant network connection for every computer. In this case, we need to think about the problem a little more. Remember when we talked about the automobile's engine? The engine was a single point of failure for an automobile, but most people don't have spare engines the way that most people have spare tires. That is because it is very expensive to keep a spare engine considering the remote possibility that you will ever need it. In other words, we need to balance the cost of the resource with the cost of the failure of the resource.
In other words when we are designing a network for an enterprise, it is not unreasonable for every workstation (client) in the network to have access to just a single network connection. After all, if the connection between the user and the hub or switch breaks, the most our organization stands to lose is the work that is normally accomplished on that particular client machine:
So, when we are talking about client workstations, a redundant network connection is probably not a worthwhile investment.
But, if we look at the network from a larger perspective – considering the connections that link hubs and switches, then redundant network connections might begin to make sense. We can call this type of network connection a backbone connection:
If we lose a non-redundant backbone connection, then we have lost all of the workstations (clients) connected to that particular hub. Most organizations find it economically feasible to protect against the loss of a backbone connection.
Take a minute to look at the image above. Notice that if any backbone connection is broken, the entire system can still continue to function. Most organizations have networks with this level of redundancy. They ensure a high degree of availability at a reasonable cost.
There is one other place where we need to consider redundant network connections. That place is at the level of mission-critical servers. Just so we can talk, let's develop a working definition for a mission-critical server – we'll just say that a mission-critical server is any server that is required for X number of users to perform their assigned duty.
Now, although these resources may be single computers, if one of these computers is unavailable due to a faulty network connection, then we have put X number of employees out of work until that network connection is returned to service. When we are talking about mission-critical resources, even if that resource is a single server, then we should consider providing a redundant network connection to this resource:
Providing an additional network connection to enterprise resources is a relatively inexpensive way to ensure a high level of availability.
The network card is a single point of failure for the system. Of course, we should think about this problem just about the same way we worked through the network connection problem earlier. In this case, it is not so much a question of whether or not the hardware device is a single point of failure, as it is a question of how much it will cost if the hardware breaks. And, again, we probably can make a rule that if the resource is a mission-critical resource (more than X users depend upon it to work) then it probably needs to be redundant. If it isn't a mission-critical resource, then it probably doesn't need to be redundant.
Although it may not always be practical to have redundant, distinct network connections to a mission-critical enterprise resource, it is imperative that we have redundant network cards installed in every mission-critical server. Of course, this is not as ideal as having both redundant network connections and redundant network cards, but it does offer a higher degree of fault tolerance at a fairly small cost:
Generally, rather than considering each of the individual chips on the motherboard or other electronic component, we consider that the component and the chips attached to it form a single point of failure for the system. By now, we would probably approach the redundancy question from the same common sense perspective, so I won't test your patience by going over it again. That's OK, because in this case, once we determine that we need a redundant resource, we have another complex problem to solve.
The only real way to provide real-time redundancy at this level is to have a second stand-by resource. If your organization can stand a couple of hours of downtime, you might be able to get away with having a well-stocked parts department and a team of highly trained technicians. But there are many, relatively inexpensive, ways of adding even this high degree of fault tolerance to an enterprise. When we begin to add redundant resources as all-encompassing as entire servers to an enterprise, we begin working with something called a cluster of resources. This is such an important idea that we will talk about it in its own chapter, but for now, let's compare it to the RAID systems we talked about earlier.
With the RAID 1 system, we have essentially two resources available – a primary disk and a redundant disk. Remember that with the RAID 1 system, only one of the resources can be available for storage space at any one time. The redundant disk waits ready to take over in case of primary disk failure. We can do almost exactly the same thing with entire servers. We can create a redundant (mirrored) server that stands by passively waiting for the primary server to fail. When the primary server fails, the redundant (mirrored) server goes on-line and takes the place of the failed primary server. With RAID systems we call this disk configuration RAID 1; when we are working with entire servers, we call this server configuration Active-Passive. Like RAID 1, this configuration is essentially serial in nature.
Remember that with the RAID 5 system, we have a number of disks available and all of the functioning disks are on-line all of the time. All of the disks in a RAID 5 system share the entire load placed upon the system more or less equally. Of course, we can do almost exactly the same thing with entire servers. We can create an array of servers that work in parallel to handle the entire load placed upon the system. If one of the servers fails, that server is taken off-line and the entire load is redistributed more or less equally among the remaining servers. With RAID systems we call this disk configuration RAID 5; when we are working with entire servers, we call this server configuration Active-Active. Like RAID 5, this configuration is essentially parallel in nature.
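The contrast between the two server configurations can be sketched in a few lines of code. This is an illustrative sketch only – the server names and the round-robin dispatch rule are invented for the example, not taken from any particular clustering product.

```python
# Hypothetical sketch contrasting Active-Passive and Active-Active dispatch.
# Server names and the request counter are illustrative only.

def dispatch_active_passive(request_id, primary_up):
    """All traffic goes to the primary; the mirror sits idle until failure."""
    return "primary" if primary_up else "mirror"

def dispatch_active_active(request_id, servers):
    """Traffic is spread across every healthy server in the array."""
    healthy = [s for s in servers if s["up"]]
    return healthy[request_id % len(healthy)]["name"]

servers = [{"name": "node1", "up": True},
           {"name": "node2", "up": True},
           {"name": "node3", "up": False}]  # a failed node is simply skipped

print(dispatch_active_passive(0, primary_up=True))   # primary
print(dispatch_active_active(0, servers))            # node1
print(dispatch_active_active(1, servers))            # node2
```

Notice that in the Active-Active case the failed node simply drops out of the rotation and the survivors absorb its share of the load, just as the text describes.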
We use the words fail over to describe the steps that must be taken to allow the redundant resource to replace the primary resource in the system. This process can take many forms. At one level the process of restoring backed-up files is an example of a manual fail over process. If we examine the steps in detail, they probably look something like this:
Preparation for failure. Perform regular backups, which create/refresh the redundant resource.
At time of failure:
Repair the condition that caused the failure of the original resource.
Restore the files on the original resource to their last backed-up state.
As you might imagine, this process takes some time. Let's call the time this process takes the recovery time.
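A quick back-of-the-envelope sketch can make the recovery time concrete. All of the figures below are hypothetical; substitute the repair times and restore rates you actually measure in your own shop.

```python
# Back-of-the-envelope recovery-time sketch for a manual fail over.
# All figures are hypothetical; substitute your own measurements.

def recovery_time(repair_hours, gigabytes_to_restore, restore_rate_gb_per_hour):
    """Time from failure until the resource is back in service."""
    return repair_hours + gigabytes_to_restore / restore_rate_gb_per_hour

def max_data_loss(backup_interval_hours):
    """Worst case: the failure happens just before the next scheduled backup."""
    return backup_interval_hours

# A 2-hour hardware repair plus restoring 60 GB at 20 GB/hour:
print(recovery_time(2, 60, 20))   # 5.0 hours of downtime
print(max_data_loss(24))          # up to a full day of work lost
```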
Of course, there is an alternative to manual fail over processes. An example is the type of fail over offered by the RAID system we talked about earlier. With a RAID system, the fail over process is automated. Whenever any disk in the array fails, the remaining disk(s) automatically assume the responsibilities of the failed resource. To the casual user, the transfer of control from one disk to the other(s) appears instantaneous. But upon closer examination, we find that there is, really, a recovery period. In well-designed RAID systems, the data flow during the recovery period is cached in local memory storage on the disk controller. The local memory storage serves to bridge the processing gap during the recovery period. This bridge is what gives the illusion of continuous operation and instantaneous fail over. Automated fail over processes are not limited to RAID controllers; we utilize automated fail over techniques in all aspects of enterprise design – including clustered servers.
Of course, the primary difference between manual and automated fail over is the length of the recovery time. In general, automated fail over processes are faster than comparable manual processes. But, it can still take several minutes for, say, a redundant server to replace its primary counterpart. One of the most interesting design problems we will work through in this book is the management of data flow during this recovery period. In a well-designed enterprise, the entire system should be able to give the illusion of continuous operation and instantaneous fail over in all but the most catastrophic circumstances.
Fault Tolerance and Load Balancing
As we saw with the Level 5 RAID system, one of the fringe benefits we can get from designing a fault tolerant system is the ability to use the redundant resources to share the load on the system. Let's go back to the spare tire example for a minute. Remember when we talked about the process of rotating tires. One of the things we did in that process was to move the spare tire from the trunk and place it on one of the axles. This allowed us to share the wear and tear more evenly among all 5 tires. The result of this process is to extend the usable life of all 5 tires – including the spare. Our fault tolerant resource, the spare tire, handled a more or less equal portion of the entire load when compared to the rest of the tires.
Let's compare this with RAID systems we talked about earlier. With the RAID 1 system, we have a primary resource and a duplicate resource – the primary disk and the mirrored disk. We utilized these resources in a serial fashion. We write to the primary disk, and through some hardware magic what we wrote on the primary disk is copied to the mirrored disk, in what appears to be real-time. In this scenario, the only benefit we get from the duplicate resource (mirrored disk) is a real-time backup/restoration for disk failures. The duplicate resource does nothing to assist the primary resource with its everyday tasks. With RAID 1 there is no load balancing, no parallel processing, and no improvements in scalability. RAID 1 is serial in nature.
Next, let's consider the RAID 5 system. With the RAID 5 system, all of the disks in the array are available as a primary resource (and all of them are available for redundant or backup duty as well). When we write to the disk array, we are in essence writing a little bit of data to each of the disks in parallel. We have already discussed how this increases the available bandwidth (throughput) for the entire array. With the RAID 5 model, we are utilizing parallel processing. This distributes the load across all of the available resources. RAID 5 systems are more scalable than RAID 1 systems, because RAID 5 is parallel in nature. That brings us to the next broad design stroke we need to consider when we are delivering an enterprise – scalability.
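The striping idea can be illustrated with a toy example. This sketch simply deals bytes out across a list of "disks" and reassembles them; real RAID 5 also interleaves parity blocks for redundancy, a detail omitted here for brevity.

```python
# A toy illustration of striping data across an array of "disks".
# Real RAID 5 also interleaves parity blocks; that detail is omitted here.

def stripe(data, disk_count):
    """Deal the data out across the disks, one byte per disk in turn."""
    disks = [bytearray() for _ in range(disk_count)]
    for i, b in enumerate(data):
        disks[i % disk_count].append(b)
    return disks

def unstripe(disks):
    """Reassemble the original data by reading the disks in stripe order."""
    total = sum(len(d) for d in disks)
    return bytes(disks[i % len(disks)][i // len(disks)] for i in range(total))

disks = stripe(b"ENTERPRISE", 4)
print([bytes(d) for d in disks])  # each disk holds roughly 1/4 of the data
print(unstripe(disks))            # b'ENTERPRISE'
```

Because each disk holds only a fraction of the data, all four disks can be reading or writing their fraction at the same time – which is exactly where the parallel throughput gain comes from.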
Let's get this one straight at the start.
Scalable means that a system is capable of growth. It does not mean that your enterprise must be capable of servicing the needs of tens of thousands of users the first day you turn the key. It does mean that the system should be capable of growing large enough to meet the needs of any future number of users.
Historically, scalability has been the place where most PC-based development has fallen short. This was in part due to the general lack of processing power that older x86 processors offered. But, it probably had more to do with the operating system's inability to perform real multi-user/multi-tasking processing. There is both good news and bad news on this front. The good news is that these hardware/OS shortcomings have been all but eliminated. The bad news is that we face another, more deeply entrenched barrier to scalability – antiquated development/deployment practices. These antiquated practices are the expected progeny of a system where multi-user/multi-tasking processing capability did not exist. Unfortunately, it is a lot harder to change old habits than it is to replace a motherboard or upgrade an operating system.
Developers and system integrators have grown accustomed to viewing the PC as an individual resource. This is a mistake. The real power of this class of machine is that there are hordes of them. With a little effort and maybe a lot of imagination we can combine their capacities into a virtually unlimited amount of processing power. When we use multiple machines to handle processing, we can increase fault tolerance and optimize the performance of the entire system through load balancing.
In this chapter, we will take a look at load balancing as well as a number of other techniques you can use to ensure that your enterprise caliber system is capable of growing to meet your organization's future needs. The first step in that direction is to be able to identify and separate the processing requirements.
Major Process Isolation
One of the first things we learn in engineering is that in order to solve any problem we must first define the problem. The second thing we learn is that the better we define the problem, the closer we are to the solution. For a long time now, there has been a huge, virtually undefined problem with most PC-based information system installations – the lack of major process isolation. Let's take a look now at the two major processes that should be separated.
OLTP – On Line Transaction Processing
Over the past 20 years or so, there have been great advances in the capacity of information processing systems. While it is true that we can attribute a lot of the improvement to better hardware and operating systems, there has been another, quieter evolution going on. This evolution is responsible for the current techniques we use for storing and working with information at the database level. Over time, we have gradually reached the point where we have driven most of the redundancy out of our well-designed database systems. As an industry we have learned to design and optimize our systems around a unit of work we call a transaction. A transaction is one or more actions that are either performed as a whole or not performed at all. The optimal transaction uses the smallest possible data set that will get a particular task accomplished. This makes sense. At its core, every computer action manipulates a set of binary data. The smaller that set of binary data is, the faster we can process it. This type of processing is called On-Line Transaction Processing (OLTP).
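The all-or-nothing character of a transaction can be demonstrated with Python's built-in sqlite3 module standing in for an OLTP database. The account table and the transfer rule below are invented purely for the illustration.

```python
# The all-or-nothing character of a transaction, sketched with Python's
# built-in sqlite3 module. Table and column names are invented for the example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 100)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move funds between accounts; both updates succeed or neither does."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            if conn.execute("SELECT balance FROM account WHERE id = ?",
                            (src,)).fetchone()[0] < 0:
                raise ValueError("insufficient funds")
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False

print(transfer(conn, 1, 2, 30))    # True  - both rows updated
print(transfer(conn, 1, 2, 500))   # False - neither row changed
print(conn.execute("SELECT balance FROM account ORDER BY id").fetchall())
# [(70,), (130,)]
```

The failed transfer leaves both balances untouched – the debit that had already been issued is rolled back along with everything else in the transaction.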
Of course, nothing is free. The problem with OLTP is hidden in one of the words we used to describe the process, that word is optimize. When we optimize any system to specialize in a particular task, we almost without exception, lessen that system's ability to perform different types of tasks. Think of a giraffe; this animal is optimized for grazing at the treetops. However, if a giraffe had to exist in grassland, devoid of trees, I am sure it would consider the experience nothing more than a big pain in the neck. Systems optimized to handle OLTP have essentially the same problem. They are finely tuned systems that thrive in a carefully balanced universe. If we disrupt the harmony of an optimized system in any way, we do more damage than we would if we did exactly the same thing to a non-optimized system. Now, the easiest way to throw an OLTP system off kilter is to ask it to process a huge multi-row, multi-table query. Remember that this system is optimized to work with the smallest possible data set. OLTP systems are not designed to handle the type of questions our users typically need to ask. Does this mean that we need to abandon everything we have learned about OLTP in the last 20 years? No!
What it means is that we have to learn to accept the OLTP system for what it is. It is as perfect as a giraffe. It offers the finest model we have for working with an incredibly huge number of very tiny, generally repetitive transactions, such as checking the balance of a bank account. What this means to the enterprise, is that we can use OLTP to handle approximately half of the work it will typically be asked to perform. What we need, now, is an equally optimized technique for handling the other half of the work.
DSS/OLAP – Decision Support Systems/On Line Analytical Processing
We call that other system the Decision Support System (DSS). There is also another phrase that people use to describe Decision Support Systems these days - On-Line Analytical Processing (OLAP) systems. I will use the two phrases interchangeably throughout this book. I have a feeling that the industry will move in the direction of accepting the OLAP moniker, but there are an awful lot of Decision Support System teams out there, so you never know.
Anyway, while the OLTP system is best characterized by its lack of redundancy, the DSS system revels in redundancy. We won't go into all of the details here, but suffice it to say that in a DSS system, it wouldn't be unusual to find a particular city name stored repeatedly 100,000 or more times. While in a well-designed OLTP system, I would be surprised to see any city name stored more than once – period.
Of course, the DSS system is not designed for real-time transaction processing; it is really something of a data warehouse, i.e. it is optimized in a manner that makes it easy to get answers to questions, even complex enterprise-wide ones, in a very short time. This gives our users the type of tool they can use to find out how many Snickers candy bars were purchased from a particular store between the hours of 3:30 pm and 5:00 pm on the 5th of June 1998. The way the DSS is optimized, they could also find out how many Snickers candy bars were sold throughout the world during the same time frame with a couple of mouse clicks. In this case, we have optimized the system to make it easy to query. We do this at the cost of storage space, processing time, response time, and more. But all that really means is that this portion of our system has something of a different personality.
We expect the OLTP portion of the system to respond immediately to all requests, updates, etc. This system is optimized (at the database level) for speed first and then ease of use when dealing with single items (one employee, one customer, one purchase etc.). OLTP is a speed demon!
On the other hand, we expect the DSS portion of the system to provide us with easy access to answers for difficult (maybe formerly impossible) questions that span multiple (enterprise wide, all the employees, all the customers in France, etc.) items. This system is optimized (at the database level) for ease of use first and then speed when dealing with multiple items. DSS is easy to query.
Please note that when I refer to speed or ease of use, I am considering the system from a programmer's perspective. From the casual user's perspective, both systems should be easy to use and provide reasonable response times. Most users do not have trouble facing the reality that it takes slightly longer to retrieve answers to enterprise-wide questions like – How many Snickers bars were sold in Texas during the month of June? - than it does to retrieve other answers like – What is the suggested retail price for a Snickers bar in Texas? Users understand this; it is common sense. But many IT departments operate as though these two questions are fundamentally the same. They are not. They utilize system resources differently. They must be handled separately.
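The two kinds of questions can be phrased against a toy table to show how differently they touch the data. The table layout and the figures in it are invented for the example; only the shape of the two queries matters.

```python
# The two Snickers questions from the text, phrased against a toy sqlite3
# table. The point lookup is OLTP-shaped; the aggregate is DSS-shaped.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (state TEXT, month TEXT, units INTEGER, price REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)",
                 [("TX", "June", 120, 0.55),
                  ("TX", "June", 80, 0.55),
                  ("CA", "June", 200, 0.60)])

# OLTP-style point lookup: touches one tiny data set.
price = conn.execute(
    "SELECT price FROM sales WHERE state = 'TX' LIMIT 1").fetchone()[0]
print(price)        # 0.55

# DSS-style aggregate: sweeps many rows to answer a broad question.
total = conn.execute(
    "SELECT SUM(units) FROM sales WHERE state = 'TX' AND month = 'June'"
).fetchone()[0]
print(total)        # 200
```

On three rows both queries are instantaneous, but the aggregate's cost grows with the size of the table while the point lookup's does not – which is exactly why the two workloads deserve separately tuned systems.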
Whenever I hear the word scalable or scalability in an enterprise design context, I immediately substitute the phrase parallel processing. The best way to ensure that a system is scalable (capable of growth) is to design the capacity for parallel processing into the system from the start. Parallel processing may be a foreign concept for most developers, but fear not; we have already taken a couple of huge steps towards understanding parallel processing while we looked at major process divisions and fault tolerance.
It is not difficult to understand the increases in fault tolerance and performance that parallel processing offers. But there are a number of ways to design a system that utilizes parallel processing. We need a little more information before we start making design decisions concerning parallel processing. In one of my Chemical Engineering classes, I had a Process Design professor that used to make the biggest deal out of continuums. It drove me crazy. In every one of his classes, he had to waste at least 5 precious minutes of my life talking about the difference between discrete and continuous values. I always wanted to scream, "WE GET IT!!! NOW SHUT UP AND GET ON WITH THE CLASS." I never yelled, and to this day I still don't understand his fixation with the subject, but I am going to talk about it anyway. Please bear with me.
Perhaps due to my naturally binary tendencies, I have come to realize that my professor had something of a point. While it is true that everything is not black and white, we can choose to define some gray color as the black and white boundary. In other words, if we were working with a continuous range of values of 1 to 100 representing white to black, we could set the black and white boundary at 50 and say that everything greater than 50 is a black number and everything below 50 is a white number. Therefore 14 1/3 would be a white color and 97.0004 would be a black color.
This kind of generalization makes dealing with the continuous process much simpler. What we need to do in this section of the chapter is to define something of an informal parallel processing/serial-processing boundary. In other words, when we say that a system performs parallel processing, we should really say that the system kind of has parallel processing capabilities. Think about it; all computers perform some tasks in parallel. Parallelism is a kind of continuum; there are many degrees of parallelism that a system can exhibit. For instance, these days most enterprise caliber servers and many workstations have multiple processors.
Multiple Processor Servers
Multiple processor machines are everywhere these days. These give our system parallel processing capabilities, don't they? Well, yes and no. Machines with multiple processors do process some tasks in parallel. They do perform load balancing, and it has been my experience that they do this very well. However, I don't think we can truly say that a system made up of computers with multiple processors is a system with true parallel processing capabilities.
What machines with multiple processors do is to allow us to utilize more of the resources of the machine in question. Say that we have a machine with a system (bus, disk IO, etc.) that can transfer 100 units of data per second. Let's also say that each CPU on the machine is capable of processing 25 units of data per second. If this machine has only 1 CPU, then at best we can expect 25% utilization of the system's resources. If we add a second CPU, then we might expect to utilize 50% of the system's resources. I won't step through every possibility, but consider what happens when we add the 5th CPU. Can we transfer 125 units of data through the system? No! Is this computer scalable? Well, yes and no. If we purchased this box with 1 or 2 processors, then we can expect it to grow until it has a total of 4 processors. If we purchased this box with 4 processors, then it cannot continue to grow. In short, there is a real limit to the amount of processing that we can perform with a single machine. Most manufacturers have found that the real improvements in performance begin to decrease after 4 processors. And with present technology, anything more than 8 processors in a single computer does not significantly improve the computer's performance. What multiple CPU machines do is to move the limit from the processor to another limit in the same box. The bottom line is that there is a limit. That means that for a truly scalable (parallel) system we need another answer.
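The arithmetic above reduces to a one-line function: a box's throughput is capped by its slowest shared resource. The per-CPU and bus figures mirror the example in the text.

```python
# Why adding processors to one box stops helping: throughput is capped by
# the slowest shared resource. The figures mirror the example in the text.

def box_throughput(cpu_count, per_cpu_rate=25, bus_limit=100):
    """Units of data per second a single machine can actually move."""
    return min(cpu_count * per_cpu_rate, bus_limit)

for cpus in (1, 2, 4, 5, 8):
    print(cpus, "CPUs ->", box_throughput(cpus), "units/sec")
# 1 CPUs -> 25 ... 4 CPUs -> 100, and the 5th CPU buys nothing at all
```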
It almost goes without saying that we should try to maximize the performance of every piece of our enterprise. So, depending upon the performance requirements of a particular component (read component in this context as one of the servers in our enterprise), multiple processors are an excellent way to ensure high utilization of the component's resources. But if we want to design a scalable system with true parallel processing capabilities, we need to do more than just add processors to individual servers. We need to add more servers.
If we can dedicate multiple machines to a particular task, then we have increased the fault tolerance of the system. If one of the machines or any part of that machine fails, we should be able to replace it with the other machine(s) designed to perform that task. The question is how best to utilize the redundant resources.
We could do that by having a stand-by server for each server in our enterprise. This would give us something like the RAID 1 system – Active-Passive. It would increase the fault tolerance of our system, but it wouldn't do anything to improve the system's overall performance characteristics. It would also mean that we have at best only 50% of our enterprise's resources available at any one time.
There is another possibility. We can model our multiple server solution after the RAID 5 system instead of the RAID 1 system. This would give us an Active-Active server configuration. If we do that, then every server in the enterprise, including the redundant servers, will be available all of the time unless one of them fails. We can then plan to utilize their resources in parallel, nearly the same way we utilize the RAID 5 system's disks. This places 100% of our enterprise's resources on-line rather than the 50% a serial solution would offer.
This is the point on the continuum where we will draw the boundary line where we consider that the system is capable of true parallel processing – when we have multiple servers dedicated to a single processing task available in an Active-Active configuration. Now that we have a common point of reference, let's take another look at process division.
Major Process Load Balancing
The first step in load balancing is to segregate the major processes that utilize the servers' resources in wildly different manners. For any but the smallest of installations, this means that we need entirely separate database systems for the OLTP and DSS portions of our enterprise (read system in this context as every piece of hardware/OS/software required to move data from the user to the storage area and back). In some installations this may mean a single server for the OLTP portion and a second for the DSS portion, in others it may mean tens of servers for each portion. The bottom line is that every enterprise is comprised of at least two physical systems – the OLTP and the DSS systems:
Minor Process Load Balancing
The next step in load balancing also involves processes. In this case, we are attempting to balance a finer grain of processing. To avoid any confusion let me say at the outset, that we must perform an identical minor process analysis on both the OLTP and DSS portions of the enterprise. In this section, we will walk through each of the minor processes. We will start with the processes that occur closest to the data storage area and work outward through each of the other processes we must perform to either deliver data to or receive data from the end user. Basically, this minor processing includes four kinds of processing:
Data storage
Data manipulation
Data/Business rule integration
Presentation
Data Storage Processes
The processes that are used to store data are quite different from the other processes in our enterprise. Take a second to think about the physical/mechanical actions that must take place when we work with physical data storage.
If we design our system correctly, this may be one of the few places where the system is actually performing reads and writes to an actual physical disk. If we compare this type of activity with another type of activity, say working with a dynamic link library that has been previously loaded into memory, we see that the two activities are so dissimilar that we might spec out entirely different servers to perform each action.
In other words, for a server that spends most of its time reading from and writing to physical disks, like a typical relational database, we would be wise to spend the lion's share of our equipment budget on the physical disk subsystem.
We can improve overall system performance if we design this server to handle the specialized task of performing physical disk reads and writes. Of course, this means that we are building something as specialized as a giraffe. When we tune this machine's performance characteristics for improved disk IO, we are probably making it somewhat inefficient for another task. Is this specialization the best course of action? It depends; in a pinch you can drive in a screw with a hammer, and I guess that you can pound in a nail with a screwdriver. But I don't know too many professional carpenters that would advocate either practice.
I always liken the data storage area of an enterprise system to a precious gem, a diamond. The bottom line of most computer activity is to either place information into the system or retrieve information from the system. Information is the most precious commodity that an organization can possess. Even if your organization mines diamonds for a living, it won't profit one whit unless it knows where the diamonds are located and can deploy its assets in an economically feasible manner to retrieve the gems. Information is what makes this possible. The organization's information is the data store.
Please don't take this to mean that all of the information for an organization should be stored in a single server or bank of clustered servers. This would make programming simple and life wonderful, but it ignores reality. In every organization, the data is typically stored in everything from mainframes to Excel spreadsheets. What we need to do is to strive to design a system that allows us to work with our organization's data as though it did exist on a single machine. When we make decisions about how to distribute processing across the available infrastructure, we must always remember that the goal of distributed architecture is to distribute the processing – not to distribute the data. In this book, we will focus on the data store server as a controllable device – a relational database, but the overall design of the infrastructure must give us the ability to point the entire system at a different data store without skipping a beat.
What this really means is that we have identified a second set of processes that must exist close to our data store. These processes must be able to wrap around our enterprise's data store and make it appear to be a single source. Let's call these processes data manipulation processes.
Data Manipulation Processes
Data manipulation processes are designed to know where an organization's data is stored and know the steps that must be taken to retrieve, remove, or change that data. These processes do not typically perform tasks like disk IO themselves. They usually invoke other processes, most of them on another physical machine, to handle the physical writes and reads. As I said above, the data manipulation processes are designed to wrap around the core of data available to our organization and give the programmers, and the users, access to the entirety of that data as though the data existed on a single machine. In reality, that data may exist on a centralized SQL Server machine, but it may also exist in legacy machines like mainframes, tired old UNIX systems, in flat files, and maybe even somewhere out on the Internet.
Once the data manipulation processes have located the correct storage area and have executed the proper commands required to invoke those storage processes, they have one additional responsibility. They must package/unpackage (marshal) the data for transport either to or from the data storage process they have identified for the particular data set they are working with.
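The marshaling duty can be sketched as a pair of package/unpackage functions. The json module stands in here for whatever wire format an enterprise actually uses; the row layout is invented for the example.

```python
# A sketch of the marshaling duty of a data manipulation process: package a
# result set for transport, unpackage it on the far side. The json module
# stands in for whatever wire format the enterprise actually uses.
import json

def marshal(rows):
    """Package a result set into a flat string suitable for transport."""
    return json.dumps(rows)

def unmarshal(payload):
    """Unpackage the transported string back into usable rows."""
    return json.loads(payload)

rows = [{"id": 1, "city": "Columbus"}, {"id": 2, "city": "Dayton"}]
wire = marshal(rows)
print(type(wire).__name__)        # str - ready to cross a machine boundary
print(unmarshal(wire) == rows)    # True - nothing lost in transit
```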
In many ways, it is like the data manipulation processes replace traditional database tables and flat files from the programmer's perspective. We use these specialized processes to encapsulate or hide the sometimes messy things we have to do to work with these devices – database tables, flat files, a web site on the Internet, etc. We create these specialized processes, in part, so that we can solve the data access problem once and be done with it. But we have also done something else – something far more important. We have given ourselves a way to separate out one of the processes our system needs to perform. This process is now freed from the monolith. We can work with it as a separate entity. We can move it from one machine to another, we can run the process simultaneously on several machines if we want to or need to. If we find that this particular process demands a little extra from our system, we can address its needs directly without having to make changes to every other portion of the system.
In this case, we have learned that we can think of data manipulation processes as a set of functions that cover an organization's data. They are intended to give us a kind of intelligence about the underlying data that would be difficult to come by otherwise. Rather than demanding that every developer and user understand the curious and perhaps confusing design of the underlying data store, these functions are designed to provide access to that data without regard to its particular physical storage context. When you spend a lot of time identifying and working with different sets of processes, you come to find that processes have personalities or characteristics. In general, the data manipulation processes are essentially functions. Their personality is strictly business – they execute a task and return a value. They perform that task as quickly and as efficiently as possible and do not hang around for long periods of time.
It is important to segregate the data manipulation processes from the data storage processes. While the physical act of storing data is a disk-intensive process, the data manipulation processes rarely perform any disk IO. A well-designed data manipulation server should spend most of its time making calls to a series of dynamic link libraries. Ideally, these dynamic link libraries should already be loaded into fast random access memory (RAM). The disk IO in this process should be only that amount required by the operating system for memory swaps. Of course, there should be enough system memory and raw processing power to ensure that the disk-based memory swaps are as infrequent as possible.
The next set of minor processes we must consider are the processes that integrate the data we have stored with the business rules our organization has chosen to apply to the data. We can call that set of processes the data/business rule integration processes.
Data/Business Rule Integration Processes
This set of processes is responsible for integrating the data the organization stores with the organization's business savvy or talent. These processes are kind of like the work a master carpenter performs when he turns six oak boards into a piece of fine furniture. For a retail company, the organization's talent (business rule integration processes) may be the carefully crafted functions that set the retail price of an item at exactly the point that maximizes sales volume while simultaneously maximizing profits.
At first glance, the way these processes execute physically is nearly identical to the data manipulation processes we just considered. But if you look closely enough you will find that there are two real differences in the way these processes perform their task.
The first difference between the two is what the processes know. While data manipulation processes know about the organization's data, the data/business rule integration processes know about the organization's talent. Their job is to pull together data and business rules, and by combining the two, somehow increase the value of both. Again, if data were wood, the business rules would be the carpenter's talent or skill. This difference doesn't seem to change our processing requirements at all. By the time these processes see the data, it is delivered in memory (or at least available through some cache/virtual memory facility). We do not have any disk IO issues to contend with here. Why do these processes deserve a different category if they are so similar to the data manipulation processes?
I am sure many of you are prepared to argue that both types of processes use a series of dynamic link libraries. In both cases, the dynamic link libraries should be loaded into the host machine's random access memory for optimal performance. The division between these two sets of processes is really more of a logical (theoretical) difference than a physical difference. So, unless the physical proximity between the data store and the presentation sphere dictates it, there is no reason to separate these processes onto different machines.
Did I mention the second difference? No? Oops!
The second difference between these two sets of processes is found in their personality. The data manipulation processes are just-the-facts type of guys. You tell them what you need. They deliver it. They shut down. (Well actually, they cycle, but that is another story.) The data/business rule integration processes have a different personality. They might hang around for a while. Instead of being like a single function, they offer an assortment of goodies that a programmer might need to use for a while. So, although they are functionally (from a purely technical point of view) identical, they do actually take up the server's resources quite differently. To sum it up, the second difference between the two types of processes is the amount of time each one hangs around before they shut down. Generally, the data manipulation processes execute quickly and shut down, while the data/business rule integration processes tend to stick around in memory for a while.
If we do our jobs correctly, even this time difference will be measured in milliseconds, but those little guys add up. Our purpose here is to identify the difference so that we can choose to do something to manage it in our enterprise's design.
Now that we have identified this set of processes, we can peel them away from the monolith that most developers see as an application. As with the data manipulation processes we discussed earlier, once we have separated them, we can address their needs specifically. In this case, we have a set of processes that generally take longer to complete than their related data manipulation processes. When these two sets of processes had to be treated as a single unit, we could not do anything to address this difference. Now that we have separated them, we can immediately see one way to make our system more scalable. Because these processes take longer to execute, we should instantiate more of them than the number of data manipulation processes running at any one time. Think about this carefully. This simple notion is one of the most powerful allies we have for making our systems more scalable. If a process takes longer to complete, then we can even up the processing load across an entire system by running more instances of the slower process than of the faster process.

I learned this concept as something called the critical path method. All it really means is that we can improve system performance by identifying potential bottlenecks and doing something to cure them. When we treat an entire system or application as a single set of processes, we give up the ability to tune the system at this level. If we don't separate the data manipulation processes from the data/business rule integration processes, then each data manipulation process will require the same amount of time from our system as the longer data/business rule integration process that spawned it.
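To make the critical path idea concrete, here is a rough sketch in Python. The timings are invented purely for illustration; the point is only the arithmetic: a stage that takes four times as long needs four instances running to keep pace with the faster stage.

```python
# Sketch: balancing instance counts so each processing stage can sustain
# the same throughput. All names and timings here are hypothetical.

def instances_needed(stage_time_ms, fastest_time_ms):
    """Instances of a stage required to match the throughput of one
    instance of the fastest stage (ceiling division)."""
    return -(-stage_time_ms // fastest_time_ms)

# Invented timings for the two sets of processes discussed above.
data_manipulation_ms = 5   # quick: execute and shut down
business_rule_ms = 20      # slower: hangs around in memory for a while

fastest = min(data_manipulation_ms, business_rule_ms)
print(instances_needed(data_manipulation_ms, fastest))  # 1
print(instances_needed(business_rule_ms, fastest))      # 4
```

In other words, once the two sets of processes are separated, the slower one can simply be given more instances, which is exactly the tuning opportunity the monolithic view takes away from us.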
If you think about it for a minute, you might come up with another place where we could do some tuning. Just like this process spawned a data manipulation process, there is another process that spawns the data/business rule integration process – exactly, the presentation process.
Although it may not be too intuitive, the presentation set of processes is more like the data storage set of processes than like either of the two intermediate sets we just discussed. Think about it.
The GUI's job is to present or receive data to or from the end user. The real difference between the data storage and presentation set of processes is where we write the end result of the process. While the data storage processes primarily write to the system's disks, the presentation processes primarily write to the end users' workstation screens, files, and printers. Nevertheless, both sets of processes are concerned with changing fleeting binary data transmissions into something of a more persistent nature.
The most important thing to realize about presentation processes is that they are by definition the slowest processes in our system. They must wait until the other processes have completed before they can begin to perform their task. If you are thinking that we have identified another bottleneck in the critical path of our system, I agree with you completely. When we think about the impact of the presentation processes on our system, we need to consider where these processes really execute. If they are really executing on the clients' desktops, as long as we separate them from the other processes we have done our jobs of distributing the processing load across the entire system. But, if they execute on servers, we might need to give these guys more weight in our design considerations. In the case of the application we will be focusing on in this book, we will be using Internet Information Server to handle the presentation processes.
If we utilize an Internet Information Server (IIS) machine in our enterprise, then we are in fact performing a lot of the GUI programming at the level of the server(s). True, the IIS machine does not perform the actual screen writes, but when someone asks it a question, it must look to the disk to find the correct page, read it into memory, and execute the instructions on the page.
Notice that we have a disk IO issue here, just as we did with the data storage processes. If we do our jobs well, we should be able to limit this activity, but it is important to consider it in our design analysis.
With Active Server Pages (ASP), the instructions that the server executes at the server are, in large part, designed to cause the data to be placed on the end user's screen in a specific way. That is a lot of server-side processing dedicated to the user interface. So I think it is reasonable to place the processes that define the look of the user's screen (even if they execute on a server) in the same bunch of presentation processes as the actual screen write processes. If we add this set of processing responsibilities to the ones we covered earlier it means that, in total, we have identified four minor processing divisions that exist within the two major processing divisions we have for every enterprise system.
Now that we have taken the time to identify and break out the different processing requirements, we have gained the ability to tune our system at the level of the individual processes. This means that rather than solving every processing shortfall at the level of an entire application, we can focus our attention on the actual source of the shortfall. This ability to add resources where they are needed will allow us to design an efficient physical system that can grow to meet the needs of any number of users. In the next chapter, we will see how we can mirror the concept of logical processing divisions across a physical system composed of many servers. We will discover that this division of processes is really a lot more than just a thought problem. We will find that the steps that we have taken to divide the processes will translate into a set of components that can be distributed across as many machines as we need to handle any number of users. The ability of a system to grow to meet the needs of any number of users is the definition of a scalable system.
So far, we have covered the topics of availability and scalability. I am sure that in many readers' minds that means that we haven't even considered the third broad design stroke of security yet. I am afraid that I would have to disagree. As far as I am concerned, security is a lot more than something we can simply drape over an existing system. Security must be designed into even the tiniest element of the system at the time that element is designed.
Personally, I have a little trouble with this one. When most people think about enterprise security, their minds immediately turn to external threats. They conjure up images of some caffeine-swilling outcast spending untold hours trying to break into their system. And many really believe (perhaps with some level of justification) that their competitors have hired industrial espionage experts to break into their system and steal their company's secret designs for men's mini-skirts or something. Sadly, the same decision-makers that will cheerfully spend sinful amounts of money battling these unlikely windmills cringe at the thought of spending 1/1,000th the amount on something as basic as a redundant server. When we say that an enterprise is secure, it means a lot more than that the enterprise is safe from some external threat. In real life, it doesn't matter whether the rat that brought the system down was some cracker spoofing an IP address or an actual rodent that chewed through a non-redundant power cord. In either case, the result is the same. The system is no longer available to the users.
Does this mean that we can ignore the crackers, crooks, and spies? Of course not! These can be real threats. They should be given their due. But, we do need to stop thinking about security threats solely as external menaces that break into our system from the outside. The number of real threats any enterprise system faces is huge. What we need to do is to modify our definition of a secure system. When we say that an enterprise is secure, what we should be saying is that the authorized users of that system and the organization that owns it can count on the system to protect their interests.
Let's play a little game. Imagine for a moment that we could design and implement a system that was 100% safe from external threats. Can we now say that this system is secure? Does it protect the organization's interests? Let's throw a couple of thought problems at it and see how this perfect security system fares.
Problem: A real rat decides to commit suicide by chewing through a non-redundant power cable to a non-redundant server that houses all of the organization's data.
Outcome: The system is down. The external threat security system did not protect the organization's interests.
Problem: The company's new marketing program was a stellar success. The system responsible for taking and filling the orders cannot keep up with the demand. Some orders are not being filled.
Outcome: The organization has lost the profits from the orders that it failed to fill. The external threat security system did not protect the organization's interests.
Problem: Stan in accounting decides that he will use his security privileges to transfer some of the money from the organization's savings account into his own savings account.
Outcome: This threat comes from within. The external threat security system did not protect the organization's interests.
Problem: Crumb, the criminal cracker, spends one man-year attempting to breach the enterprise's new security system.
Outcome: Crumb is foiled. The external threat security system protected the organization's interests.
Calm down! I know that we don't typically view all of these things as security threats. That's unfortunate. I would love to do a study comparing the real cost of the first three items against the fourth item on the list. I am sure that I would find that most organizations do a pretty good job of shooting themselves in the foot without any outside help. You don't have to tell anyone else, but I want you to take a minute or two, look to your own experience, and add up all the time you have spent doing battle with ingenious crackers. Done? OK, now add up all the time you have spent doing battle with idiotic internal threats. Enough said?
When I consider the things that really pose a threat to an enterprise's users' or owners' interests, I usually end up with a roadmap that looks something like this:
Availability
Scalability
Internal access control
External access control
Throughout this book, we will revisit these four subjects time and time again. They are the cornerstones upon which we will construct a secure enterprise system. We have already taken a look at the first two concepts earlier in this chapter, so let's concentrate on the last two - internal access and external access - for the remainder of this chapter.
Security means Controlling Access to Resources
Windows NT has achieved a C2 security rating from the US Department of Defense (DoD). That sounds pretty impressive, but what does it mean to a real enterprise installation? It means that the Windows NT operating system has the potential to be the basis for an extremely secure enterprise.
One of my great joys in life is creating art. So to indulge myself, every once in a while, I splurge on some ridiculously expensive graphics software. One time, a friend of mine asked me what I could do with the high-end graphics application I had just purchased. The question took me by surprise. At first I thought she was kidding, but I could see by her expression that she really wanted to know. So, I thought about it for a second or two before I gave her my reply. Then I asked her this question. "What can a carpenter do with a load of wood?" She paused, smiled knowingly and said that she "guessed it depended upon the carpenter."
Windows NT manages security by controlling access to objects. To Windows NT, everything is an object – files are objects, users are objects, printers are objects, and so on. When an authorized user logs onto NT, the OS creates a unique instance of a user object to represent the physical user, and assigns that object an ID. We can call that ID an access token.
Every time the user attempts to access any object, the OS checks to see whether or not that particular access token has permission (in NT this is known as a privilege) to access the object in question. This means that NT gives us the ability to control access for any access token all the way down to the object level. Incredible! But it doesn't stop there. Controlling access at the object level offers a lot of security potential, but we can actually make the grain of access control even finer. Consider that we can also specify in the privileges exactly how a particular access token may interact with the object in question. One access token may have read-only privileges, while another access token, representing the user sitting just to the right, may have write, execute, or some other combination of interaction privileges. Powerful! If you agree that this level of control is really something, you may be amazed to learn that, just to tie a bow on the whole security process, NT will faithfully keep track of every interaction every user has made with every object. How's that for a load of wood?
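The model just described - per-token privileges on individual objects, plus a record of every interaction - can be sketched in a few lines. This is a toy Python illustration of the idea only; the real NT security reference monitor, access tokens, and ACLs are far richer than this, and every name below is invented.

```python
# Toy sketch of token-based, per-object access control with auditing.
READ, WRITE, EXECUTE = "read", "write", "execute"

class SecuredObject:
    def __init__(self, name):
        self.name = name
        self.acl = {}        # access token -> set of privileges
        self.audit_log = []  # (token, action, allowed) for every attempt

    def grant(self, token, *privileges):
        self.acl.setdefault(token, set()).update(privileges)

    def access(self, token, action):
        allowed = action in self.acl.get(token, set())
        self.audit_log.append((token, action, allowed))  # track everything
        return allowed

payroll = SecuredObject("payroll.mdb")
payroll.grant("token-alice", READ, WRITE)
payroll.grant("token-bob", READ)

print(payroll.access("token-alice", WRITE))  # True
print(payroll.access("token-bob", WRITE))    # False - read-only token
print(len(payroll.audit_log))                # 2 - every attempt recorded
```

Notice that the two tokens interact with the very same object in different ways, and that even the denied attempt leaves a trail - which is exactly the combination of fine-grained control and auditing described above.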
The Windows NT security model gives us a lot of raw material to work with, but it won't do anything if we don't use it correctly. This is not a book about NT administration, so I am not going to rehash the "remove the guest account" type of advice. If you are responsible for that type of thing, then I am sure you have several good books on the subject. Instead, I am going to touch on a couple of basic topics that we will build on during the programming and application development portions of the book. I will try to approach the security topic with the same common sense approach that we applied to the fault tolerance issue earlier. In other words, we will try to identify some security issues, and then we will consider how to deal with them. Rather than trying to end up with some rigid list of dos and don'ts, we should try to develop something akin to a common sense approach that we can use whenever we have to deal with a security issue.
Charity starts at home. So, before we venture off into the dark realm of Bastion Hosts, Belt & Suspender firewalls, Point-to-Point-Tunneling-Protocol and the like, we need to take an inventory of our efforts to control access to resources on our site. This internal examination is a must, before we even begin to consider the dangers from without.
It is possible to configure Windows NT so that the most potentially dangerous activities must be performed at the system console. In other words, the person who wishes to inflict harm on the system by performing these activities must be physically sitting at the computer to gain access to the objects on that computer. This means that we can exercise a high degree of control over a system's resources simply by limiting the number of individuals that have physical access to the mission-critical servers. Our first line of defense is to limit the individuals that can gain access to the equipment.
A common sense rule might state that the enterprise's valuable resources are safer when they are physically out of the reach of the bad guys. Most organizations wouldn't dream of leaving 50 or 100 thousand dollars in cash lying around in a room without a lock. But it is not unusual for the same organizations to have mission-critical servers sitting in rooms with minimal or no physical security.
If a resource is valuable enough to require security, lock it up.
When a user attempts to log onto any Windows NT workstation, the user is prompted for a username and password. The combination of username and password is run through a reportedly one-way algorithm that changes the unique combination into something we can call an authorization string. This is more important than it sounds. It means that this authorization string is the only thing that passes through the network, not the password or username. Common sense tells us that the less exposure something as important as a username/password combination gets, the fewer chances the bad guys have to steal or otherwise commandeer it. Anyway, the authorization string works like this. The string is compared against the security accounts manager (SAM) file that is stored in the
Winnt\System32\Config directory of one of the domain (backup or primary) controllers on the network. If the authorization string is found, then the user is authenticated. If the authorization string is not found, then the user is denied all access. What this means to us is that we don't have to reinvent this wheel when we are designing applications for our enterprise. Instead, we can build upon the strong foundation Windows NT offers us.
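The general shape of this scheme - run the credentials through a one-way function and compare only the digests, so the password itself never crosses the wire - can be sketched as follows. Be clear that this is an analogy only: it uses SHA-256 for illustration and is NOT the actual NT LAN Manager algorithm, and the names are invented.

```python
# Sketch of one-way credential checking, in the spirit of the scheme
# described above. Illustrative only; not the real NT algorithm.
import hashlib

def authorization_string(username, password):
    """One-way digest of the username/password combination."""
    return hashlib.sha256((username + ":" + password).encode()).hexdigest()

# A stand-in for the SAM file: only digests are stored, never passwords.
sam = {authorization_string("jmoniz", "s3cret!")}

def logon(username, password):
    """Authenticated only if the digest is found; otherwise denied."""
    return authorization_string(username, password) in sam

print(logon("jmoniz", "s3cret!"))  # True  - authenticated
print(logon("jmoniz", "guess"))    # False - denied all access
```

Because the function is one-way, even someone who steals the stored digest cannot recover the password from it, which is the whole point of the design.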
Remember the Windows NT security model. Once a user has successfully logged onto a Windows NT system, the system believes that the user is whoever the user said he or she was. From that point on, the system vouches for the user acknowledging that the user has been verified. It says in essence, "this user's credentials are in order." It does that by assigning every user a unique token, which represents the user from that point on. Every action that the token initiates is funneled through the security reference monitor.
This system will only allow the user access to those objects that the user has privileges for. And, as we learned earlier, it can also be used to grant only specific types of access to specific users. If the object we need to provide security for is important enough, we can also track every access or attempted access of that object. In short, Windows NT allows us to control the access of every user right down to the way that user can interact with any particular object.
When we design an application to run on an enterprise, we should be able to count on Windows NT security. For the most part, we can. The only thing we really can't count on is that the original logon was honest. In other words, the Windows NT security system is powerful, but it is not omniscient. All it can do is to compare the username/password combination to see if it matches the SAM file. There is no way for the system to know whether or not the user at the console is who the user says he is. This means that the biggest improvement we can make to the overall security of a system is to ensure that each user makes it difficult to decipher his or her password. Microsoft offers a free application that can enable the system administrator to enforce a very complex password policy. If an enterprise is serious about security, after controlling physical access to the machines, this is the next step to take.
Once we have done our part to limit physical access to machines and our part to train, or force if necessary, our users to use strong password protection, it is almost like we have hired Microsoft to construct the foundation for our applications' security model. We can off-load a big portion of our security issues right onto Windows NT. We can safely begin to design an application security model that begins with the premise that Windows NT has authenticated the user. If we accept Windows NT's acknowledgement that the user's credentials are in order, we don't need to worry about keeping track of application-level passwords. This by itself plugs up a huge security hole. I cannot tell you how many assignments I have been on where I found tables with user logon IDs and unencrypted passwords right out in the open. Anyway, we can use NT Security to filter out the non-authenticated users. In Chapters 9 through 16, we will cover a myriad of ways that we can design our own objects to build on Windows NT's verification of credentials. We will examine ways that we can extend the security model right down to the individual properties and methods our objects are constructed from.
As I have already said, we can instruct Windows NT to log every activity that an individual attempts while logged onto the system, or even while that individual is attempting to log on to the system. This is powerful stuff, but this book is really not about NT administration, so I won't bore you with ways that we can use NT's audit trails. This book is about programming. Therefore, I will tell you that when we learn to build enterprise caliber objects, we extend NT's audit trail standard right into each object we build. In other words, every time a user changes or attempts to change the data in our system, we will create an audit trail that shows who did it, when they did it, where they did it, and what they did. As an added measure of protection, we also include the ability to undo anything that anyone has done.
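The object-level audit trail just described - who did it, when, and what, with the ability to undo any change - can be sketched like this. The class and field names are invented for illustration; the real Enterprise Caliber objects are built in later chapters.

```python
# Sketch: a value that records every change and can undo any of them.
# Purely illustrative; names are hypothetical.
import datetime

class AuditedValue:
    def __init__(self, value):
        self.value = value
        self.history = []  # (user, timestamp, old_value, new_value)

    def change(self, user, new_value):
        """Apply a change, recording who made it, when, and what."""
        self.history.append(
            (user, datetime.datetime.now(), self.value, new_value))
        self.value = new_value

    def undo(self):
        """Roll back the most recent change."""
        _user, _when, old_value, _new = self.history.pop()
        self.value = old_value

price = AuditedValue(19.99)
price.change("stan", 0.01)  # the suspicious change is recorded...
price.undo()                # ...and can be rolled back
print(price.value)          # 19.99
```

Notice that the audit trail does double duty: it is both the record of who touched the data and the raw material for undoing what they did.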
Windows NT offers a robust security model that we will utilize and emulate in every way possible. As you will see, this technique requires us to design security into every object in our system. Rather than attempting to plug up or cover up security holes, we will strive to eliminate them during the design phase. This will result in a security system that gets ever more difficult to breach as the intruder gets closer to the center – or data – of our system. In other words, while the occasional dedicated cracker might penetrate some of our outer defenses, that interloper should be faced with increasingly more difficult challenges as he or she attempts to further breach the system. Even though our security net is designed to get tighter as we get further into the system, we still won't make it easy for the bad guys to get through the outer layer of armor. We will control external access.
Almost everyone knows that Windows NT has earned a C2 security rating, but I don't think that most people know that it earned its C2 rating configured as a stand-alone computer without a network connection or a floppy drive. I was surprised too! I bring that up, because what it means is that as soon as we connect any two computers together, we have placed ourselves at a higher level of jeopardy. Moreover, just so we are clear as to the extent of the problem, I want to point out that what we are doing as an industry, is working feverishly to ensure that every computer in the world is connected to every other computer in the world. Now that you have a sense of the overall scope of the problem, we can begin to tackle the external side of the access equation. To most people these days, when we talk about external threats, their minds immediately shift to questions concerning the Internet and its cousins the Intranet and the Extranet.
An Intranet is really just those portions of our LAN or WAN that are connected using the TCP/IP protocol. So, the external access risk we face here is really handled by standard NT Security in exactly the same manner we talked about earlier under the heading Internal Access.
There is one area of the WAN that we should pay special attention to, though. That is the Remote Access Services (RAS). This service does use the standard Windows NT security model, so it treats remote users as though they were sitting at any other desk in the office. However, it may not really be safe to do this, because the users that access the system in this way do not go through any of the ordinary physical security measures most places of business employ. In other words, as long as the users are on site, we can employ physical mechanisms to keep out unauthorized individuals. We lock buildings. We place guards at the door. We just sense that if the guy sitting at John's desk is not John, something may be awry. It is not possible to do this type of human-reasoned security checking with RAS users. Other than that, we treat the Intranet proper as just an extension of the network.
All over the world, organizations are rushing to get at least part of their resources out onto the Internet. It is great for business. It offers an excellent way for companies to deliver cost-efficient 24 hour a day customer service. I can track my packages on the FedEx web site without either one of us paying a dime in long distance charges. I regularly purchase software over the Internet, and I collaborated with my editors at Wrox over the Internet. If you want the most up-to-date copy of the sample code in this book, you can get it by logging onto Wrox's web site. Make no mistake about it. That is the direction, and I think it's great!
But it means that organizations need to consider the external access security issues on this front. I wouldn't even want to attempt to list the external access hazards; it would take too long, and I could only be certain that I missed at least some of them. There are of course other political issues concerning what data should be open to the general public and so on. That is a job for a bureaucrat so I won't go there. As for the protection of our precious data that the politicians expose to the general public, we will cover the steps that we can take to ensure data safety when we learn to construct Enterprise Caliber Data Objects. What we will do here is to look at the best investment any organization can make when considering placing any of their resources on the Internet – the firewall.
The firewall works very much like the name implies. It places an impenetrable barrier (a wall of fire) around an organization's internal network assets. But it really is more like a very secure passageway than an impassable barrier. It picks and chooses exactly which data transmissions it will allow into and out of the network it is charged to protect. And, rather than being a single device, it is actually a series of devices and processes that work together to ensure that the firewall filters out the packets (a packet is a small unit of information) it is not authorized to transfer across the boundary that the firewall delineates. This means that in order to understand the concept of a firewall, we really need to understand the devices, processes, and policies that make up the firewall.
The simplest types of firewalls are really nothing more than a packet-filtering mechanism – this can be accomplished with just about any router capable of screening packets. It works like this: all of the external network traffic is directed to a single point of penetration that exposes the protected network. On a hardware level, this point of penetration is a device called the screening router. As the name implies, the screening router is used to filter out which IP addresses are allowed to pass through the point of penetration:
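The screening router's job reduces to a very simple rule, which can be sketched in a few lines of Python. The addresses here are invented documentation addresses, and a real router filters on far more than source IP, but the essential idea is just this:

```python
# Toy sketch of a screening router's packet filter: a single point of
# penetration that passes or drops packets by source IP address.
# Addresses are illustrative only.

ALLOWED_SOURCES = {"192.0.2.10", "192.0.2.11"}

def screen(packet):
    """Return True if the packet may cross into the protected network."""
    return packet["source_ip"] in ALLOWED_SOURCES

print(screen({"source_ip": "192.0.2.10", "payload": "GET /"}))   # True
print(screen({"source_ip": "203.0.113.9", "payload": "GET /"}))  # False
```

The weakness is obvious from the sketch: someone has to maintain that allowed-address list by hand, which is exactly the management problem the Bastion Host takes over.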
Of course, I wouldn't want the job of updating the list of IP addresses that are allowed to pass through this point of penetration. A true enterprise-level solution requires something a little more manageable. For most enterprises, the real job of managing this list of acceptable addresses falls to the next device in our firewall, the Bastion Host.
A Bastion Host is really just a secure server that is placed at the protected network's point of penetration, often in front of the screening router. In most cases the screening router is still used, but in these cases it serves as a gateway to the single Bastion Host.
Of course, since the Bastion Host is really a server it is easier to program it, or configure it, for the job of managing the list of IP addresses than can be allowed to access the protected network. Indeed, the Bastion Host can use any set of rules that we select in order to do its acceptance and rejection of users.
Our firewall is no longer limited to simple IP address filtering. We can handle that portion of the access problem with looser controls on the screening router and apply whatever level of control we choose at the Bastion Host. Allow me to suggest that the Windows NT security model offers an excellent model for the control of authorized users and their access to resources (objects) deployed across the enterprise. Yeah, that means that we can use a Windows NT domain controller as a Bastion Host, and that we can use exactly the same Windows NT security that we employ to protect our enterprise to protect the outside world's access via the Internet. We can control every user's access to every object in the system. Of course, when we are talking about the Internet, we have to realize that some of the users will not be members of our domain. We can use Windows NT to grant these users access to some portion of our resources, but we need to realize that this places some of our resources at risk. This realization is responsible for one of the finest examples of external security currently available - the Belt and Suspenders firewall.
Belt & Suspenders Firewall
Perhaps the most accepted firewall system is the Belt and Suspenders firewall. If you look at the image below, you should notice why it got its name. In case it isn't obvious, the diagonally striped section makes a convincing belt, and the two lines running from the screening routers could be mistaken for suspenders. That's how I remember it, anyway. I think the name actually originated as a tribute to its high level of security. If you look carefully, you will notice that it is really more like two firewalls than one, as it has a dual screening router design. This added protection makes it as safe as wearing both a belt and suspenders.
It works very much like the simpler Bastion Host/screening router solution. In fact, the first firewall is actually those two components connected to the Internet. The second firewall is given by the combination of the Bastion Host and the second screening router that connects to the protected network. The striped area that I referred to as a belt is really called the demilitarized zone (DMZ). It is almost as though, in this section of our network between the two screening routers, we have called a truce with the external threats. We took our best shot at protecting some of our assets with the first screening router. The machines that live in the demilitarized zone are almost like sacrificial lambs. We place them in a somewhat less protected area than the rest of our network assets. It is the price we pay for access to the Internet. Notice that one of the servers in the demilitarized zone is the RAS server. Many organizations do this. I think that they have an intuitive sense of the slightly more risky nature of the RAS portion of the WAN. Personally, I don't understand this. It seems to me that if someone can log on by impersonating another user, Windows NT will allow that impersonator to fool all portions of the system, no matter whether the RAS server is in the DMZ or not.
Extranet – Virtual Private Network
The term Extranet seems to come and go. I almost think people are beginning to get so used to the idea of the Internet – making purchases with credit cards, etc. – that they forget how open and dangerous the Internet really can be. An Extranet (also known as a Virtual Private Network, or VPN) is an extension, or perhaps I should say implant, that strives to make the Internet a safe place between two specific points. It uses a special protocol called the Point-to-Point Tunneling Protocol (PPTP) to carve out a private channel between two specific nodes in the Internet:
If you look at the image shown here, the outermost communication casing (the one that is cut away) represents the standard connection between the two Internet nodes. Within that casing, you should notice another section of "wire" that appears to be flexible, maybe like an accordion. This offers something of a visual representation of what the Point-to-Point Tunneling Protocol does. It is almost as though PPTP pushes the accordion-like structure through the main connection – it drills a tunnel, or private channel. Once this private channel has been opened, the two nodes can communicate freely within it, safe from prying eyes.
The way PPTP actually works is really a pretty good example of good object-oriented programming. It can take a different type of network protocol, NetBEUI for instance, and encapsulate it in a standard IP packet. I think of it like an M&M candy, where the IP packet is the candy coating that stops the chocolate inside from melting on your hands. Anyway, the interface just below the IP packet contains enough information to allow Windows NT to make some security decisions concerning the safety of the underlying contents. If it passes all of the NT security barriers, then the packet is opened and the information is allowed to enter the private network, letting the users perform tasks that they might not have been able to perform without the encapsulated protocol.
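The encapsulation idea can be sketched in a few lines. This is a loose conceptual model in Python, not the actual PPTP/GRE wire format; the field names are illustrative only. A non-routable payload (say, a NetBEUI frame) is wrapped inside an outer IP packet so it can cross the Internet, and the receiving end can inspect the outer packet and apply security checks before releasing the inner frame:

```python
# Conceptual sketch of tunneling: wrap a non-IP payload in an outer
# IP packet, then unwrap it only after a security check at the far
# end. Field names and addresses are invented for illustration.

def encapsulate(netbeui_frame, src_ip, dst_ip):
    # The outer packet is the "candy coating": routable IP on the
    # outside, the original frame carried untouched on the inside.
    return {"src": src_ip, "dst": dst_ip, "payload": netbeui_frame}

def decapsulate(ip_packet, trusted_sources):
    # Inspect the outer packet before releasing the inner frame,
    # much as NT checks the tunnel before admitting its contents.
    if ip_packet["src"] not in trusted_sources:
        raise PermissionError("packet rejected by security check")
    return ip_packet["payload"]

frame = b"NETBEUI-SESSION-DATA"
packet = encapsulate(frame, "203.0.113.9", "198.51.100.2")
print(decapsulate(packet, {"203.0.113.9"}) == frame)  # True
```

The design point mirrors good encapsulation in code: the Internet only ever sees the standard IP "interface", while the private implementation detail (the NetBEUI frame) stays hidden until it is safely inside the private network.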
We can use PPTP to create virtual private networks, but at this time it still requires some special equipment. The client must be equipped with a Front End Processor. The Front End Processor and the PPTP-capable server transmit encrypted data over the PPTP link. The combination of the encrypted data and NT security allows any authorized user a private, secure, personal link to the server using the Internet as the trunk line. This means the organization can save money on long distance telephone calls and/or the expense of dedicated WAN lines.
In this chapter, we began to look at a number of different techniques you can use to ensure that whenever any of your customers needs an enterprise resource, that resource will be available. In general, we found that this type of high-availability system owes its robustness to some very basic design decisions we make concerning fault tolerance. Then we began a long and deliberate dissection of the processes that make up applications to learn how we could take advantage of those separations to design a system that was truly scalable. Finally, we took a look at some basic security concepts that we could employ to safeguard our users' interests in the enterprise. Of course, this chapter only represents the beginning of our design process.
In the next chapter, we will take some of the information that we considered here and see if we can mirror what we found about processes in an actual physical system, as we work through the design of an enterprise caliber server farm.