
IPSec Architecture

By Naganand Doraswamy , Dan Harkins

Chapter 4 of IPSec – The New Security Standard for the Internet, Intranets and Virtual Private Networks (Prentice Hall, PTR)



This chapter discusses the IPSec architecture in detail. This includes various components of IPSec, how they interact with each other, the protocols in the IPSec family, and the modes in which they operate.

The IPSec working group at the IETF has defined 12 RFCs (Requests for Comments). The RFCs define various aspects of IPSec: the architecture, key management, the base protocols, and the mandatory transforms to implement for the base protocols. This chapter concentrates mostly on the architecture. The base protocols and the key management protocols are discussed in greater detail in other chapters.

The IPSec Roadmap

The IPSec protocols include AH, ESP, IKE, ISAKMP/Oakley, and the transforms. In order to understand, implement, and use IPSec, it is necessary to understand the relationship among these components. The IPSec roadmap defines how various components of IPSec interact with each other. This is shown in Figure 4.1.

Figure 4.1: IPSec Roadmap (this figure has been reproduced from the draft with permission of the authors)


IPSec is a suite of protocols and it is important to understand how these protocols interact with each other and how these protocols are tied together to implement the capabilities described by the IPSec architecture.

The IPSec architecture, as described in the previous chapter, defines the capabilities the hosts and gateways should provide. For example, the IPSec architecture requires the host to provide confidentiality using ESP, data integrity using either AH or ESP, and antireplay protection. However, the architecture document does not specify the header formats for these protocols. The architecture discusses the semantics of the IPSec protocols and the issues involved in the interaction among the IPSec protocols and the rest of the TCP/IP protocol suite.

The ESP and the AH documents define the protocol, the payload header format, and the services they provide. In addition, these documents define the packet processing rules. However, they do not specify the transforms that are used to provide these capabilities. This separation allows new transforms to be defined when the algorithms used by older transforms are proved to be cryptographically insecure, without mandating any change to the base protocols.

The transforms define the transformation applied to the data to secure it. This includes the algorithm, the key sizes and how they are derived, the transformation process, and any algorithm-specific information. It is important to be specific about the necessary information so that different implementations can interoperate. Let us consider the DES-CBC transform that is defined for ESP. If we do not specify how the Initialization Vector is derived, two implementations may end up deriving the Initialization Vector in different ways, and they will never be able to interoperate.

IKE generates keys for the IPSec protocols. IKE is also used to negotiate keys for other protocols that need keys. There are other protocols in the Internet that require security services such as data integrity to protect their data. One such example is the OSPF (Open Shortest Path First) routing protocol. The payload format of IKE is very generic. It can be used to negotiate keys for any protocol and is not limited to IPSec key negotiation. This segregation is achieved by separating the parameters IKE negotiates from the protocol itself. The parameters that are negotiated are documented in a separate document called the IPSec Domain of Interpretation.

An important component that is not yet a standard is "policy." Policy is a very important issue because it determines if two entities will be able to communicate with each other and, if so, what transforms to use. It is possible, with improperly defined policies, for two sides to be unable to communicate with each other.

The issues with policy are representation and implementation. Representation deals with definition of policy, storage, and retrieval. The IETF is currently working on defining the policy standards. The implementation addresses the application of policy for actual communication. It is important that the keys established by the key management protocol are applied appropriately in the communication. It is equally important to apply the appropriate filters and rules. This chapter discusses the implementation issues of policy and how these rules are applied to the IPSec traffic. The policy representation is discussed later.

IPSec Implementation

IPSec can be implemented and deployed in the end hosts or in the gateways/routers or in both. Where in the network IPSec is deployed depends on the security requirements of the users.

This section discusses the capabilities and implications of implementing IPSec in various network devices (hosts and routers). There are merits in implementing IPSec in both routers and end hosts as they address different problems. The host implementation is most useful when security is desired end to end. However, in cases when security is desired over a part of a network, router implementation is desirable. This includes VPNs and intranets.

Host Implementation

The proper definition of a host in this context is the device where the packet is originating. The host implementation has the following advantages:

  • Provides security end to end

  • Ability to implement all modes of IPSec security

  • Provides security on a per flow basis

  • Ability to maintain user context for authentication in establishing IPSec connections

Host implementations can be classified into:

  1. Implementation integrated with the operating system (OS). We call it host implementation (for lack of a better term!).

  2. Implementation that is a shim between the network and the data link layer of the protocol stack. This is called the "Bump in the Stack" implementation.

OS Integrated

In the host implementation, IPSec may be integrated with the OS. As IPSec is a network layer protocol, it may be implemented as part of the network layer as shown in Figure 4.2. IPSec layer needs the services of the IP layer to construct the IP header. This model is identical to the implementation of other network layer protocols such as ICMP.

Figure 4.2: IPSec stack layering


There are numerous advantages of integrating the IPSec with the OS. A few key advantages are listed below.

  • As IPSec is tightly integrated into the network layer, it can make use of network services such as fragmentation, PMTU, and user context (sockets). This enables the implementation to be very efficient.

  • It is easier to provide security services per flow (such as a Web transaction) as the key management, the base IPSec protocols, and the network layer can be integrated seamlessly.

  • All IPSec modes are supported.

Bump in the Stack

For companies providing solutions for VPNs and intranets, the OS-integrated solution has one serious drawback. On the end hosts, they have to work with the features provided by the OS vendors. This may limit their ability to provide advanced solutions. To overcome this limitation, IPSec is implemented as a shim and inserted between the network and the data link layer as shown in Figure 4.3. This is commonly referred to as a Bump in the Stack (BITS) implementation.

Figure 4.3: BITS IPSec stack layering


As you may notice, the major issue in this implementation is duplication of effort. It requires implementing most of the features of the network layer, such as fragmentation and route tables. Duplicating functionality leads to undesired complications. It becomes more difficult to handle issues such as fragmentation, PMTU, and routing.

An advantage of the BITS implementation is that it allows a vendor to provide a complete solution. Vendors providing integrated solutions, such as firewalls, prefer to have their own client, as the OS vendor may not have all the features required to provide a complete solution.

Router Implementation

The router implementation provides the ability to secure a packet over a part of a network. For example, an organization may be paranoid about the Internet and not its own private network. In this case, it may want to secure only those packets destined to the geographically distributed branch as these packets traverse the Internet to build its VPN or intranet. The IPSec implementation provides security by tunneling the packets.

The router implementation has the following advantages:

  • Ability to secure packets flowing between two networks over a public network such as the Internet.

  • Ability to authenticate and authorize users entering the private network. This is the capability that many organizations use to allow their employees to telecommute over the Internet. Previously, this was possible only over dial-ups (dialing through a modem directly into the organization).

There are two types of router implementation:

  • Native implementation: This is analogous to the OS integrated implementation on the hosts. In this case, IPSec is integrated with the router software.

  • Bump in the Wire (BITW): This is analogous to BITS implementation. In this case, IPSec is implemented in a device that is attached to the physical interface of the router. This device normally does not run any routing algorithm but is used only to secure packets. BITW is not a long-term solution as it is not viable to have a device attached to every interface of the router.

The network architectures for these implementations are shown in Figure 4.4.

Figure 4.4A: Native implementation deployment architecture

Figure 4.4B: BITW deployment architecture

The IPSec implementation on routers has many implications on the packet-forwarding capabilities of the router. The routers are expected to forward packets as fast as possible. In fact, we are already seeing core routers that can forward up to 30 million packets per second! Although IPSec may not be used in the core of the Internet, the implementations should still be concerned about efficiency. The packets that do not require security should not be affected because of IPSec. They should still be forwarded at normal rates. Many implementations make use of some hardware assists to perform public key operations, random number generation, encryption/decryption, and calculating hashes. There are specialized chipsets that assist the basic router hardware with security operations.

Another issue with router implementation is IPSec contexts. Memory on the routers is still a scarce commodity, although this is changing fast with memory prices falling rapidly. As the router has to store huge routing tables and normally does not have huge disks for virtual memory support, maintaining too many IPSec contexts is an issue.

IPSec Modes

We have talked about IPSec in transport mode and tunnel mode without explaining when and how IPSec protocols are used in these modes. In this section, we describe how the IPSec protocols, AH and ESP, implement the tunnel and transport modes. There are four possible combinations of modes and protocol: AH in transport mode, AH in tunnel mode, ESP in transport mode, and ESP in tunnel mode. In practice, AH in tunnel mode is not used because it protects the same data that AH in transport mode protects.

The AH and ESP headers do not change between tunnel and transport mode. The difference is more semantic in nature: what they are protecting, an IP packet or an IP payload. The guidelines for deciding which mode to use, and some examples of using IPSec in various modes, are discussed in later chapters.

Transport Mode

In transport mode, AH and ESP protect the transport header. In this mode, AH and ESP intercept the packets flowing from the transport layer into the network layer and provide the configured security.

Let us consider an example. In Figure 4.5, A and B are two hosts that have been configured so that all transport layer packets flowing between them should be encrypted. In this case, transport mode of ESP is used. If the requirement is just to authenticate transport layer packets, then transport mode of AH is used.

Figure 4.5: Hosts with Transport ESP


When security is not enabled, transport layer packets such as TCP and UDP flow into the network layer, IP, which adds the IP header and calls into the data link layer. When security in the transport layer is enabled, the transport layer packets flow into the IPSec component. The IPSec component is implemented as part of the network layer (when integrated with the OS). The IPSec component adds the AH header, the ESP header, or both, and invokes the part of the network layer that adds the network layer header.

The transport mode of IPSec can be used only when security is desired end to end. As stated earlier, the routers look mostly at the network layer in making routing decisions and the routers do not and should not change anything beyond the network layer header. Inserting transport mode IPSec header for packets flowing through a router is a violation of this rule.

When both AH and ESP are used in transport mode, ESP should be applied first. The reason is obvious. If the transport packet is first protected using AH and then using ESP, the data integrity is applicable only for the transport payload as the ESP header is added later on as shown in Figure 4.6. This is not desirable because the data integrity should be calculated over as much data as possible.

Figure 4.6: Packet format with ESP and AH


If the packet is protected using AH after it is protected using ESP, then the data integrity applies to the ESP payload that contains the transport payload as shown in Figure 4.7.


Figure 4.7: Packet format with AH and ESP
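
The ordering argument above can be sketched with a toy model. This is only an illustration, not a real packet format: the layer names are labels, the IP header is left out, and the `ah_covers` bookkeeping field is a hypothetical device for recording what existed at the moment AH was applied.

```python
def apply_esp(packet):
    """Wrap the current layers in ESP: header in front, trailer behind."""
    return {
        "layers": ["ESP header"] + packet["layers"] + ["ESP trailer"],
        "ah_covers": packet["ah_covers"],  # ESP does not change what AH covered
    }


def apply_ah(packet):
    """Prepend AH; its integrity check covers whatever exists at this moment."""
    return {
        "layers": ["AH header"] + packet["layers"],
        "ah_covers": list(packet["layers"]),
    }


transport = {"layers": ["TCP header", "data"], "ah_covers": []}

ah_then_esp = apply_esp(apply_ah(transport))  # AH first, ESP second
esp_then_ah = apply_ah(apply_esp(transport))  # ESP first, AH second

print("AH first :", ah_then_esp["ah_covers"])
print("ESP first:", esp_then_ah["ah_covers"])
```

When AH is applied first, the ESP header and trailer are added afterwards and fall outside the integrity check. When ESP is applied first, AH covers them as well.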

The transport mode for BITS implementation is not as clean, as the ESP and AH headers are inserted after the IP payload is constructed. This implies the BITS implementation has to duplicate the IP functionality because it has to recalculate the IP checksum and fragment the packet if necessary. Many BITS implementations may not support transport mode but support only tunnel mode.

Tunnel Mode

IPSec in tunnel mode is normally used when the ultimate destination of the packet is different from the security termination point as shown in Figure 4.8 or in case of BITS or BITW implementations. The tunnel mode is used in cases when security is provided by a device that did not originate packets - as in the case of VPNs - or when the packet needs to be secured to a destination that is different from the actual destination.

Figure 4.8: IPSec in tunnel mode


It is also used when a router provides security services for packets it is forwarding. The operation of tunnel mode is discussed in detail in the IPSec implementation chapter.

In the case of tunnel mode, IPSec encapsulates an IP packet with IPSec headers and adds an outer IP Header as shown in Figure 4.9.

Figure 4.9: IPSec tunneled Mode packet format


An IPSec tunneled mode packet has two IP headers: inner and outer. The inner header is constructed by the host and the outer header is added by the device that is providing the security services. This can be either the host or a router. There is nothing that precludes a host from providing tunneled mode security services end to end. However, in this case there is no advantage to using tunnel mode instead of transport mode. In fact, if the security services are provided end to end, transport mode is better because it does not add an extra IP header.
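
To make the difference between the two modes concrete, here is a minimal sketch that treats a packet as a list of labels. The function names and the string form of the headers are assumptions made for illustration; no real header fields are built.

```python
def transport_mode(ip_header, payload, ipsec="ESP"):
    """Insert the IPSec header between the original IP header and its payload."""
    return [ip_header, ipsec, payload]


def tunnel_mode(ip_header, payload, tunnel_src, tunnel_dst, ipsec="ESP"):
    """Keep the original packet whole and prepend IPSec plus an outer IP header."""
    outer = f"IP {tunnel_src}->{tunnel_dst}"
    return [outer, ipsec, ip_header, payload]


inner = "IP A->B"
in_transport = transport_mode(inner, "TCP+data")
in_tunnel = tunnel_mode(inner, "TCP+data", "RA", "RB")

print(in_transport)
print(in_tunnel)
```

The tunneled packet carries one more header than the transport packet, which is the overhead referred to above.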

IPSec defines tunnel mode for both AH and ESP. IPSec also supports nested tunnels. The nested tunneling is where we tunnel a tunneled packet as shown in Figure 4.10A.

Figure 4.10A: Nested tunnel example

Figure 4.10B: Nested packet format

In this example, host A is sending a packet to host B. The policy says it has to authenticate itself to router RB. In addition, there is a VPN between the two networks bordered by RA and RB. The packet seen by router RB is shown in Figure 4.10B. The outermost header is the tunneled ESP packet. It is carrying a tunneled AH packet. The tunneled AH packet is carrying the IP packet destined for the host B generated by host A.

Figure 4.11A: Valid tunnel

Figure 4.11B: Invalid tunnel

The requirement for nested tunnels is that the inner tunnel is completely encompassed by the outer tunnel. Examples of valid and invalid tunnels are shown in Figures 4.11A and B.

The example shown in Figure 4.11A is valid as the inner tunnel (tunnel 2) is completely encompassed by tunnel 1. The example shown in Figure 4.11B is invalid because neither tunnel completely encompasses the other. To understand why this is invalid, let us trace the packet flow. After RA constructs the tunneled packet, the packet format is as shown in Figure 4.12A.

Figure 4.12A: Tunneled packet

When the packet reaches RB, it tunnels the packet to host C. The packet format when the packet leaves RB is shown in Figure 4.12B.

Figure 4.12B: Invalid tunnel packet

Clearly, this is incorrect because the packet now reaches host C before it reaches RC. When the packet reaches host C, it processes the AH header. When the second IP header is exposed, the host drops the packet because the destination is RC and not itself. Nested tunnels are difficult to build and maintain and should be used sparingly.
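
The encompassing rule can be checked mechanically. The sketch below treats each tunnel as a span along the forwarding path and rejects any pair of tunnels that overlap without one containing the other. The function name is hypothetical, and the node order used for the invalid case (A, RA, RB, RC, C) is an assumption about the topology, since the figure is not reproduced here.

```python
from itertools import combinations


def tunnels_properly_nested(path, tunnels):
    """Return True if every pair of tunnels is either disjoint or nested.

    path    -- nodes in the order the packet traverses them
    tunnels -- (start_node, end_node) pairs
    """
    position = {node: i for i, node in enumerate(path)}
    spans = [(position[start], position[end]) for start, end in tunnels]
    for (s1, e1), (s2, e2) in combinations(spans, 2):
        disjoint = e1 <= s2 or e2 <= s1
        nested = (s1 <= s2 and e2 <= e1) or (s2 <= s1 and e1 <= e2)
        if not (disjoint or nested):
            return False
    return True


# Nested example from the text: A authenticates to RB, and RA-RB form a VPN.
valid = tunnels_properly_nested(
    ["A", "RA", "RB", "B"], [("A", "RB"), ("RA", "RB")]
)

# Overlapping example: RA tunnels to RC while RB tunnels to host C
# (assumed path order; neither tunnel contains the other).
invalid = tunnels_properly_nested(
    ["A", "RA", "RB", "RC", "C"], [("RA", "RC"), ("RB", "C")]
)

print(valid, invalid)
```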

Security Associations

The Security Associations, or SAs as they are normally referred to in IPSec terminology, form the basis for IPSec. The SAs are the contract between two communicating entities. They determine the IPSec protocols used for securing the packets, the transforms, the keys, and the duration for which the keys are valid, to name a few. Every IPSec implementation builds an SA database (SADB) that maintains the SAs that the IPSec protocols use to secure packets.

The SAs are one way, i.e., simplex. If two hosts, A and B, are communicating securely using ESP, then the host A will have an SA, SAout, for processing outbound packets and will have a different SA, SAin, for processing the inbound packets. The host B will also create two SAs for processing its packets. The SAout of the host A and the SAin of the host B will share the same cryptographic parameters such as keys. Similarly, SAin of the host A and the SAout of the host B will share the same cryptographic parameters. As SAs are unidirectional, a separate table is maintained for SAs used for outbound and inbound processing.

The SAs are also protocol specific. There is an SA for each protocol. If two hosts A and B are communicating securely using both AH and ESP, then each host builds a separate SA for each protocol.

There is another component in the IPSec architecture called the security policy database (SPD). The SPD works in conjunction with the SADB in processing packets. The policy is an extremely important component of the IPSec architecture. The policy defines the security communications characteristics between two entities. It defines what protocols to use in what modes and the transforms to use. It also defines how the IP packets are treated. This is discussed in detail in later sections.

Security Parameter Index (SPI)

The SPI is a very important element in the SA. An SPI is a 32-bit entity that is used to uniquely identify an SA at the receiver. It was mentioned before that the security context or SA is a contract between two hosts communicating securely and indicates the parameters, such as keys and algorithms. However, there has to be some mechanism for the source to identify which SA to use to secure the packet and for the destination to identify which SA to use to check the security of the received packet. The source identifies the SA by using the selectors. However, the destination does not have access to all the fields in the selectors as some of the fields in the selectors belong to the transport layer.

To solve the problem of identifying the SA on the destination, the SPI that uniquely identifies the SA on the destination is sent with every packet. The destination uses this value to index into the receiving SADB and fetch the SA. The obvious questions are who guarantees the uniqueness of the mapping between the SPI and SA, and what is the domain of uniqueness on the destination for each protocol: global, per source, or per address on the host. It is up to the receiver/destination to guarantee this uniqueness. It is a requirement to maintain a separate SPI domain for each protocol. The destination can use any consistent mechanism to guarantee uniqueness inside each domain. The IPSec architecture specifies that the <spi, destination address> in the packet should uniquely identify an SA.

The receiver allocates the SPI that is stored as part of the SA on the sender. The sender includes this in every packet under the assumption that the receiver can use this to uniquely identify the SA. If the receiver does not guarantee uniqueness, packets will fail security checks as invalid keys and transforms may be used.

The sending host uses the selectors to uniquely index into the sending SADB. The output of this lookup is an SA that has all the security parameters negotiated, including the SPI. The host that allocates the SPI guarantees uniqueness. The SPI may be reused once the SA expires, but at any point in time the mapping between <spi, dst> and SA is guaranteed to be one to one. The src address is used in cases where the host is multihomed, that is, a host with more than one IP interface. This can be because there is more than one network card on the host or because multiple IP interfaces are configured on the same network card (the host has multiple IP addresses). In this case, it is possible that the index <spi, dst> is not unique and src is used to resolve the ambiguity.

The SPI is passed as part of AH and ESP headers. The receiving host uses the tuple <spi, dst, protocol> (where dst is the destination address in the IP header) to uniquely identify the SA. It is possible to use the source address in addition to <spi, dst, protocol> to uniquely identify an SA to conserve the SPI space. However, this is not part of the standards and is something specific to an implementation.
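
A minimal sketch of the receiver side may help here. It assumes an in-memory table keyed by <spi, dst, protocol>, with one SPI counter per protocol to model the separate SPI domains. The class and method names are hypothetical, the SA contents are placeholders, and SPI reuse after expiry is not modeled. Allocation starts at 256 because the AH and ESP specifications reserve the lower values.

```python
import itertools


class InboundSADB:
    """Receiver-side SA table keyed by <spi, dst, protocol> (illustrative only)."""

    def __init__(self):
        self._sas = {}
        # One counter per protocol: a separate SPI domain for AH and for ESP.
        self._next_spi = {
            "AH": itertools.count(256),
            "ESP": itertools.count(256),
        }

    def allocate(self, dst, protocol, sa):
        """The receiver picks the SPI, so it can guarantee uniqueness."""
        spi = next(self._next_spi[protocol])
        self._sas[(spi, dst, protocol)] = sa
        return spi

    def lookup(self, spi, dst, protocol):
        """Fetch the SA for an arriving packet, or None if there is no match."""
        return self._sas.get((spi, dst, protocol))


sadb = InboundSADB()
esp_spi = sadb.allocate("10.0.0.2", "ESP", {"key": "placeholder-esp-key"})
ah_spi = sadb.allocate("10.0.0.2", "AH", {"key": "placeholder-ah-key"})

# The same numeric SPI can appear in both domains without ambiguity,
# because the protocol is part of the lookup key.
print(esp_spi, ah_spi)
print(sadb.lookup(esp_spi, "10.0.0.2", "ESP"))
```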

SA Management

The two most important tasks of the SA management are creation and deletion. The management of SAs can be either manual or through an Internet standard key management protocol such as IKE. The SA management requires an interface for the user applications (which includes IKE) to communicate with kernel to manage the SADB. The management aspect is discussed in greater detail in the chapter on policy.


The SA creation is a two-step process: negotiating the parameters of the SA and updating the SADB with the SA.

Manual keying is mandatory to support and it was used extensively during the initial development and testing of IPSec. In manual keying, two sides agree on the parameters of the SA offline, either by phone or over e-mail (although unsecured e-mail is dangerous!). Both the allocation of the SPI and the negotiation of parameters are manual. Needless to say, this process is error prone, cumbersome, and insecure. The other limiting factor is that these SAs never expire. Once they are created they stay good until they are deleted. Manual keying is a useful feature for debugging the base IPSec protocols when the key management protocols are failing. However, with a stable key management protocol, the use of manual keying is questionable.

In an environment where IPSec is deployed, the SAs are created through an Internet standard key management protocol such as IKE. IKE is invoked by the IPSec kernel when the policy mandates that the connection should be secure and it cannot find the SA. IKE negotiates the SA with the destination or intermediate host/router, depending on the policy, and creates the SA. Once the SA is created and added to the SADB, secure packets start flowing between the two hosts.

In the previous sections, we discussed nested or chained implementations of IPSec. For example, a host creates a transport mode AH SA end to end, but also creates a tunneled ESP SA to the gateway/firewall. In this case, for the packet to be processed properly, the source has to create two SAs, one for the gateway and another for the end host. When the policy requires establishment of multiple SAs for two hosts to communicate securely, the collection of SAs is called an SA bundle.


The SA is deleted for various reasons:

  • The lifetime has expired.

  • The keys are compromised.

  • The number of bytes encrypted/decrypted or authenticated using this SA has exceeded a certain threshold set by the policy.

  • The other end requests that the SA be deleted.

The SAs can be deleted manually or through IKE. It is important to renew or refresh keys to reduce the chance of someone breaking the system. IPSec does not provide the ability to refresh keys. Instead, we have to delete the existing SA and negotiate/create a new SA. Once the SA is deleted, the SPI it was using can be reused.

To avoid the problem of stalling the communication, a new SA is negotiated before the existing SA expires. For a small duration of time, until the soon-to-expire SA is deleted, the two entities have multiple SAs that can be used for secure communication. However, it is always desirable to use the newly established SA instead of the older SA.


The SA maintains the context of a secure communication between two entities. The SA stores both protocol-specific and generic fields. This section discusses the fields that are used by both AH and ESP. The protocol-specific fields are discussed in AH and ESP chapters. These fields are used in processing each IP packet. Some of the fields are used for outbound processing, some for inbound processing, and some for both, depending on the usage of the field. Certain fields are updated when the SA is used to process a packet. Semantics associated with the parameters in an SA are discussed below.

Sequence Number

The sequence number is a 32-bit field and is used in outbound processing. The sequence number is part of both the AH and ESP headers. The sequence number is incremented by 1 every time the SA is used to secure a packet. This field is used by the destination to detect replay attacks. When the SA is established, this field is set to 0. Normally, SAs are renegotiated before this field overflows, as it is unsafe to send more than 2^32 (roughly 4 billion) packets using the same keys.

Sequence Number Overflow

This field is used in outbound processing and is set when the sequence number overflows. The policy determines if the SA can still be used to process additional packets.
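
The two outbound fields can be sketched together. The class below is a hypothetical illustration: `allow_wrap` stands in for whatever the policy says about using the SA after overflow, and raising an exception is just one way to signal that the SA must be renegotiated.

```python
SEQ_MAX = 2**32 - 1  # largest value a 32-bit sequence number can hold


class OutboundSA:
    """Tracks the sequence number and overflow flag of one outbound SA."""

    def __init__(self, allow_wrap=False):
        self.seq = 0              # set to 0 when the SA is established
        self.overflow = False     # set once the sequence number overflows
        self.allow_wrap = allow_wrap  # policy decision (hypothetical name)

    def next_seq(self):
        """Return the sequence number to place in the next secured packet."""
        if self.seq == SEQ_MAX:
            self.overflow = True
            if not self.allow_wrap:
                raise RuntimeError("sequence number overflow: renegotiate the SA")
            self.seq = 0
        self.seq += 1
        return self.seq


sa = OutboundSA()
first = sa.next_seq()
second = sa.next_seq()
print(first, second)
```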

Antireplay Window

This field is used in inbound processing. One of the concerns in networks today is the replay attack. In replay attacks, applications get bombarded with replayed packets. IPSec overcomes this by detecting packets replayed by rogue hosts. This is discussed in greater detail where the inbound processing of packets is described in the implementation chapter.
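
This chapter does not spell out the window mechanics, so the following is only a sketch of the commonly used sliding-window bitmap technique, not necessarily what the implementation chapter describes. The window size of 32 and all names are assumptions. A real receiver would update the window only after the packet has passed its integrity check.

```python
class ReplayWindow:
    """Sliding-window replay detection over packet sequence numbers."""

    def __init__(self, size=32):
        self.size = size
        self.last = 0     # highest sequence number accepted so far
        self.bitmap = 0   # bit i set means (last - i) has already been seen

    def check_and_update(self, seq):
        """Return True if seq is new and acceptable, False otherwise."""
        if seq == 0:
            return False                      # 0 is never sent on an SA
        if seq > self.last:                   # packet advances the window
            shift = seq - self.last
            if shift >= self.size:
                self.bitmap = 1
            else:
                mask = (1 << self.size) - 1
                self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.last = seq
            return True
        offset = self.last - seq
        if offset >= self.size:
            return False                      # too old, left of the window
        if self.bitmap & (1 << offset):
            return False                      # already seen: a replay
        self.bitmap |= 1 << offset            # late but new: accept and mark
        return True


window = ReplayWindow()
results = [window.check_and_update(s) for s in (1, 1, 3, 2, 2)]
print(results)
```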


Lifetime

There is a lifetime associated with each SA beyond which the SA cannot be used. The lifetime is specified in terms of the number of bytes that have been secured using this SA, the duration for which the SA has been used, or both. When the lifetime of the SA expires, it cannot be used anymore. In order to avoid the problem of breaking the communication when the SA expires, there are two kinds of lifetimes: soft and hard. The soft lifetime is used to warn the kernel that the SA is about to expire. This allows the kernel to negotiate a new SA before the hard lifetime expires.
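
The soft and hard limits can be sketched as a simple check performed whenever the SA is used. The field names, the threshold values, and the three state strings are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Lifetime:
    soft_bytes: int
    hard_bytes: int
    soft_seconds: float
    hard_seconds: float


def lifetime_state(lifetime, bytes_used, age_seconds):
    """Classify an SA as usable, due for renegotiation, or expired."""
    if bytes_used >= lifetime.hard_bytes or age_seconds >= lifetime.hard_seconds:
        return "expired"   # hard lifetime reached: the SA must not be used
    if bytes_used >= lifetime.soft_bytes or age_seconds >= lifetime.soft_seconds:
        return "rekey"     # soft lifetime reached: negotiate a new SA now
    return "ok"


limits = Lifetime(soft_bytes=900, hard_bytes=1000,
                  soft_seconds=3000, hard_seconds=3600)

print(lifetime_state(limits, bytes_used=100, age_seconds=60))
print(lifetime_state(limits, bytes_used=950, age_seconds=60))
print(lifetime_state(limits, bytes_used=100, age_seconds=3600))
```

The gap between the soft and hard limits is the time available to set up the replacement SA without stalling traffic.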


Mode

IPSec protocols can be used either in tunnel or transport mode. The payload is processed differently depending on the value of this field. This field is set to tunnel mode, transport mode, or a wild card. When this field is set to a wild card, the SA can be used for either tunnel or transport mode, and the information as to which mode applies is gleaned from someplace else, for example, the socket.

Tunnel Destination

For IPSec in tunnel mode, this indicates the tunnel destination, that is, the destination IP address of the outer header.

PMTU parameters

When IPSec is used in tunnel mode, it has to maintain the PMTU information so that it can fragment the packets accordingly. As a part of the PMTU field, the SA maintains two values: the PMTU and the aging field. This is discussed in greater detail in the implementation chapter.

Security Policy

The security policy determines the security services afforded to a packet. As mentioned earlier, all IPSec implementations store the policy in a database called the SPD. The database is indexed by selectors and contains the information on the security services offered to an IP packet.

The security policy is consulted for both inbound and outbound processing of the IP packets. On inbound or outbound packet processing, the SPD is consulted to determine the services afforded to the packet. A separate SPD can be maintained for the inbound and the outbound packets to support asymmetric policy, that is, providing different security services for inbound and outbound packets between two hosts. However, the key management protocol always negotiates bidirectional SAs. In practice, the tunneling and nesting will be mostly symmetric.

For the outbound traffic, the output of the SA lookup in the SADB is a pointer to the SA or SA bundle, provided the SAs are already established. The SA or SA bundle will be ordered to process the outbound packet as specified in the policy. If the SAs are not established, the key management protocol is invoked to establish the SAs. For the inbound traffic, the packet is first afforded security processing. The SPD is then indexed by the selector to validate the policy on the packet. We will discuss this in greater detail when we talk about IPSec in action.
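
The outbound half of this paragraph can be sketched as a lookup with a fallback. Everything here is hypothetical: the SADB is a plain dictionary keyed by a selector tuple, and `negotiate` is a stand-in callback for invoking the key management protocol.

```python
def sa_bundle_for(selector, sadb, negotiate):
    """Return the ordered SA bundle for an outbound packet, creating it if needed."""
    bundle = sadb.get(selector)
    if bundle is None:
        bundle = negotiate(selector)  # stand-in for invoking key management
        sadb[selector] = bundle
    return bundle


negotiations = []


def fake_negotiate(selector):
    """Pretend key management: record the request and return a two-SA bundle."""
    negotiations.append(selector)
    return ["AH transport SA", "ESP tunnel SA"]


outbound_sadb = {}
flow = ("10.1.2.3", "192.168.1.15", "tcp", 40000, 80)

first_bundle = sa_bundle_for(flow, outbound_sadb, fake_negotiate)
second_bundle = sa_bundle_for(flow, outbound_sadb, fake_negotiate)

print(first_bundle, len(negotiations))
```

Only the first packet of the flow triggers a negotiation. Later packets find the bundle already in the SADB.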

The security policy requires policy management to add, delete, and modify policy. The SPD is stored in the kernel and IPSec implementations should provide an interface to manipulate the SPD. This management of SPD is implementation specific and there is no standard defined. However, the management application should provide the ability to handle all the fields defined in the selectors that are discussed below.


This section defines the various selectors used to determine the security services afforded to a packet. The selectors are extracted from the network and transport layer headers.

Source Address

The source address can be a wild card, an address range, a network prefix, or a specific host. A wild card is particularly useful when the policy is the same for all the packets originating from a host. The network prefix and address range are used by security gateways that provide security to the hosts behind them and to build VPNs. A specific host is used either on a multihomed host or in the gateways when a host's security requirements are specific.

Destination Address

The destination address can be a wild card, an address range, a network prefix, or a specific host. The first three are used for hosts behind secure gateways. The destination address field used as a selector is different from the destination address used to look up SAs in the case of tunneled IP packets: the destination IP address of the outer header can differ from that of the inner header. However, the policy in the destination gateway is set based on the actual destination, and this address is used to index into the SPD.


Name

The name field is used to identify a policy tied to a valid user or system name. These include a DNS name, an X.500 Distinguished Name, or other name types defined in the IPSec DOI. The name field is used as a selector only during the IKE negotiation, not during packet processing, because there is at present no way to tie an IP address to a name.


Protocol

The protocol field specifies the transport protocol whenever the transport protocol is accessible. In many cases, when ESP is used, the transport protocol is not accessible. Under these circumstances, a wild card is used.

Upper Layer Ports

In cases where there is session-oriented keying, the upper layer ports represent the source and destination ports to which the policy is applicable. A wild card is used when the ports are inaccessible.
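To make the selector fields concrete, the following is a minimal sketch of how a policy entry's selectors might be matched against a packet. The names `Selector`, `AddressRange`, `address_matches`, and `ANY` are hypothetical and chosen only for illustration; the sketch assumes IPv4 addresses and omits the name selector, since it is used only during IKE negotiation.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Union

ANY = None  # wild card: matches every value (hypothetical convention)


@dataclass(frozen=True)
class AddressRange:
    """Inclusive address range, e.g. 10.0.0.5 through 10.0.0.20."""
    first: ipaddress.IPv4Address
    last: ipaddress.IPv4Address


AddressSpec = Union[None, ipaddress.IPv4Address, ipaddress.IPv4Network, AddressRange]


def address_matches(spec: AddressSpec, addr: ipaddress.IPv4Address) -> bool:
    """Match an address against a wild card, prefix, range, or specific host."""
    if spec is ANY:
        return True
    if isinstance(spec, ipaddress.IPv4Network):
        return addr in spec
    if isinstance(spec, AddressRange):
        return spec.first <= addr <= spec.last
    return spec == addr  # specific host


@dataclass(frozen=True)
class Selector:
    src: AddressSpec = ANY
    dst: AddressSpec = ANY
    protocol: Optional[int] = ANY   # IP protocol number, e.g. 6 for TCP
    src_port: Optional[int] = ANY
    dst_port: Optional[int] = ANY

    def matches(self, src, dst, protocol=None, src_port=None, dst_port=None) -> bool:
        """A packet field of None means the field is not accessible."""
        return (address_matches(self.src, src)
                and address_matches(self.dst, dst)
                and self.protocol in (ANY, protocol)
                and self.src_port in (ANY, src_port)
                and self.dst_port in (ANY, dst_port))
```

Note that a selector naming a specific protocol or port never matches a packet whose protocol or port is inaccessible, which is why the wild card is used in those cases.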

IPSec Processing

In this section, the processing of IPSec packets, both inbound and outbound, is discussed briefly. The interactions between the kernel and the key management layer, as well as the IPv4 and IPv6 header processing, are discussed in the IPSec implementation chapter.

The IPSec processing is classified into outbound processing and inbound processing.

Figure 4.13: Outbound IPSec processing


Outbound Processing

On outbound processing, the transport layer packets flow into the IP layer. The IP layer consults the SPD to determine the security services afforded to the packet. The inputs to the SPD are the selectors defined in the previous section. The output of the SPD is one of the following:

  • Drop the packet, in which case the packet is discarded without further processing.

  • Bypass security, in which case the IP layer adds the IP header to the payload and dispatches the IP packet.

  • Apply security, in which case, if the SAs are already established, the SPD returns a pointer to the SA or the SA bundle, depending on the policy. If the SAs are not established, IKE is invoked to establish them, and the packets are not transmitted until the SAs are established.
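The three outcomes above can be sketched as a small dispatch function. This is an illustration only: `PolicyAction`, `process_outbound`, `invoke_ike`, and the `pending` queue are hypothetical names, and a real implementation would build headers rather than return a description.

```python
from enum import Enum
from typing import Callable, List, Optional


class PolicyAction(Enum):
    DROP = "drop"
    BYPASS = "bypass"
    APPLY = "apply"


def process_outbound(packet: str,
                     action: PolicyAction,
                     sa_bundle: Optional[List[str]],
                     invoke_ike: Callable[[], None],
                     pending: List[str]) -> str:
    """Decide what happens to an outbound packet after the SPD lookup."""
    if action is PolicyAction.DROP:
        return "dropped"
    if action is PolicyAction.BYPASS:
        return "sent without IPSec"
    # Policy says apply. Without SAs the packet must not be transmitted,
    # so it is held and key management is asked to establish the SAs.
    if sa_bundle is None:
        pending.append(packet)
        invoke_ike()
        return "held until SAs are established"
    # The bundle is ordered; each SA is applied in turn.
    return "protected by " + ", ".join(sa_bundle)
```

Whether held packets are queued or simply discarded until the SAs exist is an implementation decision; the queue here is one possible choice.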

If the SAs for this packet are not already established, the IPSec implementation waits until they are. After the SAs are established, it processes the packet by adding the appropriate AH and ESP headers. The SAs have all the pertinent information and are ordered so that the IPSec headers are constructed appropriately. For example, let us consider the network shown in Figure 4.13.

In this case, the host is tunneling a packet to the gateway using ESP but is authenticating to the end host B. The correct header is shown in Figure 4.14.

Figure 4.14: Packet format

In this case, IKE establishes four SAs: two for sending and two for receiving. As we are discussing outbound processing, we will ignore the SAs established for processing inbound packets. The two outbound SAs are SA1 and SA2, where SA1 is the SA between host A and the gateway and SA2 is the SA between host A and the destination, host B. The ordering of IPSec processing is very important. If SA2 is applied after SA1, the packet is formed incorrectly. It is very important to maintain ordering in the SA bundle so that IPSec processing is applied in the correct order for outbound packets.
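The effect of ordering can be illustrated with the bracket notation used elsewhere in this chapter, such as IP[AH[ESP[TCP]]]. The sketch below assumes that SA2 is a transport mode AH SA to host B and that SA1 is a tunnel mode ESP SA to the gateway; the function names are hypothetical, and the strings only model header nesting, not real packet bytes.

```python
def apply_transport(packet: str, header: str) -> str:
    """Insert an IPSec header between the IP header and its payload."""
    if not (packet.startswith("IP[") and packet.endswith("]")):
        raise ValueError("expected a packet of the form IP[...]")
    payload = packet[3:-1]
    return f"IP[{header}[{payload}]]"


def apply_tunnel(packet: str, header: str) -> str:
    """Wrap the whole packet in an IPSec header and a new outer IP header."""
    return f"IP[{header}[{packet}]]"


original = "IP[TCP]"

# SA2 (AH to host B) first, then SA1 (ESP tunnel to the gateway).
correct = apply_tunnel(apply_transport(original, "AH"), "ESP")
# correct == "IP[ESP[IP[AH[TCP]]]]"

# SA1 first, then SA2: the AH header lands on the outer header.
wrong = apply_transport(apply_tunnel(original, "ESP"), "AH")
# wrong == "IP[AH[ESP[IP[TCP]]]]"
```

In the second case the AH header sits outside the tunnel, on the packet addressed to the gateway, so host B would never see the authentication header intended for it.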

This section gave a very brief overview of the processing of outbound packets. There are many other issues in constructing the header, such as the handling of sequence numbers and the insertion of headers, that are deferred to the implementation chapter.


Inbound Processing

Inbound processing differs from outbound processing. On receipt of an IP packet that does not contain any IPSec headers, the security layer checks the policy to determine how to process the packet. It indexes the SPD using the selector fields. The output of the policy will be one of three values: discard, bypass, or apply. If the output of the policy is discard, the packet is dropped. If the output of the policy is apply, but SAs are not established, the packet is dropped. Otherwise, the packet is passed up to the next layer for further processing.

If the IP packet contains IPSec headers, the packet is processed by the IPSec layer. The IPSec layer extracts the SPI, the source address, and the destination address from the IP datagram. It indexes into the SADB using the tuple <SPI, dst, protocol> (additionally the source address is used, depending on the implementation). The protocol value is either AH or ESP. Depending on the protocol value, the packet is handled by either the AH or the ESP layer. After the protocol payload is processed, the policy is consulted to validate the payload. The selectors are used to retrieve the policy. The validation process consists of checking that the SA was used appropriately, that is, that the source and destination in the SA correspond to what the policy says and that the SA is protecting the transport layer protocol it was supposed to protect. In the case of tunneled packets, the source and destination selector fields are those of the inner header and not the outer header. Indexing into the SPD based on the outer source and destination values yields invalid results because the entry is constructed for the true source and destination and not the tunnel end point.

Let us consider the example where the gateway is tunneling a packet from host B to host A. On host A, the policy says that packets arriving from B will carry tunneled ESP and that the tunnel source will be the secure gateway. Indexing into the SPD using the gateway as the source instead of host B is incorrect.
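The lookup and the validation step can be sketched as follows for the gateway example. Everything here is hypothetical and illustrative: the `SA` record, the `sadb` dictionary, the function names, and the host labels stand in for real data structures and addresses, and only the address check is modeled, not the transport protocol check.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SA:
    spi: int
    dst: str          # destination address used to look up the SA
    protocol: str     # "AH" or "ESP"
    tunnel: bool
    policy_src: str   # true source the policy entry was built for
    policy_dst: str   # true destination the policy entry was built for


sadb: Dict[Tuple[int, str, str], SA] = {}


def lookup_sa(spi: int, dst: str, protocol: str) -> Optional[SA]:
    """Index into the SADB with the tuple <SPI, dst, protocol>."""
    return sadb.get((spi, dst, protocol))


def validate(sa: SA, outer: Tuple[str, str],
             inner: Optional[Tuple[str, str]] = None) -> bool:
    """Check the SA against policy using the inner header for tunneled packets."""
    src, dst = inner if sa.tunnel and inner is not None else outer
    return (src, dst) == (sa.policy_src, sa.policy_dst)


# Host A expects tunneled ESP from host B, with the gateway as tunnel source.
sa = SA(spi=0x1001, dst="hostA", protocol="ESP", tunnel=True,
        policy_src="hostB", policy_dst="hostA")
sadb[(sa.spi, sa.dst, sa.protocol)] = sa

found = lookup_sa(0x1001, "hostA", "ESP")
ok = validate(found, outer=("gateway", "hostA"), inner=("hostB", "hostA"))
# ok is True; validating against the outer header alone would fail.
```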

Once the IPSec layer validates the policy, it strips off the IPSec header and passes the packet to the next layer. The next layer is either a transport layer or a network layer. For example, if the packet is IP[ESP[TCP]], then the next layer is a transport layer. If the packet is IP[AH[ESP[TCP]]], the next layer will be the IPSec layer, which belongs to the network layer.


Fragmentation

IPSec does not fragment or reassemble packets. On outbound processing, the transport payload is processed and then passed on to the IP layer for further processing. On inbound processing, the IPSec layer gets a reassembled packet from the IP layer.

However, because IPSec adds an IPSec header, it affects the PMTU. If IPSec does not participate in PMTU discovery, the IP layer ends up fragmenting packets, because the addition of the IPSec header increases the length of the IP datagram beyond the PMTU.

It is important for IPSec to participate in the PMTU discovery process. This is discussed in greater detail in the chapter on IPSec implementation.
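The arithmetic behind this is simple and can be sketched as below. The function names are hypothetical, and the overhead of 50 bytes is purely illustrative; the real figure depends on the protocol, the mode, and the transforms in use.

```python
def exceeds_pmtu(datagram_len: int, ipsec_overhead: int, pmtu: int) -> bool:
    """True if adding the IPSec headers pushes the datagram past the PMTU."""
    return datagram_len + ipsec_overhead > pmtu


def max_datagram_len(pmtu: int, ipsec_overhead: int) -> int:
    """Largest datagram that still fits after the IPSec headers are added."""
    return pmtu - ipsec_overhead


PMTU = 1500      # bytes, a common Ethernet value
OVERHEAD = 50    # bytes, illustrative only

# A sender that sizes datagrams to the raw PMTU will trigger fragmentation.
fragmented = exceeds_pmtu(1500, OVERHEAD, PMTU)   # True
# A sender told about the reduced value can avoid it.
safe_len = max_datagram_len(PMTU, OVERHEAD)       # 1450
```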


ICMP

ICMP processing is critical to the operation and debugging of a network. When IPSec is used end-to-end, it does not impact ICMP. However, when IPSec is used in tunnel mode, it impacts ICMP and the operation of the network, particularly when the tunnel header is added by an intermediate gateway. This is because ICMP error messages are required to carry only the IP header and the first 64 bits of the payload of the offending datagram. When the gateway adds the tunnel header and the IPSec header, the inner IP header, and hence the actual source, is not present in the ICMP error message. The gateway will not be able to forward the message appropriately.

In order to handle ICMP error messages correctly, IPSec needs to maintain some state and perform extra processing. This is discussed in greater detail in the implementation chapter.
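A short sketch shows what survives in such an ICMP error for a tunneled ESP packet. It assumes an IPv4 outer header without options and uses the fact that an ESP header begins with a 32-bit SPI followed by a 32-bit sequence number; the function names and the placeholder bytes are hypothetical.

```python
import struct

IPV4_HEADER_LEN = 20  # assumes no IP options


def icmp_quoted_bytes(datagram: bytes) -> bytes:
    """Portion of the offending datagram quoted in an ICMP error:
    the IP header plus the first 64 bits (8 bytes) that follow it."""
    return datagram[:IPV4_HEADER_LEN + 8]


def esp_fields_from_quote(quote: bytes):
    """Recover the ESP SPI and sequence number from the quoted bytes."""
    return struct.unpack("!II", quote[IPV4_HEADER_LEN:IPV4_HEADER_LEN + 8])


outer_header = bytes(IPV4_HEADER_LEN)            # placeholder outer IP header
esp_header = struct.pack("!II", 0x1234, 7)       # SPI, sequence number
inner = b"INNER-IP-HEADER-AND-PAYLOAD"           # stands in for protected data
datagram = outer_header + esp_header + inner

quote = icmp_quoted_bytes(datagram)
spi, seq = esp_fields_from_quote(quote)
# spi == 0x1234 and seq == 7; none of the inner packet is in the quote.
```

Since only the SPI and sequence number remain, state keyed by the SPI is one conceivable way for a gateway to find the original source; the actual mechanism is left to the implementation chapter.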

About the Authors:

Naganand Doraswamy is a senior principal engineer at Nortel Networks in Billerica, MA., and an active participant in the IETF and key industry panels on VPNs and IP security. He was a network security architect at Bay Networks (currently Nortel Networks) and is currently working on next-generation router architectures and protocols. He was the technical lead for IP Security at FTP Software.

Dan Harkins, formerly a senior software engineer in the Network Protocol Security Group at Cisco Systems, is currently a Senior Scientist at Network-Alchemy in Santa Cruz, CA, and is active in several IETF working groups. He wrote IPSec's standard Internet Key Exchange (IKE) key management protocol.

Copyright © Prentice Hall, PTR 1999. All rights reserved.

We at Microsoft Corporation hope that the information in this work is valuable to you. Your use of the information contained in this work, however, is at your sole risk. All information in this work is provided "as is", without any warranty, whether express or implied, of its accuracy, completeness, fitness for a particular purpose, title or non-infringement, and none of the third-party products or information mentioned in the work are authored, recommended, supported or guaranteed by Microsoft Corporation. Microsoft Corporation shall not be liable for any damages you may sustain by using this information, whether direct, indirect, special, incidental or consequential, even if it has been advised of the possibility of such damages.

© 2014 Microsoft. All rights reserved.