PIM-SM Multicast Routing Protocol

This paper is an introduction to the PIM-SM multicast routing protocol, concentrating on version 2. It is intended for IT managers who are already familiar with multicasting, and who want an overview before reading the PIM-SM RFC. PIM-SM was designed to operate efficiently across wide area networks, where groups are sparsely distributed. It uses the traditional IP multicast model of receiver-initiated membership, supports both shared and shortest-path trees, is not dependent on a specific unicast routing protocol, and uses soft-state mechanisms to adapt to changing network conditions.

On This Page

Introduction
The PIM-SM Protocol
Examining the PIM-SM Protocol
Unresolved Issues
PIM-SM Control Messages
Summary

Introduction

Protocol Independent Multicast-Sparse Mode (PIM-SM) routes multicast packets to multicast groups, and is designed to efficiently establish distribution trees across wide area networks (WANs). PIM-SM is called "protocol independent" because it can use the route information that any routing protocol enters into the multicast Routing Information Base (RIB), or, as it is known in Windows terminology, the multicast view. Examples of these routing protocols include unicast protocols such as the Routing Information Protocol (RIP) and Open Shortest Path First (OSPF), but multicast protocols that populate the routing tables—such as the Distance Vector Multicast Routing Protocol (DVMRP)—can also be used. Sparse mode means that the protocol is designed for situations where multicast groups are thinly populated across a large region. Sparse-mode protocols can operate in LAN environments, but they are most efficient over WANs.

A sparse group can be defined as "one in which a) the number of networks or domains with group members present is significantly smaller than the number of networks/domains in the Internet, b) group members span an area that is too large/wide to rely on a hop-count limit or some other form of limiting the scope of multicast packet propagation, and c) the internetwork is not sufficiently resource rich to ignore the overhead of current [dense mode] schemes."1

In contrast, dense-mode protocols such as DVMRP and Multicast OSPF (MOSPF) are designed for situations where multicast groups are widely represented and bandwidth is plentiful. With these schemes, data packets and/or membership report information may be sent out unnecessarily on interfaces that don't lead to multicast sources or interested receivers; additionally, routers store the associated state for these uninterested nodes, which is also unnecessary. This overhead is acceptable when most hosts are interested in the data and there is enough bandwidth to support the flow of control messages, but is otherwise inefficient. PIM-SM assumes that no host wants data unless it is explicitly requested. PIM does, however, have a dense-mode counterpart (PIM-DM) that is interoperable with sparse mode.

PIM-SM was designed to support the following goals:

  • Maintain the traditional IP multicast service model of receiver-initiated multicast group membership. In this model, sources simply put packets on the first-hop Ethernet, without any signaling. Receivers signal to routers in order to join the multicast group that will receive the data.

  • Leave the host model unchanged. PIM-SM is a router-to-router protocol, which means that the hosts don't have to be upgraded, but that PIM-SM-enabled routers must be deployed in the network.

  • Support both shared and source distribution trees. For shared trees, PIM-SM uses a central router, called the Rendezvous Point (RP), as the root of the shared tree. All source hosts send their multicast traffic to the RP, which in turn forwards the packets through a common tree to all the members of the group. Source trees directly connect sources to receivers. There is a separate tree for every source. Source trees are considered shortest-path trees from the perspective of the unicast routing tables. PIM-SM can use either type of tree or both simultaneously.

  • Maintain independence from any specific unicast routing protocol (see above).

  • Use soft-state mechanisms to adapt to changing network conditions and multicast group dynamics. Soft-state means that, unless it is refreshed, the router's state configuration is short-term and expires after a certain amount of time.

The remainder of this paper expands on these points and shows examples of how PIM-SM operates. It concentrates on PIM-SM version 2, specified in RFC 2362, June 1998 (which has an experimental status), and points out the places where version 2 differs from version 1.

The PIM-SM Protocol

The PIM-SM protocol can be broken down into the following parts:

  • Hello messaging

  • Forwarding multicast packets

  • Joining the shared tree

  • Registering with the RP

  • Shortest-path tree (SPT) switching

  • Pruning interfaces

  • Assert messaging

  • Determining the RP

We will discuss each part in turn, assuming a stable system, meaning one where the RP has already been elected. The "Determining the RP" section shows how that election takes place.

As a preliminary, we will first talk about:

  • Flooding and reverse path forwarding (RPF)

  • Shortest-path trees

  • Shared trees

Understanding these terms will make understanding the PIM-SM protocol easier.

Flooding and Reverse Path Forwarding

Flooding is a simple scheme for routing that doesn't depend on having any routing information. In flooding, a packet is transmitted on every interface except the one from which it was received. To limit the number of times a packet is replicated, a metric, such as a hop count, is used. When the metric reaches a given threshold, the packet is dropped.

The problem with flooding is that it creates an exponential number of copies of each packet. So, on the one hand, flooding guarantees that a copy of each packet will be delivered to each node, provided that packets aren't lost; on the other hand, flooding can generate so much congestion that packets very likely will be lost.

Reverse path forwarding (RPF) is a concept that was first introduced by Yogen Dalal.2 It is an optimized form of flooding, where the router accepts a packet from source S through interface I only if I is the interface the router would use in order to reach S. It determines whether the interface is correct by consulting its unicast routing tables. This technique dramatically decreases the overhead associated with standard flooding. Because a router accepts a packet from only one neighbor, it floods the packet only once, which means (assuming point-to-point links) each packet is transmitted over each link once in each direction. An example of RPF is shown in Figure 1 below.

Figure 1: Example of RPF

In this example, the router discards the packet that came from a source on network 172.16.0.0/20 through interface I0. This is because its routing table does not list this interface as the shortest path to network 172.16.0.0/20. If the router had a packet to forward to that network, it would use I1. The packet that arrives through interface I1 is forwarded because the routing table lists this interface as the shortest path to the network. Notice that the router's unicast routing table determines the shortest path for the multicast packets.
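
The RPF check in this example can be sketched in a few lines of Python. This is a minimal illustration, not router code: only the 172.16.0.0/20 route through I1 comes from the example above, while the default route through I0 and the function names are assumptions made for the sketch.

```python
import ipaddress

# Hypothetical unicast routing table: (prefix, interface used to reach it).
# The 172.16.0.0/20 entry mirrors the example; the default route is assumed.
ROUTES = [
    (ipaddress.ip_network("172.16.0.0/20"), "I1"),
    (ipaddress.ip_network("0.0.0.0/0"), "I0"),
]


def rpf_interface(source, routes=ROUTES):
    """Return the interface the router would use to reach `source`,
    using a longest-prefix match, or None if there is no route."""
    addr = ipaddress.ip_address(source)
    matches = [(net, ifname) for net, ifname in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rpf_check(source, arrival_interface, routes=ROUTES):
    """Accept a packet only if it arrived on the interface that the
    unicast routing table would use to reach the packet's source."""
    return rpf_interface(source, routes) == arrival_interface
```

With this table, a packet from 172.16.1.5 passes the check when it arrives on I1 and fails it when it arrives on I0.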

Shortest-Path Trees

Shortest-path trees (SPTs) are also called source-based trees, meaning that the forwarding paths are based on the shortest unicast path to the source. This is what we mean when we say that source trees are considered shortest-path trees from the perspective of the unicast routing tables. If the unicast routing metric is hop counts, then the branches of the multicast SPT are minimum hop. If the metric is delay, the branches are minimum delay.

For every multicast source, there is a corresponding multicast tree that directly connects the source to all receivers. Once the tree for a source and its associated group is constructed, all traffic to the members of the group passes along this tree. SPTs have an (S, G) entry with a list of outgoing interfaces, where S is the source address and G is the multicast group. Examples of other protocols that use SPTs are DVMRP and MOSPF, which are dense-mode protocols. Figure 2 below shows an example of an SPT.

Figure 2: Example of an SPT

In this example, the SPT for Source1 is through interface I0 on Router 1, even though there is an alternative path through the combination of Routers 1 and 3. The SPT for Source2 is through interface I3, even though, once again, there is an alternative but longer path. (In this example, the metric is hop count.)

Shared Trees

Shared trees are, for PIM-SM, called RP trees (RPT) since they rely on a central router called the rendezvous point (RP) that receives all the traffic from the sources and forwards that traffic to the receivers. Members send explicit joins to the central node, so there is no assumption that all hosts are receivers. The result is a single tree for each multicast group, no matter how many sources there are. The only routers that know about the group are the ones that are on the tree, and data is sent only to interested receivers. RPTs have a (*, G) entry (called a wildcard entry), where G is the multicast group. With an RP, receivers have a place to join to even if no sources exist yet.

The shared tree is unidirectional, which means data flows only from the RP to the receivers. For a host other than the RP to send on the tree, the data must first be tunneled to the RP before it can be multicast to the participants. This means that if a receiver is also a source, it can't use that tree to send packets to the RP. It can only use it to receive packets from the RP. (While this is true in general, there are unusual exceptions, such as when the source is located between the RP and the receivers, and is already on the tree. In this case, the data flows directly from the source to the receivers.)

RPTs have longer delays (packets must first be sent to the RP before they can be distributed), but have less router state to maintain. Some examples of applications where an RPT is appropriate include:

  • Networks with many low-rate data sources

  • Applications that can tolerate delay

  • Applications requiring consistent policy and access control across most participants in a group

  • Networks where most of the source trees overlap topologically with the shared tree

Conferencing applications can use both the SPTs and the RPT (recall that PIM-SM supports their simultaneous use). The RPT could be used for keep-alive packets, because these are sent at a low data rate. The sources could use the SPTs because they are sending out a great deal of data.

An example of a PIM-SM RPT is illustrated below in Figure 3.

Figure 3: Example of a shared tree

In this example, although the network configuration is the same as in the shortest-path example, the traffic flow is different. All multicast traffic from Source1 flows through interface I1 on Router 1 to Router 3, which is the RP; from Source2, it flows to the RP through interface I2 on Router 2. (The paths from the sources to this central point are the shortest-path routes.) The RP then distributes the data to receivers that have explicitly joined the multicast group, using a single tree to all receivers. Multiple RPs can exist in a network, but there should be only one RP for each multicast group.

Although each path from the RP to a receiver is the shortest path, the path that traffic takes from a source to the receivers by way of the RP is not necessarily the shortest path between them. A legitimate question is: Why not use the SPT rather than the RPT? For PIM-SM, there are two answers. First, PIM-SM has a method that allows the last-hop router (the one directly attached to the receivers) to leave the RPT and join the SPT if the volume of traffic warrants it. This is called an SPT switch, and is discussed later in the paper. Second, when the RPT is used, routers don't need to maintain as much state, which decreases the amount of memory required.

Examining the PIM-SM Protocol

We will now examine each of the parts of the PIM-SM protocol, beginning with neighbor discovery. As we stated earlier, we will assume that the RP has already been chosen; the "Determining the RP" section shows how this happens. Recall that, when discussing control messages such as Hello and Join/Prune, we are referring to PIM-SM version 2. Version 2 messages are encapsulated in IP packets with a protocol number set to 103. Version 1 messages were encapsulated in Internet Group Management Protocol (IGMP) packets. To see the format for IP encapsulation, or for any of the control messages, refer to the "PIM-SM Control Messages" section at the end of this paper.

Hello Messaging

PIM routers periodically send Hello messages to discover neighboring PIM routers. Hello messages are multicast using the address 224.0.0.13 (ALL-PIM-ROUTERS group). A Holdtime field specifies how long the information is valid. Routers don't send any acknowledgement that a Hello message was received. Also, unlike some other protocols, such as DVMRP, when a Hello message is received, its interface is not automatically added to the list of outgoing interfaces for forwarding multicast traffic. PIM-SM uses an explicit join model; a downstream receiver must join a group before traffic is forwarded on the interface.
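
The way a Holdtime turns Hello messages into soft state can be sketched as follows. The class and method names are hypothetical, and the sketch models only the neighbor list; it deliberately does not touch any oif list, because receiving a Hello does not cause traffic to be forwarded.

```python
import time

ALL_PIM_ROUTERS = "224.0.0.13"  # destination address of Hello messages


class NeighborTable:
    """Hypothetical soft-state table of PIM neighbors learned from Hellos."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expiry = {}  # (interface, neighbor address) -> expiry time

    def on_hello(self, interface, neighbor, holdtime):
        """Record or refresh a neighbor; the entry is valid for `holdtime` seconds."""
        self._expiry[(interface, neighbor)] = self._clock() + holdtime

    def live_neighbors(self, interface):
        """Return neighbors on `interface` whose Holdtime has not expired."""
        now = self._clock()
        # Entries that were not refreshed in time simply disappear.
        self._expiry = {k: t for k, t in self._expiry.items() if t > now}
        return sorted(n for (i, n) in self._expiry if i == interface)
```

A neighbor that stops sending Hello messages needs no explicit teardown: its entry expires once the Holdtime from its last Hello has elapsed.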

Forwarding Multicast Packets

PIM-SM routers forward multicast traffic onto all interfaces that lead to receivers that have explicitly joined a multicast group. (Receivers do this by sending an IGMP Host Membership Report for each group to which they belong. These messages are sent to the group address. They have an IP Time to Live (TTL) of 1 and are confined to the local subnetwork.) The router performs an RPF check before a packet is forwarded. The type of RPF check a router performs depends on whether the tree is an RPT or an SPT. If it is an RPT, the RPF check uses the IP address of the RP. If it is an SPT, the RPF check uses the address of the source. Both cases are illustrated below in Figure 4.

Figure 4: RPF checks on RPT and SPT

Multicast traffic from source 162.10.4.1 uses the RPT, meaning the source sends it to the RP rather than to the multicast group (the router would denote this by having a (*, G) entry rather than a (S, G) entry). Before sending this traffic, Router 1 checks its unicast routing table to see if packets from the RP are arriving on the correct interface. In this case they are, because they arrive on interface I1, and the packets are forwarded.

Multicast traffic from source 162.10.4.2 uses the SPT (the router would denote this by having an (S, G) entry for the source). In this case, the router uses the IP address of the source to perform the RPF check, looking at the unicast routing table to see if traffic from the source is arriving on the correct interface. In this case, it is, so the traffic is forwarded.

Traffic that arrives on the correct interface is sent onto all outgoing interfaces ("oif") that lead to downstream receivers if any one of the following conditions is true:

  • A downstream PIM-SM router has sent a join to this router.

  • There is a directly attached receiver that has explicitly joined the group by means of the IGMP protocol.

  • The interface has been manually configured to join the group.

There is an oif list for each (*, G) or (S, G) entry.
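
The forwarding decision described in this section can be sketched as a lookup followed by an RPF check. This is a simplified model under stated assumptions: the table layout, the group address 239.1.1.1, and every interface name other than I1 (where traffic from the RP arrives in the example) are hypothetical, and the RPF interface is assumed to have been computed already from the unicast routing table.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One multicast routing entry, either (*, G) or (S, G)."""
    rpf_interface: str                       # toward the RP for (*, G), toward S for (S, G)
    oifs: set = field(default_factory=set)   # the entry's outgoing interface list


def forward(mrt, source, group, arrival_interface):
    """Return the interfaces to copy the packet onto; an empty set means drop."""
    entry = mrt.get((source, group))         # prefer the (S, G) entry (SPT)
    if entry is None:
        entry = mrt.get(("*", group))        # fall back to the (*, G) entry (RPT)
    if entry is None or entry.rpf_interface != arrival_interface:
        return set()                         # no state, or the RPF check failed
    return entry.oifs - {arrival_interface}


# Hypothetical table for Router 1 in the example above.
mrt = {
    ("*", "239.1.1.1"): Entry("I1", {"I2", "I3"}),
    ("162.10.4.2", "239.1.1.1"): Entry("I0", {"I2"}),
}
```

Traffic from 162.10.4.1 has no (S, G) entry, so it is checked against the RP-facing interface of the (*, G) entry; traffic from 162.10.4.2 is checked against the interface that leads to the source.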

Joining the Shared Tree

As we said earlier, when a host wishes to join a multicast group, it sends an IGMP message to its upstream router. Before that router can begin to accept multicast traffic for the group, it must signal the RP that it wishes to join the RPT. It does this by sending a PIM (*, G) Join message to its upstream PIM neighbor, in the direction of the RP. Join messages are multicast hop-by-hop to address 224.0.0.13, which is the ALL-PIM-ROUTERS group. This means that, on a multi-access network, all PIM neighbors are aware of the join, but only the indicated upstream PIM neighbor performs the join. (The same message is used for both joins and prunes, which are discussed later in the paper.)

When a PIM router receives a (*, G) Join from a downstream router, it checks to see if (*, G) state exists for group G in its multicast routing table. If state already exists, then the Join message has reached the shared tree and the interface from which the message was received is entered in the oif list. If no state exists, a (*, G) entry is created, the interface is entered in the oif list, and the Join message is again sent towards the RP.

Once (*, G) state is created from the last-hop router to the RP, multicast traffic for G can reach the host that joined the group. This process is illustrated below in Figure 5.

Figure 5: Example of a PIM join

In this example, the receiver sends an IGMP Host Membership Report to Router 2, signaling its intention to join multicast group G. This is the first receiver attached to Router 2 that has joined the group, so the router has no (*, G) state. It creates an entry, adding I4 to the oif list, and then forwards the Join message to its upstream neighbor, Router 1. Router 1 also has no (*, G) state so it repeats the process, again forwarding the Join message toward the RP. This continues either until the RP receives the Join message, or until an upstream router that already has (*, G) state receives the message. The result in either case is that an RPT has been constructed that reaches all the way from the RP to the receiver. Notice that, until the receiver sends the IGMP message, nothing happens. The receiver initiates the join process.
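
The hop-by-hop behavior of the (*, G) Join can be sketched with a small model. The Router class is hypothetical, and the "to-<name>" interface labels on upstream routers are invented for the sketch; only I4 on Router 2 comes from the example.

```python
class Router:
    """Hypothetical router that keeps only (*, G) state."""

    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream   # next hop toward the RP; None for the RP itself
        self.star_g = {}           # group -> set of outgoing interfaces

    def receive_join(self, group, interface):
        """Handle a (*, G) Join (or a local IGMP report) seen on `interface`.

        Returns the names of the routers that had to create new state."""
        if group in self.star_g:
            # The Join has reached the shared tree: just add the interface.
            self.star_g[group].add(interface)
            return []
        # No state yet: create it, then send the Join on toward the RP.
        self.star_g[group] = {interface}
        created = [self.name]
        if self.upstream is not None:
            created += self.upstream.receive_join(group, f"to-{self.name}")
        return created


rp = Router("RP")
router1 = Router("Router1", upstream=rp)
router2 = Router("Router2", upstream=router1)
```

The first join through Router 2 creates state on every router up to the RP. A later join that reaches Router 1 from a different branch stops there, because Router 1 is already on the tree.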

The Designated Router

When multiple routers are connected to a multi-access network (for example, an Ethernet) one of them must be selected to act as the designated router (DR) for a given period of time. The DR is responsible for sending Join/Prune messages to the RP. To elect the DR, each PIM router on the network examines the received Hello messages and compares its IP address with those of its neighbors. The router with the highest address is the DR. Figure 6 below illustrates this process.

Figure 6: DR election

In this example, Router 2 becomes the DR because it has the higher IP address. If no Hello messages are received from the DR after some configurable period of time, another DR election is held, and the remaining router with the highest IP address becomes the DR.
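
The election rule amounts to taking the maximum over the router's own address and the addresses learned from Hello messages. In the sketch below the function name and the sample addresses are hypothetical; note that the addresses are compared numerically rather than as strings.

```python
import ipaddress


def elect_dr(own_address, neighbor_addresses):
    """Return the address of the DR: the highest IP address on the network.

    `neighbor_addresses` are the sources of the Hello messages received on
    the multi-access interface."""
    candidates = [own_address, *neighbor_addresses]
    return max(candidates, key=ipaddress.ip_address)
```

For example, elect_dr("10.0.0.9", ["10.0.0.10"]) selects 10.0.0.10, whereas a plain string comparison would wrongly pick 10.0.0.9.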

Registering with the RP

Sources of multicast traffic don't necessarily join the group to which they are sending data. A first-hop router (the DR) can begin receiving traffic from a source without having any (S, G) state for that source. This means there is no information on how to get multicast traffic to the RP through a tree. When the source's DR receives the initial multicast packet, it encapsulates it in a Register message and unicasts it to the RP for that group. The RP de-encapsulates each Register message and forwards the extracted data packet to downstream members on the RPT. (Optionally, the RP may also send an (S, G) Join back to the DR in order to build the SPT back to the source. Typically, this occurs when a data-rate threshold is reached.)

Once the path is established from the source to the RP, the DR begins sending traffic to the RP as standard IP multicast packets as well as encapsulated within Register messages. This means that the RP will temporarily receive some packets twice. When the RP detects the normal multicast packets, it sends a Register-Stop message to the DR, telling it to stop sending Register messages.

Figure 7 below shows how the register process works (assuming that the routers have no pre-existing state).

Figure 7: Example of register packets

This illustration shows a source that is beginning to transmit multicast data. The source's DR, Router 1, creates (S, G) state and unicasts the packet to the RP, encapsulated in a Register message. When the RP receives the packet, it extracts the multicast data from the Register message, and, if there are interested receivers, forwards it along the RPT.

Until the SPT is established, Router 1 will continue to unicast Register messages that contain the multicast data to the RP. Once the tree is built, Router 1 begins forwarding the same multicast traffic, sent as standard IP multicast packets, down the SPT. This means the RP temporarily receives the source's data by means of Register messages and through the SPT.

Register-Stop Messages

When the RP begins receiving traffic from the source both in Register messages and as unencapsulated IP packets, it sends a Register-Stop message to the DR. This notifies the DR that the traffic is now being received as standard IP multicast packets on the SPT. Once the DR receives this message, it stops encapsulating traffic in Register messages. Figure 8 below illustrates this process.

Figure 8: Example of a register-stop packet

The other case where the RP sends Register-Stop messages is when no receivers belong to the multicast group. A source can begin transmitting to a group that has no members. The RP discards these packets and sends a Register-Stop so that the DR will stop sending Register messages.
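
The register exchange can be summarized as a small state sketch. All names are hypothetical. Two simplifications are assumptions of the sketch rather than statements from the text: the RP drops the copy carried in a Register message once it is also receiving the traffic natively, and a Register-Stop is treated as permanent.

```python
class SourceDR:
    """Hypothetical model of the DR attached to a multicast source."""

    def __init__(self):
        self.spt_joined = set()   # (S, G) pairs for which an (S, G) Join arrived
        self.stopped = set()      # (S, G) pairs for which a Register-Stop arrived

    def on_sg_join(self, source, group):
        self.spt_joined.add((source, group))

    def on_register_stop(self, source, group):
        self.stopped.add((source, group))

    def on_source_packet(self, source, group):
        """Return the ways in which this data packet is sent toward the RP."""
        modes = []
        if (source, group) in self.spt_joined:
            modes.append("native")     # standard IP multicast down the SPT
        if (source, group) not in self.stopped:
            modes.append("register")   # unicast to the RP in a Register message
        return modes


def rp_on_register(has_receivers, receiving_natively):
    """Return (forward_inner_packet, send_register_stop) for one Register."""
    if not has_receivers:
        return (False, True)    # discard the data and stop the DR
    if receiving_natively:
        return (False, True)    # assumed: drop the duplicate and stop the DR
    return (True, False)        # de-encapsulate and forward along the RPT
```

During the overlap period on_source_packet returns both modes, which corresponds to the RP temporarily receiving each packet twice.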

Interface Pruning

When an RP receives a Prune message, it no longer forwards traffic from the source indicated in the Prune message. Prunes originate with a leaf router (the router directly attached to the receivers), which we will assume is the DR. If the last member of a multicast group sends the DR an IGMP version 2 Leave message (or, in IGMP version 1, simply times out), the DR's IGMP state is deleted and the interface is removed from both the (S, G) and (*, G) oif lists for group G. If every interface in the oif list of the (*, G) state is removed, meaning that the router has no receivers on any interface that are members of G, then a Prune message is sent upstream, through the shared tree, toward the RP.

If the upstream routers also have null oif lists, the message continues to be forwarded to the RP. If a router still has (*, G) state for receivers on another interface, it removes only the interface on which the Prune arrived (unless it receives an overriding Join message from a PIM neighbor on that interface) and does not forward the Prune any further. (The same process of pruning branches applies if the SPT is being used instead of the RPT.) Figure 9 below shows an example of the pruning process.

Figure 9: Example of the pruning process

In this example, Router 2, which is the leaf router, receives an IGMP Leave message from the receiver. It processes the message and, because there are no other group members, it removes I4 from (*, G) as well as from any (S, G) oif lists. The router's oif list is now null, so it sends a (*, G) Prune message up the RPT, through I3, toward the RP.

The Prune message is received by Router 1, which causes the I3 interface to be removed from its oif list of the (*, G) entry in the multicast routing table. Note that a period of time elapses before the prune takes effect. For multi-access networks, it is important to wait because an overriding Join message may arrive from a PIM neighbor. In this case, none was received, so the interface is pruned.

Because the (*, G) oif list is now null, a (*, G) Prune is forwarded up the RPT toward the RP. This process continues either until the RP receives the message, or until a router is reached whose (*, G) oif list doesn't go to null as a result of the prune.
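
The way a (*, G) Prune travels up the RPT can be sketched as a loop that stops at the first router whose oif list does not become null. The data layout and the interface names other than I3 and I4 are hypothetical, and the sketch omits the waiting period during which an overriding Join may arrive.

```python
def propagate_prune(topology, oifs, start, interface):
    """Remove `interface` from the (*, G) oif list of `start` and keep
    pruning upstream while each router's oif list becomes null.

    topology: router -> (upstream router, interface on that upstream router
              that leads back to this router); (None, None) for the RP.
    oifs:     router -> set of interfaces in its (*, G) oif list.
    Returns the routers whose oif list went to null, in order."""
    pruned = []
    router = start
    while router is not None:
        oifs[router].discard(interface)
        if oifs[router]:
            break                        # still has receivers: the prune stops here
        pruned.append(router)
        router, interface = topology[router]
    return pruned


topology = {
    "Router2": ("Router1", "I3"),
    "Router1": ("RP", "I1"),
    "RP": (None, None),
}
```

If Router 1 has a second interface in its oif list, only Router 2 is pruned; if I3 was its only outgoing interface, the prune continues to the RP.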

Assert Messaging

In multi-access networks, there may be parallel paths to either the source or the RP. This can lead to group members receiving duplicate packets from multiple routers. To avoid this problem, PIM-SM uses Assert messages to determine a designated forwarder. Figure 10 below illustrates such a situation.

Figure 10: Example of a network requiring an Assert

In this example, Router 1, which is the RP, forwards multicast traffic to its neighbors, Routers 2 and 3. These routers in turn forward the traffic onto the LAN. Assume Router 3 transmits first. Router 2 receives the multicast packet on an interface that has this group in the oif list. Router 2 then forwards the packet to Router 3, which means that Router 3 has also received data on an outgoing interface. Receiving an incoming packet on an outgoing interface alerts routers to the fact that other PIM-SM neighbors on the LAN are also forwarding traffic to the group. This means group members will receive duplicate data.

To avoid this situation, routers issue Assert messages to select a single router to forward traffic. Downstream routers listen to the Assert messages so that they know which one was elected and, therefore, where to send subsequent Join messages. In our example, Router 4 initially sends Join messages to Router 2 while Router 5 initially sends Join messages to Router 3. After the assert, all Join messages will go to either Router 2 or Router 3, depending on which becomes the designated forwarder.

Winning an Assert

If all the routers are running the same unicast protocol, the router with the best metric wins the Assert. For example, if all the routers are using RIP, the router with the smallest hop count is elected. If the metrics are equal, the router with the highest IP address is elected.

If the routers are running different unicast protocols, then the metrics can't be compared. For example, RIP uses a hop count as its metric while the OSPF metric is based on the speed of the interface. In this case, the metric preference value determines which router will forward traffic and which router will prune the interface. Metric preferences can be configured for each unicast protocol running in the network. When a router receives an Assert message for a group, the metric preference value in the packet is compared with its own. If they are equal, then the metrics can be compared to determine which router will forward traffic. If the metric preferences are different, then the one with the lowest metric preference is selected.
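
The comparison rules above reduce to an ordered comparison: lowest metric preference first, then lowest metric, then highest IP address. The sketch below uses hypothetical names, and the preference values in the usage note are illustrative only.

```python
import ipaddress
from typing import NamedTuple


class AssertInfo(NamedTuple):
    metric_preference: int   # configured per unicast protocol; lower is preferred
    metric: int              # unicast metric toward the source or RP; lower is better
    address: str             # IP address of the router sending the Assert


def assert_winner(a, b):
    """Return the AssertInfo of the router that keeps forwarding onto the LAN."""
    def rank(info):
        # min() picks the lowest preference, then the lowest metric; negating
        # the address makes the highest IP address win the final tie-break.
        return (
            info.metric_preference,
            info.metric,
            -int(ipaddress.ip_address(info.address)),
        )
    return min(a, b, key=rank)
```

For instance, a router announcing preference 110 beats one announcing 120 regardless of their metrics, because metrics from different unicast protocols are not comparable.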

SPT Switching

A traffic threshold (specified in kilobits per second) can be configured on a last-hop router such that, when the threshold for a group is exceeded, the router switches from the RPT to the SPT. When this happens, the DR sends an (S, G) Join toward the source of the packet. This builds an SPT from the source, S, to the router. Switching to the SPT means that the shortest path is used to deliver the multicast traffic. Depending on the location of the source in relation to the RP, this switch can substantially reduce network latency. The drawback is that an increased amount of state must be kept in the routers.

To determine whether the switch should occur, the total aggregate rate of group traffic flowing down the RPT is calculated at a given periodic interval. Typically, if this rate exceeds the threshold, the next packet received for that group causes the switch. (The actual details of what happens, and how often the aggregate rate is calculated, are implementation-dependent. They are not specified by the protocol.)
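
Because the measurement details are implementation-dependent, the following is only one possible sketch of the threshold logic. The class name, the use of kilobits per second, and the choice to reset the counter at every interval are all assumptions.

```python
class SptSwitchMonitor:
    """Hypothetical per-group rate monitor on a last-hop router."""

    def __init__(self, threshold_kbps, interval_s):
        self.threshold_kbps = threshold_kbps
        self.interval_s = interval_s
        self._bits = 0           # bits received on the RPT in the current interval
        self._exceeded = False   # result of the most recent rate calculation

    def on_interval(self):
        """Run once per interval: compute the aggregate rate and reset the counter."""
        rate_kbps = self._bits / self.interval_s / 1000
        self._exceeded = rate_kbps > self.threshold_kbps
        self._bits = 0
        return rate_kbps

    def on_packet(self, size_bytes):
        """Count a packet received on the RPT. Returns True if this packet
        should trigger an (S, G) Join toward its source."""
        trigger = self._exceeded
        self._bits += size_bytes * 8
        return trigger
```

The packet that pushes the rate over the threshold does not itself cause the switch; the first packet after the next rate calculation does.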

Determining the RP

PIM-SM version 1 had two possible methods for determining the RP. The first method was a static method. It required configuring each leaf router with the address of an RP for a group or set of groups. The second choice was dynamic and used a method called Auto-RP.

PIM-SM version 2 is different from version 1. It has a single method that uses a bootstrap router (BSR) to originate Bootstrap messages. These messages are used to elect a BSR, if necessary, and to disseminate RP information. The messages are multicast to the ALL-PIM-ROUTERS group on each link.

One or more routers are configured to be candidate BSRs. If it is not apparent which router should be the BSR, the candidates flood the domain with advertisements (using RPF to make the flood less costly). The router with the highest priority is elected. If all the priorities are equal, then the candidate with the highest IP address becomes the BSR. (A domain, in this context, is a contiguous set of routers that all implement PIM-SM version 2 and are configured to operate within a common boundary defined by PIM multicast border routers (PMBRs). In brief, PMBRs connect each PIM domain to the rest of the Internet. For more information, consult RFC 2362.)

Routers that are configured to be candidate RPs unicast this information to the BSR. (It is very common for routers that are configured to be candidate BSRs to also be configured to be RPs.) The candidate RP advertisement contains the address of the advertising router as well as the multicast groups it can service.

The BSR includes a set of these candidate RPs (the RP-Set), along with the corresponding group addresses, in Bootstrap messages it periodically originates. Bootstrap messages are distributed hop-by-hop throughout the domain.

Routers receive and store Bootstrap messages originated by the BSR. When a DR gets a membership indication from IGMP for (or a data packet from) a directly connected host, for a group for which it has no entry, the DR uses a hash function to map the group address to one of the candidate RPs that can service that group. The DR then sends a Join/Prune message towards (or unicasts Register messages to) that RP.
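
The group-to-RP mapping can be sketched as follows. The hash expression follows the form given in RFC 2362; the constants, the 30-bit hash mask, and the tie-break on the highest address should be checked against the RFC before being relied on. The sketch also ignores candidate RP priorities and the group ranges that each candidate advertises, and the function names are hypothetical.

```python
import ipaddress


def rp_hash(group, mask, candidate):
    """Hash value for one candidate RP, in the form given in RFC 2362."""
    g = int(ipaddress.IPv4Address(group))
    m = int(ipaddress.IPv4Address(mask))
    c = int(ipaddress.IPv4Address(candidate))
    return (1103515245 * ((1103515245 * (g & m) + 12345) ^ c) + 12345) % 2**31


def select_rp(group, candidates, hash_mask="255.255.255.252"):
    """Pick the candidate with the highest hash value; ties go to the
    highest IP address. `hash_mask` here is an assumed 30-bit mask."""
    return max(
        candidates,
        key=lambda c: (rp_hash(group, hash_mask, c), int(ipaddress.IPv4Address(c))),
    )
```

Because every router holds the same RP-Set and evaluates the same function, all routers in the domain independently arrive at the same RP for a given group, and groups that fall under the same hash mask map to the same RP.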

Unresolved Issues

Although its official status is "experimental," PIM-SM version 2 is currently considered the de facto standard multicast routing protocol. However, some issues are still not resolved. As we said at the outset, because it is a router-to-router protocol, all routers in the network must be upgraded to support it. A second issue is that, because the number of candidate RPs scales linearly with the size of the domain, the protocol cannot scale globally. Mapping G to RP may also present a scaling problem because it involves flooding the BSR announcements. Another problem is the location of the RP. There are many ISPs in the world, and none of them want to depend on an RP in the domain of another ISP for multicast service between its own customers.

PIM-SM Control Messages

This section describes the formats for PIM-SM encoded address fields and control messages. It also shows how the control messages are encapsulated within IP packets, as well as the standard header for all PIM messages. A more complete description of these messages can be found in RFC 2362.

The following formats are treated here:

  • PIM-SM control-message encapsulation

  • PIM-SM packet header

  • Encoded Unicast Address field

  • Encoded Group Address field

  • Encoded Source Address field

  • Assert message

  • Bootstrap message

  • Candidate RP advertisement

  • Hello message

  • Join/Prune message

  • Register message

  • Register-Stop message

PIM Control-Message Encapsulation

Figure 11 below shows how PIM-SM control messages are contained within an IP packet.

Figure 11: Encapsulated PIM control message

PIM-SM version 2 messages are encapsulated in IP packets with a protocol number set to 103.

PIM-SM Packet Header

The header for a PIM-SM version 2 packet is shown in Figure 12 below.

Figure 12: PIM-SM version 2 packet header

The fields in the header have the following values:

  • Ver is the PIM version number. For version 2, the value is 2.

  • Type is the value associated with the particular control message (see Table 1 below).

  • Reserved is transmitted as 0. It is ignored upon receipt.

  • Checksum is the 16-bit one's complement of the one's complement sum of the entire PIM message (excluding the data portion in the Register message).

Each kind of control message has a different Type value, which is listed in Table 1 below.

Table 1 PIM-SM version 2 message types

Type    Description
0       Hello
1       Register
2       Register-Stop
3       Join/Prune
4       Bootstrap
5       Assert
8       Candidate RP advertisement
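The header fields and the Type values in Table 1 can be illustrated with a short Python sketch. The field widths used here (4-bit Ver, 4-bit Type, 8-bit Reserved, 16-bit Checksum) are taken from RFC 2362 rather than from the text above, and the function names are hypothetical. The sketch computes the checksum over the whole message, so it does not cover the Register message, whose data portion is excluded.

```python
import struct

PIM_VERSION = 2
MESSAGE_TYPES = {
    0: "Hello", 1: "Register", 2: "Register-Stop", 3: "Join/Prune",
    4: "Bootstrap", 5: "Assert", 8: "Candidate RP advertisement",
}


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def build_pim_message(msg_type: int, body: bytes = b"") -> bytes:
    """Prefix body with a PIM header whose checksum covers the whole message."""
    first_byte = (PIM_VERSION << 4) | (msg_type & 0x0F)
    unchecked = struct.pack("!BBH", first_byte, 0, 0) + body
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBH", first_byte, 0, checksum) + body


def parse_pim_header(message: bytes):
    """Return (version, type, checksum_ok) for a non-Register message."""
    first_byte, _reserved, _checksum = struct.unpack("!BBH", message[:4])
    # Summing a message that includes a correct checksum yields zero.
    return first_byte >> 4, first_byte & 0x0F, internet_checksum(message) == 0
```

A receiver would typically discard any message for which the third element of the returned tuple is False.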

Encoded Unicast Address

The Encoded Unicast Address field has the format shown in Figure 13 below.

Figure 13: Format of the Encoded Unicast Address field

The subfields have the following values:

  • Address Family is the family to which the unicast address belongs. The Address Family numbers and their associated values are listed in Table 2 below.

  • Encoding Type is the type of encoding used within a specific address family. The value 0 is reserved for this field and represents the native encoding of the address family.

  • Unicast Address is the unicast address as specified by the Address Family and Encoding Type fields.

Table 2 below shows the address family numbers for IP versions 4 and 6. Although other numbers are assigned, they are rarely used. You can find a complete list, as assigned by the Internet Corporation for Assigned Names and Numbers (ICANN), at https://www.icann.org/.

Table 2 Address family numbers for IP versions 4 and 6

Number    Description
0         Reserved
1         IP version 4
2         IP version 6
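A minimal Python sketch of the Encoded Unicast Address field follows. It assumes that Address Family and Encoding Type occupy one byte each, as in RFC 2362, and it handles only the native encoding (Encoding Type 0). The function names are hypothetical.

```python
import ipaddress
import struct

ADDRESS_FAMILY_IPV4 = 1
ADDRESS_FAMILY_IPV6 = 2
NATIVE_ENCODING = 0


def encode_unicast(address: str) -> bytes:
    """Encode an IPv4 or IPv6 address as an Encoded Unicast Address."""
    ip = ipaddress.ip_address(address)
    family = ADDRESS_FAMILY_IPV4 if ip.version == 4 else ADDRESS_FAMILY_IPV6
    return struct.pack("!BB", family, NATIVE_ENCODING) + ip.packed


def decode_unicast(data: bytes):
    """Return (address_string, bytes_consumed)."""
    family, encoding = struct.unpack("!BB", data[:2])
    if encoding != NATIVE_ENCODING:
        raise ValueError("only native encoding is handled in this sketch")
    length = {ADDRESS_FAMILY_IPV4: 4, ADDRESS_FAMILY_IPV6: 16}.get(family)
    if length is None:
        raise ValueError(f"unsupported address family {family}")
    if len(data) < 2 + length:
        raise ValueError("truncated address")
    return str(ipaddress.ip_address(data[2:2 + length])), 2 + length
```

Returning the number of bytes consumed lets a caller step through a message that contains several encoded addresses of different families.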

Encoded Group Address

The Encoded Group Address field has the format shown in Figure 14 below.

Figure 14: Format of the Encoded Group Address field

The subfields have the following values:

  • Address Family and Encoding Type: See definitions provided in the "Encoded Unicast Address" section above.

  • Reserved is transmitted as 0. It is ignored upon receipt.

  • Mask Length is the number of left-justified, contiguous bits used as a mask to describe the address. The mask length must be less than or equal to the address length in bits for the given address family and encoding type. In PIM-SM version 2, it is recommended that this field be set to 32 for IP version 4 native encoding.

  • Group Multicast Address is the address of the multicast group.
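The subfields above can be packed as in the following Python sketch, which covers IP version 4 native encoding only. The one-byte width of each of the first four subfields is taken from RFC 2362, and the function names are hypothetical.

```python
import ipaddress
import struct


def encode_group(group: str, mask_length: int = 32) -> bytes:
    """Encode an IPv4 multicast group (or group prefix)."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    if not 0 <= mask_length <= 32:
        raise ValueError("mask length must not exceed the address length")
    # Address Family 1 (IPv4), Encoding Type 0 (native), Reserved 0.
    return struct.pack("!BBBB", 1, 0, 0, mask_length) + ip.packed


def decode_group(data: bytes):
    """Return (group_string, mask_length) from an encoded IPv4 group."""
    family, encoding, _reserved, mask_length = struct.unpack("!BBBB", data[:4])
    if (family, encoding) != (1, 0):
        raise ValueError("only IPv4 native encoding is handled in this sketch")
    return str(ipaddress.IPv4Address(data[4:8])), mask_length
```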

Encoded Source Address

The Encoded Source Address field has the format shown in Figure 15 below.

Figure 15: Format of Encoded Source Address field

The subfields have the following values:

  • Address Family, Encoding Type, and Reserved: See definitions provided in the "Encoded Unicast Address" and "Encoded Group Address" sections above.

  • The S field is the Sparse bit. It is set to 1 for PIM-SM. It is used for compatibility with PIM version 1.

  • The WC bit is the wildcard bit. If set to 1, the join or prune applies to the (*, G) or (*, *, RP) entry. If the value is 0, the join or prune applies to the (S, G) entry, where S is the source address. Joins or prunes sent to the RP must set this bit to 1. (A data packet will match on a (*, *, RP) entry if there is no more specific entry [such as (S, G) or (*, G)] and the destination group address in the packet maps to the RP listed in the (*, *, RP) entry. For more information about this special entry, and how it relates to interoperability between PIM-SM and dense-mode protocols, consult RFC 2362.)

  • The R bit is the RPT bit. If set to 1, the information about (S, G) is sent toward the RP. If 0, the information is sent toward S, where S is the source address.

  • Mask Length: See definition provided in the "Encoded Group Address" section above.
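The following Python sketch shows one way to set and read the S, WC, and R bits. The bit positions (S, WC, and R as the three low-order bits of the third byte, after five reserved bits) are assumed from RFC 2362, because the figure is not reproduced in the text. The function names are hypothetical, and only IP version 4 is handled.

```python
import ipaddress
import struct

S_BIT = 0x04    # Sparse bit, set to 1 for PIM-SM
WC_BIT = 0x02   # wildcard bit
RPT_BIT = 0x01  # RPT bit


def encode_source(source: str, wildcard: bool = False, rpt: bool = False,
                  mask_length: int = 32) -> bytes:
    """Encode an IPv4 source (or RP) address with its flag bits."""
    flags = S_BIT | (WC_BIT if wildcard else 0) | (RPT_BIT if rpt else 0)
    packed = ipaddress.IPv4Address(source).packed
    return struct.pack("!BBBB", 1, 0, flags, mask_length) + packed


def entry_kind(encoded: bytes) -> str:
    """Say which kind of entry a join or prune for this address applies to."""
    return "(*, G) or (*, *, RP)" if encoded[2] & WC_BIT else "(S, G)"
```

For example, `entry_kind(encode_source("192.0.2.1", wildcard=True, rpt=True))` reports a wildcard entry, while the default call reports an (S, G) entry.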

Assert Message

The Assert message has the format shown in Figure 16 below.

Figure 16: Format of the Assert message

The fields have the following values:

  • Version, Type, Reserved, and Checksum: See definitions provided in the "PIM-SM Packet Header" section above.

  • Encoded Group Address is the group address to which the packet was addressed and that triggered the Assert. The format is defined above in the "Encoded Group Address" section.

  • Encoded Unicast Source Address is the source address contained in the multicast packet that triggered the Assert. The format is defined above in the "Encoded Unicast Address" section.

  • R is the RPT-bit. It is a 1-bit value. If the multicast packet that triggered the Assert is routed down the RP tree (RPT), then the RPT-bit equals 1. If the packet is routed down the shortest-path tree (SPT), then the RPT-bit equals 0.

  • Metric Preference is the preference value associated with the unicast routing protocol that provided the route to the host address.

  • Metric is the unicast routing table metric. It is in units appropriate for the unicast routing protocol.
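The last three fields of the Assert message can be packed as in the sketch below. It assumes, following RFC 2362, that the R bit is the high-order bit of a 32-bit word whose remaining 31 bits hold Metric Preference, and that Metric is a separate 32-bit word. The function names and the sample values are hypothetical.

```python
import struct


def pack_assert_metrics(rpt_bit: bool, metric_preference: int, metric: int) -> bytes:
    """Pack R, Metric Preference, and Metric into 8 bytes."""
    if not 0 <= metric_preference < 2 ** 31:
        raise ValueError("Metric Preference must fit in 31 bits")
    if not 0 <= metric < 2 ** 32:
        raise ValueError("Metric must fit in 32 bits")
    word = (0x80000000 if rpt_bit else 0) | metric_preference
    return struct.pack("!II", word, metric)


def unpack_assert_metrics(data: bytes):
    """Return (rpt_bit, metric_preference, metric)."""
    word, metric = struct.unpack("!II", data[:8])
    return bool(word >> 31), word & 0x7FFFFFFF, metric
```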

Bootstrap Message

Bootstrap messages are divided up into semantic fragments if the original message exceeds the maximum packet size. The format of a single fragment is given below in Figure 17.

Figure 17: Bootstrap message format

The fields have the following values:

  • Version, Type, Reserved, and Checksum: See definitions provided above in the "PIM-SM Packet Header" section.

  • Fragment Tag is a randomly generated number that is used to distinguish between the fragments belonging to different Bootstrap messages. Fragments belonging to the same Bootstrap message carry the same fragment tag.

  • Hash Mask Length is the length (in bits) of the mask to use in the hash function. A value of 30 is recommended for IP version 4. A value of 126 is recommended for IP version 6.

  • BSR Priority is the BSR priority value of the included BSR. This field is considered a high order byte when comparing BSR addresses.

  • Encoded Unicast BSR Address is the address of the bootstrap router for the domain. It is the same format as the Encoded Unicast Address field format.

  • Encoded Group Address is the group prefix (address and mask) with which the candidate RPs are associated. The format is discussed above in the "Encoded Group Address" section.

  • RP Count 1…n is the number of candidate RP addresses included in the whole Bootstrap message for the corresponding group prefix.

  • Fragment RP Count 1…m is the number of candidate RP addresses included in this fragment of the Bootstrap message, for the corresponding group prefix.

  • Encoded Unicast RP Address 1…m is the address of the candidate RPs, for the corresponding group prefix. The format is defined above in the "Encoded Unicast Address" section.

  • RP1…m Holdtime is the hold time for the corresponding RP. This field is copied from the Holdtime field of the associated RP stored at the BSR.

  • RP1…m Priority is the priority of the corresponding RP and Encoded Group Address. This field is copied from the Priority field stored at the BSR when receiving a candidate RP advertisement. The highest priority is 0 (the lower the value of the Priority field, the higher the priority). Note that the priority is per RP per encoded group address.
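Two of the comparisons described above can be sketched in Python. The first treats BSR Priority as the high-order byte in front of the BSR address; preferring the larger combined value is an assumption based on the election rule in RFC 2362. The second picks the candidate RPs with the lowest Priority value. The function names are hypothetical, only IP version 4 is handled, and any further selection among equally ranked RPs (for example, by the hash function) is not shown.

```python
import ipaddress


def bsr_rank(priority: int, address: str) -> int:
    """Combine BSR Priority (high-order byte) with the IPv4 BSR address."""
    return (priority << 32) | int(ipaddress.IPv4Address(address))


def preferred_bsr(candidates):
    """candidates: iterable of (priority, address). Return the preferred one."""
    return max(candidates, key=lambda c: bsr_rank(c[0], c[1]))


def highest_priority_rps(rps):
    """rps: list of (address, priority). A lower value means higher priority."""
    lowest = min(priority for _, priority in rps)
    return [address for address, priority in rps if priority == lowest]
```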

Candidate RP Message

The Candidate RP message has the format shown in Figure 18 below.

Figure 18: Candidate RP message format

The fields have the following values:

  • Version, Type, Reserved, and Checksum: See definitions provided above in the "PIM-SM Packet Header" section.

  • Prefix Count is the number of encoded group addresses included in the message. A value of 0 means all multicast groups are included.

  • Priority is the priority of the candidate RP for the corresponding group address. The highest priority is 0. In other words, a lower value means a higher priority. The BSR stores this value, along with the RP address and corresponding encoded group address.

  • Holdtime is the amount of time for which the advertisement is valid.

  • Encoded Unicast RP Address is the address of the interface to advertise as the candidate RP.

  • Encoded Group Address 1…n are the group addresses (prefixes) for which the candidate RP is advertising.
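The body of a Candidate RP message (everything after the 4-byte PIM header) might be assembled as in the sketch below. The field widths (8-bit Prefix Count, 8-bit Priority, 16-bit Holdtime) are taken from RFC 2362; the function name and the sample values are hypothetical, and only IP version 4 is handled.

```python
import ipaddress
import struct


def build_candidate_rp_body(rp_address: str, group_prefixes, priority: int,
                            holdtime: int) -> bytes:
    """group_prefixes: list of (group, mask_length); empty means all groups."""
    body = struct.pack("!BBH", len(group_prefixes), priority, holdtime)
    # Encoded Unicast RP Address: family 1 (IPv4), native encoding.
    body += b"\x01\x00" + ipaddress.IPv4Address(rp_address).packed
    for group, mask_length in group_prefixes:
        # Encoded Group Address: family, encoding, reserved, mask length.
        body += struct.pack("!BBBB", 1, 0, 0, mask_length)
        body += ipaddress.IPv4Address(group).packed
    return body
```

Passing an empty list produces a Prefix Count of 0, which the text above defines as covering all multicast groups.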

Hello Message

The Hello message has the format shown below in Figure 19.

Figure 19: Hello message format

The fields have the following values:

  • Ver, Type, Reserved, and Checksum: See definitions provided above in the "PIM-SM Packet Header" section.

  • Option Type is the type of the option given in the Option Value field.

  • Option Length is the length of the Option Value field in bytes.

  • Option Value is a variable-length field, carrying the value of the option.

Table 3 below shows the values of the Option fields.

Table 3 Values of the Option fields

Option Type    Option Length    Option Value
1              2                Holdtime
2-16           Reserved         Reserved

The values for the hold-time parameter are shown in Table 4 below.

Table 4 Holdtime parameter values

Value              Description
0xFFFF             No time out
0                  Immediate time out
Any other value    Neighbor time out value

A Holdtime value of 0xFFFF means that the neighbor never expires, so periodic Hello messages do not need to be sent. This is useful for usage-billed (tariffed) connections, such as those provided by ISDN, where periodic Hello messages would otherwise keep the link active even when there is no user data traffic, and thus result in the continued billing of the customer.
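The Hello options and the Holdtime values described above can be sketched in Python as follows. The sketch assumes that Option Type and Option Length are 16-bit fields and that Holdtime is expressed in seconds, as in RFC 2362. The function names are hypothetical.

```python
import struct

HOLDTIME_OPTION = 1
HOLDTIME_NEVER = 0xFFFF


def build_hello_options(holdtime: int) -> bytes:
    """Build the option list of a Hello carrying only the Holdtime option."""
    return struct.pack("!HHH", HOLDTIME_OPTION, 2, holdtime)


def parse_hello_options(body: bytes) -> dict:
    """Return {option_type: option_value_bytes} for each option in body."""
    options = {}
    offset = 0
    while offset + 4 <= len(body):
        option_type, option_length = struct.unpack_from("!HH", body, offset)
        offset += 4
        options[option_type] = body[offset:offset + option_length]
        offset += option_length
    return options


def neighbor_timeout(options: dict):
    """Return None if the neighbor never expires, else the timeout value."""
    value = options.get(HOLDTIME_OPTION)
    if value is None or len(value) != 2:
        raise ValueError("Hello carries no valid Holdtime option")
    (holdtime,) = struct.unpack("!H", value)
    return None if holdtime == HOLDTIME_NEVER else holdtime
```

A returned value of 0 corresponds to the immediate time out in Table 4.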

Join/Prune Message

Join/Prune messages have the format shown in Figure 20 below.

Figure 20: Format of Join/Prune messages

The fields have the following values:

  • Version, Type, Reserved, and Checksum: See definitions provided above in the "PIM-SM Packet Header" section.

  • Encoded Unicast Upstream Neighbor Address is the address of the upstream neighbor (through the RPF interface). The format is defined above in the "Encoded Unicast Address" section.

  • Holdtime is the amount of time, in seconds, for which the receiver must keep the Join/Prune state alive. If the Holdtime is set to 0xFFFF, the receiver of this message never times out the outgoing interface (oif). This value can be used with ISDN lines, to avoid keeping the link up with periodic Join/Prune messages. If the Holdtime is set to 0, the information is timed out immediately.

  • Number of Groups is the number of multicast group sets contained in the message.

  • Encoded Multicast Group Address is the multicast group address. The format is defined above in the "Encoded Group Address" section.

  • Number of Joined Sources is the number of join-source addresses listed for a given group.

  • Encoded Join Source Address 1...n is the list of sources for which the sending router will forward multicast packets if they are received on the correct interface. The format is described above in the "Encoded Source Address" section.

  • Number of Pruned Sources is the number of prune-source addresses listed for a group.

  • Encoded Prune Source Address 1...n is the list containing the sources to be pruned. The format is described above in the "Encoded Source Address" section.
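The nesting of groups, joined sources, and pruned sources can be seen in the following Python sketch of the Join/Prune message body. The field widths (an 8-bit reserved field, 8-bit Number of Groups, 16-bit Holdtime, and 16-bit source counts) are taken from RFC 2362. The helper names and the addresses are hypothetical, and only IP version 4 is handled.

```python
import ipaddress
import struct


def _v4(address: str) -> bytes:
    return ipaddress.IPv4Address(address).packed


def encoded_unicast(address: str) -> bytes:
    return b"\x01\x00" + _v4(address)


def encoded_group(address: str, mask_length: int = 32) -> bytes:
    return struct.pack("!BBBB", 1, 0, 0, mask_length) + _v4(address)


def encoded_source(address: str, wildcard: bool = False, rpt: bool = False) -> bytes:
    flags = 0x04 | (0x02 if wildcard else 0) | (0x01 if rpt else 0)
    return struct.pack("!BBBB", 1, 0, flags, 32) + _v4(address)


def build_join_prune_body(upstream: str, holdtime: int, groups) -> bytes:
    """groups: list of (group, joined_sources, pruned_sources), where the
    source lists hold already-encoded source addresses."""
    body = encoded_unicast(upstream)
    body += struct.pack("!BBH", 0, len(groups), holdtime)
    for group, joins, prunes in groups:
        body += encoded_group(group)
        body += struct.pack("!HH", len(joins), len(prunes))
        body += b"".join(joins) + b"".join(prunes)
    return body
```

For example, a (*, G) join toward a hypothetical RP at 192.0.2.1 would pass `[("239.1.1.1", [encoded_source("192.0.2.1", wildcard=True, rpt=True)], [])]` as `groups`.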

Register Message

Register messages have the format shown in Figure 21 below.

Figure 21: Register message format

The fields have the following values:

  • Ver, Type, Reserved, and Checksum: See definitions provided above in the "PIM-SM Packet Header" section.

  • B is the Border bit. If the router is a DR for a source that it is directly connected to, it sets the B bit to 0. If the router is a PMBR for a source in a directly connected cloud, it sets the B bit to 1.

  • N is the Null-Register bit. A DR that is probing the RP before expiring its local Register-Suppression timer sets it to 1. Otherwise, it is set to 0.

  • Multicast Data Packet is the original packet sent by the source.
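A Register message could be assembled as in the sketch below. It assumes, following RFC 2362, that B and N are the two high-order bits of a 32-bit word that follows the PIM header, with the remaining 30 bits reserved. Computing the checksum over only the first 8 bytes (the header plus that word) is our reading of "excluding the data portion" and should be treated as an assumption. The function names are hypothetical.

```python
import struct

REGISTER_TYPE = 1
BORDER_BIT = 0x80000000
NULL_REGISTER_BIT = 0x40000000


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_register(data_packet: bytes, border: bool = False,
                   null_register: bool = False) -> bytes:
    """Wrap the source's original packet in a Register message."""
    flags = (BORDER_BIT if border else 0) | (NULL_REGISTER_BIT if null_register else 0)
    first_byte = (2 << 4) | REGISTER_TYPE
    unchecked = struct.pack("!BBHI", first_byte, 0, 0, flags)
    checksum = internet_checksum(unchecked)  # data portion is not included
    return struct.pack("!BBHI", first_byte, 0, checksum, flags) + data_packet
```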

Register-Stop Message

Register-Stop messages have the format shown in Figure 22 below.

Figure 22: Register-Stop message format

The fields have the following values:

  • Ver, Type, Reserved, and Checksum: See definitions provided above in the "PIM-SM Packet Header" section.

  • Encoded Group Address: See the "Encoded Group Address" section above for discussion.

  • Encoded Unicast Source Address is the source address included in the multicast packet that was encapsulated in the Register message. The format is defined above in the "Encoded Unicast Address" section.
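A receiver of a Register-Stop message might read its two address fields as in the following sketch. Following RFC 2362, the sketch reads the source in the Encoded Unicast Address layout (Address Family, Encoding Type, address). The function name is hypothetical, and only IP version 4 native encoding is handled.

```python
import ipaddress
import struct


def parse_register_stop_body(body: bytes):
    """Return (group, mask_length, source) from the bytes after the PIM header."""
    family, encoding, _reserved, mask_length = struct.unpack("!BBBB", body[:4])
    if (family, encoding) != (1, 0):
        raise ValueError("only IPv4 native encoding is handled in this sketch")
    group = ipaddress.IPv4Address(body[4:8])
    family, encoding = struct.unpack("!BB", body[8:10])
    if (family, encoding) != (1, 0):
        raise ValueError("only IPv4 native encoding is handled in this sketch")
    source = ipaddress.IPv4Address(body[10:14])
    return str(group), mask_length, str(source)
```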

Summary

Currently, PIM-SM is the de facto standard multicast routing protocol. It is designed to perform efficiently in WANs, where multicast groups are sparsely distributed. It maintains the traditional IP multicast service model of receiver-initiated membership and supports both shared and shortest-path trees. PIM-SM is not dependent on a specific unicast routing protocol. Because it is a router-to-router protocol, all routers in the network must be upgraded to support PIM-SM version 2.

For More Information

For the latest information on Windows® 2000 Server, please consult our Web site at https://www.microsoft.com/windows2000.

12/99

1 S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and L. Wei, "The PIM Architecture for Wide-Area Multicast Routing," IEEE/ACM Transactions on Networking 4.2 (April 1996): 153.
2 Yogen Dalal, "Broadcast Protocols in Packet Switched Computer Networks" (Digital Systems Laboratory, Dept. of Electrical Engineering, Stanford University, 1977).