INTERNET DRAFT Venu Hemige Alcatel-Lucent Internet Engineering Task Force Yetik Serbest Document: AT&T draft-ietf-l2vpn-vpls-pim-snooping-01.txt Ray Qiu Suresh Boddapati Alcatel-Lucent March 2007 Category: Informational Expires: September 2007 PIM Snooping over VPLS Status of this memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. This document is an Internet-Draft and is in full conformance with Sections 5 and 6 of RFC 3667 and Section 5 of RFC 3668. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Abstract In Virtual Private LAN Service (VPLS), as also in IEEE Bridged Networks, the switches simply flood multicast traffic on all ports in the LAN by default. IGMP Snooping is commonly deployed to ensure multicast traffic is not forwarded on ports without IGMP receivers. The procedures and recommendations for IGMP Snooping are defined in [IGMP-SNOOP]. But when any protocol other than IGMP is used, the common practice is to simply flood multicast traffic to all ports. PIM-SM, PIM-SSM, PIM-BIDIR are widely deployed routing protocols. PIM Snooping procedures are important to restrict multicast traffic to only the routers interested in receiving such traffic. [Page 1] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 While most of the PIM Snooping procedures defined here also apply to IEEE Bridged Networks, VPLS demands certain special procedures due to the split-horizon rules that require the Provider Edge (PE) devices to co-operate. This document describes the procedures and recommendations for PIM-Snooping in VPLS to facilitate replication to only those ports behind which there are interested PIM routers and/or IGMP hosts. This document also describes procedures for PIM Proxy. PIM Proxy is required on PEs for VPLS Multicast to work correctly when Join suppression is enabled in the VPLS. PIM Proxy also helps scale VPLS Multicast much better than just PIM Snooping. Conventions used in this document The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC 2119]. Table of Contents 1. Introduction .............................................3 1.1. Assumptions...............................................4 1.2. PIM Snooping and PIM Proxy Complexity.....................4 1.3. Definitions...............................................5 2. Multicast Traffic over VPLS...............................6 2.1. Constraining of IP Multicast in a VPLS....................6 2.2. IPv6 Considerations.......................................7 2.3. PIM-SM (*,*,RP) Considerations............................7 2.4. PIM Packet Types to Snoop.................................8 2.5. PIM Snooping vs PIM Proxy.................................8 2.5.1. Differences between PIM Snooping and PIM Proxy............9 2.5.2. PIM Control Message Latency...............................9 2.5.3. When to Snoop and When to Proxy......................... 10 3. PIM Snooping for VPLS................................... 10 3.1. General Rules for PIM Snooping in VPLS.................. 11 3.1.1. Snooping PIM Packets ................................... 11 3.2. Discovering PIM Routers................................. 11 3.3. PIM-SM and PIM-SSM...................................... 12 3.3.1. Building PIM-SM Snooping States......................... 13 3.3.2. Explanation for per (S,G,N) states...................... 15 3.3.3. Receiving (*,G) PIM-SM Join/Prune Messages.............. 15 3.3.4. Receiving (S,G) PIM-SM Join/Prune Messages.............. 18 3.3.5. Receiving (S,G,rpt) Join/Prune Messages................. 19 3.3.6. Sending (*,G) Join/Prune Messages....................... 19 3.3.7. Sending (S,G) Join/Prune Messages....................... 20 3.3.8. Sending PIM Join/Prune message upstream................. 20 [Page 2] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 3.3.8.1. Sending Triggered vs Refresh Join/Prune messages....... 20 3.3.9. Triggering ASSERT Election in PIM-SM.................... 21 3.4. Bidirectional-PIM (PIM-BIDIR)........................... 21 3.4.1. Building PIM-BIDIR Snooping States...................... 22 3.5. PIM-DM.................................................. 22 3.5.1. Building PIM-DM Snooping States......................... 23 3.5.2. PIM-DM Downstream Per-Port PIM(S,G,N) State Machine .... 23 3.5.3. Triggering ASSERT election in PIM-DM.................... 23 3.6. PIM Proxy............................................... 24 3.6.1. Downstream PIM Proxy behavior........................... 24 3.6.2. Upstream PIM Proxy behavior............................. 25 3.6.3. Source IP Address in Proxy PIM Join/Prune Packets....... 25 3.7. Directly Connected Multicast Source .................... 25 3.8. Data Forwarding Rules................................... 26 3.8.1. PIM-SM Data Forwarding Rules ........................... 26 3.8.2. PIM-BIDIR Data Forwarding Rules......................... 27 3.8.3. PIM-DM Data Forwarding Rules............................ 28 4. IANA Considerations..................................... 28 5. Security Considerations................................. 29 6. Acknowledgements........................................ 29 7. References.............................................. 29 7.1. Normative References ................................... 29 7.2. Informative References.................................. 29 Appendix A. Example Network Scenario............................ 31 Appendix A.1 PIM-Snooping Example............................... 31 Appendix A.2 PIM Proxy Example with (S,G) / (*,G) interaction... 33 1. Introduction In Virtual Private LAN Service (VPLS), the Provider Edge (PE) devices provide a logical interconnect such that Customer Edge (CE) devices belonging to a specific VPLS instance appear to be connected by a single LAN. Forwarding information base for particular VPLS instance is populated dynamically by source MAC address learning. This is a straightforward solution to support unicast traffic, with reasonable flooding for unicast unknown traffic. Since a VPLS provides LAN emulation for IEEE bridges as wells as for routers, the unicast and multicast traffic need to follow the same path for layer-2 protocols to work properly. As such, multicast traffic is treated as broadcast traffic and is flooded to every site in the VPLS instance. VPLS solutions (i.e., [VPLS-LDP] and [VPLS-BGP]) perform replication for multicast traffic at the ingress PE devices. As stated in the VPLS Multicast Requirements draft [VPLS-MCAST-REQ], there are two issues with VPLS Multicast today: A. Multicast traffic is replicated to non-member sites. B. Replication of PWs on shared physical path. [Page 3] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 This document solves Issue A of [VPLS-MCAST-REQ] by ensuring that IP multicast traffic is not forwarded to non-member sites. Issue B is outside the scope of this document. The different mechanisms to tunnel IP multicast traffic in a VPLS from the ingress PE to the egress PEs are discussed in [VPLS-MCAST-TREES]. The solution in this document when combined with the solutions proposed in the working group to solve Issue B will provide a complete VPLS Multicast solution set. Using IGMP/PIM Snooping in VPLS has the following advantages: - It improves IP Multicast bandwidth usage in the VPLS core by ensuring traffic is replicated only to PEs with member sites. Note that this is not necessarily optimum, as there can still be bandwidth waste if traffic from a PE to other PE(s) is not forwarded along a minimum cost spanning tree. - It prevents sending multicast traffic to non-member sites. Procedures for IGMP Snooping are specified in [IGMP-SNOOP]. This document describes the procedures for Protocol Independent Multicast (PIM) snooping over VPLS for efficient distribution of IP multicast traffic. It also describes the rules when both IGMP and PIM are active in a VPLS instance. This document also describes procedures for PIM Proxy. PIM Proxy is required on PEs for VPLS Multicast to work correctly when Join suppression is enabled in the VPLS. PIM Proxy also helps scale VPLS Multicast much better than just PIM Snooping. 1.1. Assumptions Since this draft describes the procedures for PIM Snooping and PIM Proxy, the draft assumes that the reader has a good understanding of the PIM protocols. The text in this draft is written in the same style as the PIM RFCs to help correlate the concepts and to make it easier to follow. In order to avoid replicating the text relating to PIM protocol handling here, this draft assumes that the user will infer such detail from the PIM RFC referenced in this document. Deviations in protocol handling specific to PIM Snooping and PIM Proxy are specified in this draft. There could be cross references into definitions of macros and procedures from the PIM RFCs. 1.2. PIM Snooping and PIM Proxy Complexity The PIM Snooping and PIM Proxy solutions described here requires a switch to examine and operate on only PIM Hello and PIM Join/Prune packets. The switch does not need to examine any other PIM packets. [Page 4] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 The switch does not need to have any routing tables like is required in PIM Multicast Routing. It knows how to forward Join/Prunes by looking at the Upstream Neighbor field in the Join/Prune packets. The switch does not need to know about Rendezvous Points (RP) and does not have to maintain any RP Set. All that is transparent to a PIM Snooping switch. Most of the procedures in PIM Snooping and PIM Proxy in the handling of PIM Hellos and PIM Join/Prune packets are very similar to that of a PIM Router. The solutions described here provide complete separation of control and data planes. A PIM Proxy solution minimizes the control plane messages received at CE routers by proxying one message upstream on behalf of a large number of downstream CEs. As such control plane messaging is very similar to that of a PIM Router. 1.3. Definitions There are several definitions referenced in this document that are well described in the PIM RFCs [PIM-SM, PIM-BIDIR, PIM-DM]. The following definitions and abbreviations are used throughout this document: - A port is defined as either an attachment circuit (AC) or a Pseudo-Wire (PW). - When we say a PIM message is 'received' on a port, it means any one of the following: o that a PIM Snooping switch snooped the PIM message. o that a PIM message was received via LDP on a PW if LDP (as defined in [VPLS-MCAST-LDP]) is used for propogating multicast states among the PEs. Abbreviations used in the document: - S: IP Address of the Multicast Source. - G: IP Address of the Multicast Group. - N: Upstream Neighbor field in a Join/Prune/Graft message. - Rport(N): Port on which neighbor N is learnt Other definitions are explained in the sections where they are introduced. [Page 5] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 2. Multicast Traffic over VPLS In VPLS, if a PE receives a frame from an Attachment Circuit (AC) with no matching entry in the forwarding information base for that particular VPLS instance, it floods the frame to all other PEs (which are part of this VPLS instance) and to directly connected ACs (other than the one that the frame is received from). The flooding of a frame occurs when: - The destination MAC address has not been learned, - The destination MAC address is a broadcast address, - The destination MAC address is a multicast address. Malicious attacks (e.g., receiving unknown frames constantly) aside, the first situation is handled by VPLS solutions as long as destination MAC address can be learned. After that point on, the frames will not be flooded. A PE is REQUIRED to have safeguards, such as unknown unicast limiting and MAC table limiting, against malicious unknown unicast attacks. There is no way around flooding broadcast frames. To prevent runaway broadcast traffic from adversely affecting the VPLS service and the SP network, a PE is REQUIRED to have tools to rate limit the broadcast traffic as well. Similar to broadcast frames, multicast frames are flooded as well, as a PE cannot know where multicast members reside. Rate limiting multicast traffic, while possible, should be should be done carefully since several network control protocols relies on multicast. For one thing, layer-2 and layer-3 protocols utilize multicast for their operation. For instance, Bridge Protocol Data Units (BPDUs) use an IEEE assigned all bridges multicast MAC address, and OSPF is multicast to all OSPF routers multicast MAC address. If the rate- limiting of multicast traffic is not done properly, the customer network will experience instability and poor performance. For the other, it is not straightforward to determine the right rate limiting parameters for multicast. A VPLS solution MUST NOT affect the operation of customer layer-2 protocols (e.g., BPDUs). Additionally, a VPLS solution MUST NOT affect the operation of layer-3 protocols. In the following section, we describe procedures to constrain the flooding of IP multicast traffic in a VPLS. 2.1. Constraining of IP Multicast in a VPLS For a PE in a VPLS (a layer-2 device) to constrain IP multicast traffic, it needs to be able to learn which CEs are interested in receiving multicast traffic for what flows. The most obvious solution is to snoop IP multicast control traffic at the PEs. Snooping as a solution to constrain multicast traffic makes sense under the following circumstances: [Page 6] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 - The CE-CE protocol the PEs snoop is a popular and widely deployed protocol. - It does not require any changes on the CEs and it should be completely transparent to the CEs. IGMP/MLD and PIM are the popular IP Multicast Routing protocols today. Other routing protocols such as DVMRP or MOSPF are outside the scope of this document. This document describes the guidelines for PIM Snooping and PIM Proxy in VPLS. The specifications in this document could be used for either PIM Snooping or PIM Proxy. The PIM Proxy solution is described in section 3.6 Differences that need to be observed while implementing one or the other and recommendations on which method to employ in different scenarios are noted in section 2.5We will largely refer to PIM "Snooping" in this document. Unless specifically specified, the same procedures should apply to a Proxy solution as well. In the following sub-sections, we provide some guidelines for the implementation of PIM snooping in VPLS. Snooping techniques need to be employed on ACs at the downstream PEs. Snooping techniques can also be employed on PWs at the upstream PEs. This may work well for small to medium scale deployments. However, if there are a large number of VPLS instances with a large number of PEs per instances, then the amount of snooping required at the upstream PEs can overwhelm the upstream PEs. In [VPLS-MCAST-LDP] and [VPLS-MCAST-BGP], procedures are defined to exchange multicast membership information between the PEs using LDP or BGP. Using a reliable mechanism like LDP or BGP allows the upstream PEs to eliminate the requirement to snoop on PWs. It also eliminates the need to refresh multicast states on the upstream PEs. 2.2. IPv6 Considerations In VPLS, PEs forward Ethernet frames received from CEs and as such are agnostic of the layer-3 protocol used by the CEs. However, as an IGMP and PIM snooping switch, the PE would have to look deeper into the IP and IGMP/PIM packets and build snooping state based on that. The PIM Protocol specifications handle both IPv4 and IPv6. The specification for PIM Snooping in this draft can be applied to both IPv4 and IPv6 payloads. 2.3. PIM-SM (*,*,RP) Considerations This draft does not address (*,*,RP) states in the VPLS network. Although [PIM-SM] specifies that routers MUST support (*,*,RP) states, there are very few implementations that actually support it in actual deployments. Given the complexity of supporting (*,*,RP) states and knowing that there is little to no use to supporting it, this draft omits the specification relating to (*,*,RP) support. [Page 7] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 2.4. PIM Packet Types to Snoop A PIM Snooping switch need only snoop on PIM Hellos and PIM Join/Prune packets. All other PIM packets can be transparently flooded unexamined. 2.5. PIM Snooping vs PIM Proxy PIM Snooping switches simply snoop on PIM packets as they are being forwarded in the VPLS. As such it truly provides transparent LAN services since no customer packets are modified or consumed or new packets introduced in the VPLS. It is also slightly simpler to implement than PIM Proxy. However for PIM Snooping to work correctly, it is a requirement that CE routers MUST disable Join suppression in the VPLS. Given that a large number of existing CE deployments do not support disabling of Join suppression and given the operational complexity for a provider to manage disabling of Join suppression in the VPLS, it becomes a difficult solution to deploy. Another disadvantage of PIM Snooping as a solution is that it does not scale as well as PIM Proxy. If there are a large number of CEs in a VPLS, then every CE will see every other CE's Join/Prune messages. PIM Proxy on the PEs has the advantage that it does not require Join suppression to be disabled in the VPLS. Multicast as a VPLS service can be very easily be provided without requiring any changes on the CE routers. It also helps scale VPLS Multicast very well since the PEs intelligently forward only one Join/Prune message for a given flow and only to the upstream CE. PIM Proxy as a solution however loses the transparency argument since Join/Prunes could get modified or even consumed at a PE. Also, new packets could get introduced in the VPLS. However, this loss of transparency is limited to PIM Join/Prune packets. It is in the interest of optimizing multicast in the VPLS and helping a VPLS network scale much better. Data traffic will still be completely transparent. Both PIM Snooping and PIM Proxy procedures can be used in conjunction with [VPLS-MCAST-LDP] for propogating multicast states among the PEs. If [VPLS-MCAST-LDP] is used for propogating multicast states among the PEs, then both PIM Snooping and PIM Proxy switches do not process any PIM packets arriving on a PW. [Page 8] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 2.5.1. Differences between PIM Snooping and PIM Proxy For PIM-SM and PIM-BIDIR, a PIM Snooping/Proxy Switch only needs to examine PIM Hello and Join/Prune messages. PIM Proxy for PIM-DM is for future study and is not currently specified in this draft. The proxy proposal is to perform proxy of only the Join/Prune messages. PIM Hello messages are snooped by both PIM Snooping and PIM Proxy switches. Details on the PIM Proxy solution are discussed in section 3.6 This section is presented here to say that most of the procedures to follow (unless explicitly specified) are common to both PIM Snooping and PIM Proxy. Differences between a PIM Snooping switch and a PIM Proxy switch can be summarized as the following: +------------------------------|--------------------------------+ | PIM Snooping | PIM Proxy | +==============================|================================+ | 1. PIM Snooping switches | 1. PIM Proxy switches also | | snoop Hello and Join/Prune| snoop PIM Hello messages | | messages while they are | while they are transparently| | transparently flooded in | flooded in the VPLS. But | | the VPLS. | they consume PIM Join/Prune | | | messages and do not flood | | | them as is in the VPLS. | +------------------------------|--------------------------------+ | 2. PIM Snooping switches do | 2. PIM Proxy switches may | | not originate any PIM | originate new or modified | | packets. They may however | PIM Hello and Join/Prune | | originate PIM messages to | packets. | | be sent via LDP on PWs. | | +------------------------------|--------------------------------+ Other than the above simple differences, most of the procedures are common to PIM Snooping and PIM Proxy. There are additional simplifications to PIM Snooping that can be made if [VPLS-MCAST-LDP] is not used for PE-PE communication, but otherwise the procedures for PIM Snooping and PIM Proxy are mostly the same. In the text to follow, we describe the procedures for PIM "Snooping". Unless specifically stated otherwise, such procedures apply to PIM Proxy as well. 2.5.2. PIM Control Message Latency A PIM Snooping or PIM Proxy switch snoops on PIM Hello packets while transparently flooding it in the VPLS. As such there is no latency [Page 9] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 introduced by the VPLS in the delivery of PIM Hello packets to remote CEs in the VPLS. A PIM Proxy switch consumes PIM Join/Prune packets and generates proxy Join/Prune packets to be sent upstream. This can result in additional latency for a downstream CE to receive multicast traffic after it has sent a Join. When a downstream CE prunes a multicast stream, the traffic should stop flowing to the CE with no additional latency introduced by the VPLS. A PIM Snooping switch snoops on PIM Join/Prune packets while transparently flooding them in the VPLS. There is no latency introduced by the VPLS in the delivery of PIM Join/Prune packets when PIM Snooping is employed. 2.5.3. When to Snoop and When to Proxy Explicit Tracking (ET) is enabled in a VPLS when all PIM CE Routers in the VPLS advertise Tracking Support in their PIM Hello messages. If even one does not advertise Tracking Support, then all PIM CE routers disable ET in the VPLS. When ET is enabled, it implies that Join Suppression is disabled and vice versa. PIM Snooping PEs can determine if ET is enabled or disabled in a VPLS by examining PIM Hellos. If ET is disabled, PIM Proxy MUST be used. If ET is enabled, PIM Snooping SHOULD be used. 3. PIM Snooping for VPLS IGMP snooping procedures described in [IGMP-SNOOP] provide efficient delivery of IP multicast traffic in a given VPLS service when end stations are connected to the VPLS. However, when VPLS is offered as a WAN service it is likely that the CE devices are routers and would run PIM between them. To provide efficient IP multicasting in such cases, it is necessary that the PE routers offering the VPLS service do PIM snooping. PIM is a multicast routing protocol, which runs exclusively between routers. PIM shares many of the common characteristics of a routing protocol, such as discovery messages (e.g., neighbor discovery using Hello messages), topology information (e.g., multicast tree), and error detection and notification (e.g., dead timer and designated router election). On the other hand, PIM does not participate in any kind of exchange of databases, as it uses the unicast routing table to provide reverse path information for building multicast trees. There are a few variants of PIM. In PIM-DM ([PIM-DM]), multicast data is pushed towards the members similar to broadcast mechanism. PIM-DM constructs a separate delivery tree for each multicast group. As opposed to PIM-DM, other PIM flavors (PIM-SM [PIM-SM], PIM-SSM [PIM-SSM], and PIM-BIDIR [PIM-BIDIR]) invoke a pull methodology instead of push technique. [Page 10] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 PIM routers periodically exchange Hello messages to discover and maintain stateful sessions with neighbors. After neighbors are discovered, PIM routers can signal their intentions to join or prune specific multicast groups. This is accomplished by having downstream routers send an explicit Join/Prune message (for the sake of generalization, consider Graft messages for PIM-DM as Join messages) to the upstream routers. The Join/Prune message can be group specific (*,G) or group and source specific (S,G). In PIM snooping, a PE snoops on the PIM message exchanged between routers, and builds its multicast states. Based on the multicast states, it forwards IP multicast traffic accordingly to avoid unnecessary flooding. In the following sub-sections, snooping mechanisms for each variety of PIM are specified. 3.1. General Rules for PIM Snooping in VPLS The following rules for the correct operation of PIM snooping MUST be followed. - PIM messages and multicast data traffic forwarded by PEs MUST follow the split-horizon rule for mesh PWs as defined in [VPLS-LDP]. - PIM snooping states in a PE MUST be per VPLS instance. - Multicast traffic MUST be replicated per PW and AC basis, i.e., even if there are more than one PIM neighbor behind a PW/AC, only one replication MUST be sent to that PW/AC. 3.1.1. Snooping PIM Packets PIM-SM and PIM-BIDIR snooping PEs need to snoop on just the PIM Hello and PIM Join/Prune messages to build its multicast states. - PIM-DM snooping PEs have to also snoop on PIM Graft and PIM State Refresh messages. 3.2. Discovering PIM Routers A PIM Snooping PE MUST snoop on PIM Hellos received on ACs and PWs. i.e. the PE transparently floods the PIM Hello while snooping on it. PIM Hellos are used by the snooping switch to discover PIM routers and their characteristics. For each neighbor discovered by a PE, it includes an entry in the PIM Neighbor Database with the following fields: [Page 11] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 - Layer 2 encapsulation for the Router sending the PIM Hello. - IP Address and address family of the Router sending the PIM Hello. - Port (AC / PW) on which the PIM Hello was received. - Hello TLVs The PE should be able to interpret and act on Hello TLVs currently defined in the PIM RFCs. The TLVs of particular interest in this document are: - Hello-Hold-Time - Tracking Support - DR Priority Please refer to [PIM-SM] for a list of the Hello TLVs. When a PIM Hello is received, the PE MUST reset the neighbor-expiry- timer to Hello-Hold-Time. If a PE does not receive a Hello message from a router within Hello-Hold-Time, the PE MUST remove that neighbor from its PIM Neighbor Database. If a PE receives a Hello message from a router with Hello-Hold-Time value set to zero, the PE MUST remove that router from the PIM snooping state immediately. From the PIM Neighbor Database, a PE MUST be able to use the procedures defined in [PIM-SM] to identify the PIM Designated Router in the VPLS instance. It should also be able to determine if Tracking Support is active in the VPLS instance. 3.3. PIM-SM and PIM-SSM The key characteristic of PIM-SM and PIM-SSM is explicit join behavior. In this model, multicast traffic is only forwarded to locations that specifically request it. The root node of a tree is the Rendezvous Point (RP) in case of a shared tree (PIM-SM only) or the first hop router that is directly connected to the multicast source in the case of a shortest path tree. All the procedures described in this section apply to both PIM-SM and PIM-SSM, except for the fact that there is no (*,G) state in PIM-SSM. The procedures to discover PIM-SM routers in a VPLS instance are as described in section 3.2 [Page 12] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 3.3.1. Building PIM-SM Snooping States PIM-SM and PIM-SSM Snooping states are built by snooping on the PIM- SM Join/Prune messages received on AC/PWs. The downstream state machine of a PIM-SM snooping switch very closely resembles the downstream state machine of PIM-SM routers. The downstream state consists of: Per downstream (Port, *, G): - DownstreamJPState: One of { "NoInfo" (NI), "Join" (J), "Prune Pending" (PP) } Per downstream (Port, *, G, N): - Prune Pending Timer (PPT(N)) - Join Expiry Timer (ET(N)) Per downstream (Port, S, G): - DownstreamJPState: One of { "NoInfo" (NI), "Join" (J), "Prune Pending" (PP) } Per downstream (Port, S, G, N): - Prune Pending Timer (PPT(N)) - Join Expiry Timer (ET(N)) Per downstream (Port, S, G, rpt): - DownstreamJPRptState: One of { "NoInfo" (NI), "Pruned" (P), "Prune Pending" (PP) } Per downstream (Port, S, G, rpt, N): - Prune Pending Timer (PPT(N)) - Join Expiry Timer (ET(N)) Where S is the address of the multicast source, G is the Group address and N is the upstream neighbor field in the Join/Prune message. Notice that unlike on PIM-SM routers where PPT and ET are per (Interface, S, G), PIM Snooping switches have to maintain PPT and ET per (Port, S, G, N). The reasons for this are explained in section 3.3.2 Apart from the above states, we define the following state summarization macros. UpstreamNeighbors(*,G): If there is one or more Join(*,G) received on any port with upstream neighbor N and ET(N) is active, then N is added to UpstreamNeighbors(*,G). This set is used to determine if a Join(*,G) or a Prune(*,G) with upstream neighbor N needs to be sent upstream. [Page 13] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 UpstreamNeighbors(S,G): If there is one or more Join(S,G) received on any port with upstream neighbor N and ET(N) is active, then N is added to UpstreamNeighbors(S,G). This set is used to determine if a Join(S,G) or a Prune(S,G) with upstream neighbor N needs to be sent upstream. UpstreamPorts(*,G): This is the set of all Rport(N) ports where N is in the set UpstreamNeighbors(*,G). Multicast Streams forwarded using a (*,G) match MUST be forwarded to these ports in addition to downstream ports. So UpstreamPorts(*,G) MUST be added to OutgoingPortList(*,G). UpstreamPorts(S,G): This is the set of all Rport(N) ports where N is in the set UpstreamNeighbors(S,G). UpstreamPorts(S,G) MUST be added to OutgoingPortList(S,G). UpstreamPorts(S,G,rpt): If PruneDesired(S,G,rpt) becomes true, then this set is set to UpstreamPorts(*,G). Otherwise, this set is empty. UpstreamPorts(*,G) (-) UpstreamPorts(S,G,rpt) MUST be added to OutgoingPortList(S,G). See section 3.8.1 on Data Forwarding Rules for the specification on how OutgoingPortList(S,G) is calculated. UpstreamPorts(G): This set is the union of all the UpstreamPorts(S,G) and UpstreamPorts(*,G) for a given G. Proxy (S,G) Join/Prune and (*,G) Join/Prune messages MUST be sent to a subset of UpstreamPorts(G) as specified in section 3.3.8. PWPorts: This is the set of all PWs. OutgoingPortList(*,G): This is the set of all ports to which traffic needs to be forwarded on a (*,G) match. Split Horizon rules apply as noted in section 3.8 OutgoingPortList(S,G): This is the set of all ports to which traffic needs to be forwarded on an (S,G) match. Split Horizon rules apply as noted in section 3.8 NumETsActive(Port,*,G): Number of (Port,*,G,N) entries that have Expiry Timer running. This macro keeps track of the number of Join(*,G)s that are received on this Port with different upstream neighbors. NumETsActive(Port,S,G): Number of (Port,S,G,N) entries that have Expiry Timer running. This macro keeps track of the number of Join(*,G)s that are received on this Port with different upstream neighbors. RpfVectorTlvs(*,G): RPF Vectors [RPF-VECTOR] are TLVs that may be present in received Join(*,G) messages. If present, they must be copied to RpfVectorTlvs(*,G). [Page 14] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 RpfVectorTlvs(S,G): RPF Vectors [RPF-VECTOR] are TLVs that may be present in received Join(S,G) messages. If present, they must be copied to RpfVectorTlvs(S,G). Since there are a few differences between the downstream state machines of PIM-SM Routers and PIM-SM snooping switches, we specify the details of the downstream state machine of PIM-SM snooping switches at the risk of repeating most of the text documented in [PIM-SM]. 3.3.2. Explanation for per (S,G,N) states In PIM Routing protocols, states are built per (S,G). On a router, an (S,G) has only one RPF-Neighbor. However, a PIM Snooping switch does not have the Layer 3 routing information available to the routers in order to determine the RPF-Neighbor for a multicast flow. It merely discovers it by snooping the Join/Prune message. A PE could have snooped on two or more different Join/Prune messages for the same (S,G) that could have carried different Upstream-Neighbor fields. This could happen during transient network conditions or due to dual- homed sources. A PE cannot make assumptions on which one to pick, but instead must facilitate the CE routers decide which Upstream Neighbor gets elected the RPF-Neighbor. And for this purpose, the PE will have to track downstream and upstream Join/Prune states per (S,G,N). 3.3.3. Receiving (*,G) PIM-SM Join/Prune Messages A Join(*,G) or Prune(*,G) is "received" when the port on which it was received is not also the port on which the upstream-neighbor N of the Join/Prune(*,G) was learnt. When a router receives a Join(*,G) or a Prune(*,G) with upstream neighbor N, it must process the message as defined in the state machine below. Note that the macro computations of the various macros resulting from this state machine transition is exactly as specified in the PIM-SM RFC [PIM-SM]. [Page 15] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 We define the following per-port (*,G,N) macro to help with the state machine below. Figure 1: Downstream per-port (*,G) state machine in tabular form +---------------++----------------------------------------+ | || Previous State | | ++------------+--------------+------------+ | Event ||NoInfo (NI) | Join (J) | Prune-Pend | +---------------++------------+--------------+------------+ | Receive ||-> J state | -> J state | -> J state | | Join(*,G) || Action | Action | Action | | || RxJoin(N) | RxJoin(N) | RxJoin(N) | +---------------++------------+--------------+------------+ |Receive || - | -> PP state | -> PP state| |Prune(*,G) and || | Start PPT(N) | | |NumETsActive<=1|| | | | +---------------++------------+--------------+------------+ |Receive || - | -> J state | - | |Prune(*,G) and || | Start PPT(N) | | NumETsActive>1 || | | | +---------------++------------+--------------+------------+ |PPT(N) expires || - | -> J state | -> NI state| | || | Action | Action | | || | PPTExpiry(N) |PPTExpiry(N)| +---------------++------------+--------------+------------+ |ET(N) expires || - | -> NI state | -> NI state| |and || | Action | Action | |NumETsActive<=1|| | ETExpiry(N) | ETExpiry(N)| +---------------++------------+--------------+------------+ |ET(N) expires || - | -> J state | -> NI state| |and || | Action | Action | |NumETsActive>1 || | ETExpiry(N) | ETExpiry(N)| +---------------++------------+--------------+------------+ Action RxJoin(N): If ET(N) is not already running, then start ET(N). Otherwise restart ET(N). If N is not already in UpstreamNeighbors(*,G), then add N to UpstreamNeighbors(*,G) and trigger a Join(*,G) with upstream neighbor N to be forwarded upstream as specified in section 3.3.8 Record N as RPF_Neighbor(*,G). If there are RPF Vector TLVs in the received (S,G) message and if they are different from the recorded RpfVectorTlvs(S,G), then copy them into RpfVectorTlvs(S,G). Also trigger a Join(S,G) with upstream neighbor N to be forwarded upstream as specified in section 3.3.8 [Page 16] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 Action PPTExpiry(N): Disable timers ET(N) and PPT(N). If there are no other (Port,*,G) states with NumETsActive(Port,*,G) > 0, then trigger a Prune(*,G) with upstream neighbor N to be forwarded upstream as specified in section 3.3.8 Then delete N from UpstreamNeighbors(*,G). Send a Prune-Echo(*,G) with upstream-neighbor N on the downstream port. Action ETExpiry(N): Disable timers ET(N) and PPT(N). If there are no other (Port,*,G) states with NumETsActive(Port,*,G) > 0, then trigger a Prune(*,G) with upstream neighbor N to be forwarded upstream as specified in section 3.3.8 Then delete N from UpstreamNeighbors(*,G). [Page 17] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 3.3.4. Receiving (S,G) PIM-SM Join/Prune Messages A Join(S,G) or Prune(S,G) is "received" when the port on which it was received is not also the port on which the upstream-neighbor N of the Join/Prune(S,G) was learnt. When a router receives a Join(S,G) or a Prune(S,G) with upstream neighbor N, it must process the message as defined in the state machine below. Note that the macro computations of the various macros resulting from this state machine transition is exactly as specified in the PIM-SM RFC [PIM-SM]. Figure 2: Downstream per-port (S,G) state machine in tabular form +---------------++----------------------------------------+ | || Previous State | | ++------------+--------------+------------+ | Event ||NoInfo (NI) | Join (J) | Prune-Pend | +---------------++------------+--------------+------------+ | Receive ||-> J state | -> J state | -> J state | | Join(S,G) || Action | Action | Action | | || RxJoin(N) | RxJoin(N) | RxJoin(N) | +---------------++------------+--------------+------------+ |Receive || - | -> PP state | -> PP state| |Prune (S,G) and|| | Start PPT(N) | | |NumETsActive<=1|| | | | +---------------++------------+--------------+------------+ |Receive || - | -> J state | - | |Prune(S,G) and || | Start PPT(N) | | NumETsActive>1 || | | | +---------------++------------+--------------+------------+ |PPT(N) expires || - | -> J state | -> NI state| | || | Action | Action | | || | PPTExpiry(N) |PPTExpiry(N)| +---------------++------------+--------------+------------+ |ET(N) expires || - | -> NI state | -> NI state| |and || | Action | Action | |NumETsActive<=1|| | ETExpiry(N) | ETExpiry(N)| +---------------++------------+--------------+------------+ |ET(N) expires || - | -> J state | -> NI state| |and || | Action | Action | |NumETsActive>1 || | ETExpiry(N) | ETExpiry(N)| +---------------++------------+--------------+------------+ Action RxJoin(N): If ET(N) is not already running, then start ET(N). Otherwise, restart ET(N). [Page 18] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 If N is not already in UpstreamNeighbors(S,G), then add N to UpstreamNeighbors(S,G) and trigger a Join(S,G) with upstream neighbor N to be forwarded upstream as specified in section 3.3.8 Record N as RPF_Neighbor(S,G). If there are RPF Vector TLVs in the received (S,G) message and if they are different from the recorded RpfVectorTlvs(S,G), then copy them into RpfVectorTlvs(S,G). Also trigger a Join(S,G) with upstream neighbor N to be forwarded upstream as specified in section 3.3.8 Action PPTExpiry(N): Disable timers ET(N) and PPT(N). If there are no other (Port,S,G) states with NumETsActive(Port,S,G) > 0, then trigger a Prune(S,G) with upstream neighbor N to be forwarded upstream as specified in section 3.3.8Then delete N from UpstreamNeighbors(S,G). Send a Prune-Echo(S,G) with upstream-neighbor N on the downstream port. Action ETExpiry(N): Disable timers ET(N) and PPT(N). If there are no other (Port,S,G) states with NumETsActive(Port,S,G) > 0, then trigger a Prune(S,G) with upstream neighbor N to be forwarded upstream as specified in section 3.3.8 Then delete N from UpstreamNeighbors(S,G). 3.3.5. Receiving (S,G,rpt) Join/Prune Messages A Join(S,G,rpt) or Prune(S,G,rpt) is "received" when the port on which it was received is not also the port on which the upstream-neighbor N of the Join/Prune(S,G,rpt) was learnt. While it is important to ensure that the (S,G) and (*,G) state machines allow for handling per (S,G,N) states, it is not as important for (S,G,rpt) states. It suffices to say that the downstream (S,G,rpt) state machine is the same as what is defined in section 4.5.4 of the PIM-SM RFC [PIM-SM]. 3.3.6. Sending (*,G) Join/Prune Messages A PIM Proxy PE MUST implement the Upstream (*,G) state machine for which the procedures are similar to what is defined in section 4.5.6 of [PIM-SM]. Section 3.3.8of this draft specifies how the message should be sent. For the purposes of the Upstream (*,G) state machine, a Join(*,G) or Prune(*,G) message with upstream neighbor N is "seen" on a PIM Snooping switch if the port on which the message was received is also the port on which the upstream neighbor N was learnt. [Page 19] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 3.3.7. Sending (S,G) Join/Prune Messages A PIM Proxy PE MUST implement the Upstream (S,G) state machine for which the procedures are similar to what is defined in section 4.5.6 of [PIM-SM]. Section 3.3.8of this draft specifies how the message should be sent. For the purposes of the Upstream (S,G) state machine, a Join(*,G) or Prune(*,G) message with upstream neighbor N is "seen" on a PIM Snooping switch if the port on which the message was received is also the port on which the upstream neighbor N was learnt. 3.3.8. Sending PIM Join/Prune message upstream. Sending of PIM Join/Prune messages upstream is only required on a PIM Proxy Switch and not on a PIM Snooping Switch. This section applies only to a PIM Proxy Switch. The downstream Join/Prune state machines above describe when PIM Join/Prune packets must be forwarded upstream and with what upstream neighbor field. In order to correctly facilitate assert among the CE routers, such Join/Prunes need to sent not only towards the upstream neighbor, but also on certain PWs as described below. It is important to note that Join/Prune packets are sent to a subset of the ports in UpstreamPorts(G) and is not simply flooded to all PWs. If RpfVectorTlvs(*,G) is not empty, then it must be encoded in a Join(*,G) message sent upstream. If RpfVectorTlvs(S,G) is not empty, then it must be encoded in a Join(S,G) message sent upstream. 3.3.8.1. Sending Triggered vs Refresh Join/Prune messages A Join is a refresh join if it is being sent as a result of upstream join timer expiry. If the join is being sent because there was a change in the downstream join/prune state machine, then it is a triggered join. If the Join/Prune message being sent out is a refresh Join(*,G) message, then send the refresh Join(*,G) on all ports in UpstreamPorts(G). The Upstream Neighbor field should be the recorded RPF_Neighbor(*,G). If the Join/Prune message being sent out is a refresh Join(S,G) message, then send the refresh Join(S,G) on all ports in UpstreamPorts(G). The Upstream Neighbor field should be the recorded RPF_Neighbor(S,G). If the Join/Prune message being sent out is a triggered Join/Prune message (due to an event in the downstream Join/Prune state machine), then the following rules apply. These rules apply to both (S,G) and (*,G) Join/Prune messages to be sent out: [Page 20] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 - The upstream neighbor field N in the Join/Prune to be sent is dictated by the downstream Join/Prune state machine transition. - If the downstream Join/Prune event was on an AC port, then send the upstream Join/Prune message to all PWs in UpstreamPorts(G). Send the Join/Prune message to Rport(N) also. - If the downstream Join/Prune event was on a PW port and if Rport(N) is a PW, then silently discard the Join/Prune message without sending it. If Rport(N) is an AC, then send the Join/Prune message on that AC. 3.3.9. Triggering ASSERT Election in PIM-SM In PIM-SM, there are scenarios where multiple routers could be forwarding the same multicast traffic on a LAN. When this happens, using PIM Assert Election process by sending PIM Assert Messages, routers ensure that only the Assert Winner forwards traffic on the LAN. In a typical LAN, the Assert Election is a data driven event and happens only if a router sees traffic on the interface to which it should be forwarding the traffic. Therefore, in the case of VPLS, in order to trigger Assert Election and stop duplicate traffic, it is necessary that two routers that are forwarding duplicate traffic for an (S,G)/(*,G) see each other's traffic. PIM Snooping switches must hence ensure that they not only forward multicast traffic for an (S,G) on the ports on which they snooped Joins(S,G)/Joins(*,G), but also on the ports on which such Joins were forwarded (i.e. towards the upstream neighbor(s)). So if two or more Joins(S,G) each carrying a different upstream neighbor field were snooped at a PE, then the ports on which such Joins were snooped along with the ports on which the upstream neighbors were learnt must be added to the outgoing port list. The above logic needs to be facilitated without breaking VPLS Split Horizon Rules. i.e. traffic should not be forwarded on the port on which it was received. And traffic arriving on a PW MUST NOT be forwarded onto other PW(s). The rules specified above in calculating the outgoing port list ensures this. An example network scenario is discussed in Appendix A with possible ASSERT Election scenarios. 3.4. Bidirectional-PIM (PIM-BIDIR) PIM-BIDIR is a variation of PIM-SM. The main differences between PIM-SM and Bidirectional-PIM are as follows: [Page 21] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 - There are no source-based trees, and source-specific multicast is not supported (i.e., no (S,G) states) in PIM- BIDIR. - Multicast traffic can flow up the shared tree in PIM-BIDIR. - To avoid forwarding loops, one router on each link is elected as the Designated Forwarder (DF) for each RP in PIM-BIDIR. The main advantage of PIM-BIDIR is that it scales well for many-to- many applications. However, the lack of source-based trees means that multicast traffic is forced to remain on the shared tree. The procedures to discover PIM-SM routers in a VPLS instance are as described in section 3.2 For PIM-BIDIR to work properly, all routers within the domain must know the address of the RP. During RP discovery time, PIM routers elect DF per subnet for each RP. The algorithm to elect the DF is as follows: all PIM neighbors in a subnet advertise their unicast route to elect the RP and the router with the best route is elected. Snooping for PIM-BIDIR is much simpler than it is for PIM-SM. The complexity resulting from various combinations of (S,G), (*,G), IGMP and assert states makes PIM-SM procedures fairly complex. PIM-BIDIR has none of those issues since PIM-BIDIR builds only (*,G) states and all routers on a LAN agree on who the upstream neighbor, i.e. DF(RP) is. So the snooping procedures for PIM-BIDIR is very much like that on a PIM-BIDIR router [PIM-BIDIR]. 3.4.1. Building PIM-BIDIR Snooping States The PEs MUST snoop on PIM Hello and PIM-BIDIR Join/Prune packets and build states as described in [PIM-BIDIR]. The PEs SHOULD simply flood all other PIM packet types without examining them. PIM Proxy Rules specified in section 2.5can be applied to PIM-BIDR also. Only additional requirement is that if the Upstream Port of a PIM-BIDIR group is a PW, then the proxy PIM Join/Prune packet MUST be sent on all PWs. 3.5. PIM-DM The characteristics of PIM-DM is flood and prune behavior. Shortest path trees are built as a multicast source starts transmitting. The procedures to discover PIM-DM routers are as explained in section 3.2 [Page 22] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 3.5.1. Building PIM-DM Snooping States PIM-DM Snooping states are built by snooping on the PIM-DM Join, Prune, Graft and State Refresh messages received on AC/PWs and State- Refresh Messages sent on AC/PWs. By snooping on these PIM-DM messages, a PE builds the following states per (S,G,N) where S is the address of the multicast source, G is the Group address and N is the upstream neighbor to which Prunes/Grafts are sent by downstream CEs: Per PIM (S,G,N): Per Port PIM (S,G,N) Prune State: - DownstreamPState(S,G,N,Port): One of {"NoInfo" (NI), "Pruned" (P), "PrunePending" (PP)} - Prune Pending Timer (PPT) - Prune Timer (PT) - Upstream Port (valid if the PIM(S,G,N) Prune State is "Pruned"). 3.5.2. PIM-DM Downstream Per-Port PIM(S,G,N) State Machine The downstream per-port PIM(S,G,N) state machine is as defined in section 4.4.2 of [PIM-DM] with a few changes relevant to PIM Snooping. When reading section 4.4.2 of [PIM-DM] for the purposes of PIM-Snooping please be aware that the downstream states are built per (S, G, N, Downstream-Port} in PIM-Snooping and not per {Downstream- Interface, S, G} as in a PIM-DM router. As noted in the previous section 3.5.1, the states (DownstreamPState) and timers (PPT and PT) are per (S,G,N,P). 3.5.3. Triggering ASSERT election in PIM-DM Since PIM-DM is a flood-and-prune protocol, traffic is flooded to all routers unless explicitly pruned. Since PIM-DM routers do not prune on non-RPF interfaces, PEs should typically not receive Prunes on Rport(RPF-neighbor). So the asserting routers should typically be in pim_oiflist(S,G). In most cases, assert election should occur naturally without any special handling since data traffic will be forwarded to the asserting routers. However, there are some scenarios where a prune might be received on a port which is also an upstream port (UP). If we prune the port from pim_oiflist(S,G), then it would not be possible for the asserting routers to determine if traffic arrived on their downstream port. This can be fixed by adding pim_iifs(S,G) to pim_oiflist(S,G) so that data traffic flows to the UP ports. [Page 23] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 3.6. PIM Proxy As noted earlier in section 2.5, PIM Snooping will work correctly only if Join Suppression is disabled in the VPLS. If Join Suppression is enabled in the VPLS, then PEs MUST do PIM Proxy for VPLS Multicast to work correctly. A PIM Proxy switch behaves like a PIM Router by doing most of the functionality of a PIM Router. The complexity however is much lesser on a switch since many of the issues that a PIM Router has to deal with are not relevant on a switch. A PIM Router needs to be able to build and maintain RP-Sets. They also have to deal with the Register and Assert State Machines. There are other complexities for a PIM Router resulting from inter-domain multicast. A PIM Snooping or PIM Proxy switch can be agnostic of all of this. All that a PIM Proxy switch cares about is building multicast states using PIM Hellos and PIM Join/Prune message. As such it's complexity is greatly reduced. Other than the procedures defined here, the rest of the procedures that apply to PIM Snooping apply to PIM Proxy as well. 3.6.1. Downstream PIM Proxy behavior A PIM-SM or PIM-BIDIR Proxy PE is interested in the Hello and Join/Prune messages. The proposed PIM Proxy solution for PIM-SM and PIM-BIDIR is to proxy only Join/Prune messages. PIM Proxy for PIM-DM is for future study. PIM Hellos MUST be snooped while being flooded in the VPLS. i.e. PIM Hellos MUST NOT be consumed at a PE and regenerated. PIM Join/Prune messages arriving at an AC MUST be consumed. If [VPLS- MCAST-LDP] is not used to distribute multicast states among the PEs, then PIM Join/Prune messages arriving at a PW MUST also be consumed. All other PIM packet types are flooded in the VPLS without needing observation. Performing only proxy of Join/Prune messages keeps the switch behavior very similar to that of a PIM router without introducing too much additional complexity. It keeps the PIM Proxy solution fairly simple. Since Join/Prunes are forwarded by a PE along the slow-path and all other PIM packet types are forwarded along the fast-path, it is very likely that packets forwarded along the fast-path will arrive "ahead" of Join/Prune packets at a CE router (note the stress on the fact that fast-path messages will never arrive after Join/Prunes). Of particular importance are Hello packets sent along the fast-path. We can construct a variety of scenarios resulting in out of order delivery of Hellos and Join/Prune messages. However, there should be [Page 24] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 no deviation from normal expected behavior observed at the CE router receiving these messages out of order. The other option for a PIM Proxy solution is to proxy both Hello and Join/Prune messages that a PE is interested in building states for. If Hellos are being proxied, then it becomes necessary that the PE proxy all other PIM packet types also. Because if Hellos are received after other packet types are received at a CE router, then bad things will happen. That means every PIM packet has to be sent along the slow-path. This greatly increases the complexity on the CE router, it is very compute intensive and does not scale well. Also, proxying Hellos will result in added latency to delivery of Hello messages to a CE and that affects multicast convergence in the VPLS. 3.6.2. Upstream PIM Proxy behavior Since a PIM Proxy switch consumes Join/Prune messages, it must also originate PIM Join/Prune messages to be sent upstream. If [VPLS- MCAST-LDP] is employed, then triggered Join/Prune messages are sent via LDP to forward PIM Join/Prunes on PWs. Join/Prune messages need not be refreshed on PWs when [VPLS-MCAST-LDP] is employed. On ACs, both triggered and refresh Join/Prunes are forwarded as PIM packets. 3.6.3. Source IP Address in Proxy PIM Join/Prune Packets The source IP address in PIM packets sent upstream SHOULD be the address of a PIM neighbor in the VPLS. The address picked MUST NOT be the upstream neighbor field to be encoded in the packet. The layer 2 encapsulation for the selected source IP address MUST be the encapsulation recorded in the PIM Neighbor database for that IP address. If Explicit Tracking (ET) is disabled in the VPLS, then it does not matter what Source IP Address is picked in the packets sent upstream as long as we adhere to the rule in the previous paragraph. If ET is enabled, it means that a CE router is interested in tracking every CE that wishes to join a stream. If a PE determines that ET is enabled, then it SHOULD use PIM Snooping procedures instead of PIM Proxy. 3.7. Directly Connected Multicast Source If there is a source in the CE network that connects directly into the VPLS instance, then multicast traffic from that source MUST be sent to all PIM routers on the VPLS instance apart from the igmp receivers in the VPLS. If there is already (S,G) or (*,G) snooping state that is formed on any PE, this will not happen per the current forwarding rules and guidelines. So, in order to determine if traffic needs to be flooded to all routers, a PE must be able to determine if the traffic came from a host on that LAN. There are three ways to address this problem: [Page 25] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 - The PE would have to do ARP snooping to determine if a source is directly connected. - Another option is to have configuration on all PEs to say there are CE sources that are directly connected to the VPLS instance and disallow snooping for the groups for which the source is going to send traffic. This way traffic from that source to those groups will always be flooded within the provider network. - A third option is to require that sources of CE multicast routers must appear behind a router. 3.8. Data Forwarding Rules First we define the rules that are common to PIM-SM, PIM-BIDIR and PIM-DM PEs. Forwarding rules for each protocol type is specified in the sub-sections. If there is no matching forwarding state, then the PE MAY either discard the packet or send it towards all the snooped PIM CE routers or to a configured set of ports. How this is determined is outside the scope of this document. The following rules MUST be followed when forwarding multicast traffic in a VPLS: - Traffic arriving on a port MUST NOT be forwarded back onto the same port. - Due to VPLS Split-Horizon rules, traffic ingressing on a PW MUST NOT be forwarded to any other PW. 3.8.1. PIM-SM Data Forwarding Rules Per the rules in [PIM-SM] and per the additional rules specified in this document, OutgoingPortList(*,G) = inherited_olist(*,G) (+) UpstreamPorts(*,G) (+) Rport(PimDR) OutgoingPortList(S,G) = inherited_olist(S,G) (+) UpstreamPorts(S,G) (+) (UpstreamPorts(*,G) (-) UpstreamPorts(S,G,rpt)) (+) Rport(PimDR) [PIM-SM] specifies how inherited_olist(*,G) and inherited_olist(S,G) are built. PimDR is the IP address of the PIM DR in the VPLS. The PIM-SM Snooping forwarding rules are defined below in pseudocode: BEGIN [Page 26] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 iif is the incoming port of the multicast packet. S is the Source IP Address of the multicast packet. G is the Destination IP Address of the multicast packet. If there is (S,G) state on the PE Then OutgoingPortList = OutgoingPortList(S,G) Else if there is (*,G) state on the PE Then OutgoingPortList = OutgoingPortList(*,G) Else OutgoingPortList = UserDefinedPortList Endif If iif is an AC Then OutgoingPortList = OutgoingPortList (-) iif Else ## iif is a PW OutgoingPortList = OutgoingPortList (-) PWPorts Endif Forward the packet to OutgoingPortList. END First if there is (S,G) state on the PE, then the set of outgoing ports is OutgoingPortList(S,G). Otherwise if there is (*,G) state on the PE, the set of outgoing ports is OutgoingPortList(*,G). The packet is forwarded to the selected set of outgoing ports while observing the rules above in section 3.8 3.8.2. PIM-BIDIR Data Forwarding Rules The PIM-BIDIR Snooping forwarding rules are defined below in pseudocode: BEGIN iif is the incoming port of the multicast packet. G is the Destination IP Address of the multicast packet. If there is forwarding state for G Then OutgoingPortList = olist(G) Else OutgoingPortList = UserDefinedPortList Endif [Page 27] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 If iif is an AC Then OutgoingPortList = OutgoingPortList (-) iif Else ## iif is a PW OutgoingPortList = OutgoingPortList (-) PWPorts Endif Forward the packet to OutgoingPortList. END If there is forwarding state for G, then forward the packet to olist(G) while observing the rules above in section 3.8 [PIM-BIDIR] specifies how olist(G) is contructed. 3.8.3. PIM-DM Data Forwarding Rules The PIM-DM Snooping data forwarding rules are defined below in pseudocode: BEGIN iif is the incoming port of the multicast packet. S is the Source IP Address of the multicast packet. G is the Destination IP Address of the multicast packet. If there is (S,G) state on the PE Then OutgoingPortList = olist(S,G) Else OutgoingPortList = UserDefinedPortList Endif If iif is an AC Then OutgoingPortList = OutgoingPortList (-) iif Else ## iif is a PW OutgoingPortList = OutgoingPortList (-) PWPorts Endif Forward the packet to OutgoingPortList. END If there is forwarding state for (S,G), then forward the packet to olist(S,G) while observing the rules above in section 3.8 [PIM-DM] specifies how olist(S,G) is contructed. 4. IANA Considerations This document does not require any IANA assignments or action. [Page 28] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 5. Security Considerations Security considerations provided in VPLS solution documents (i.e., [VPLS-LDP] and [VPLS-BGP) apply to this document as well. 6. Acknowledgements Many members of the L2VPN and PIM working groups have contributed to and provided valuable comments and feedback to this draft, including Vach Kompella, Shane Amante, Sunil Khandekar, Rob Nath, Marc Lassere, Yuji Kamite, Yiqun Cai, Ali Sajassi, Jayant Kotalwar, Jozef Raets, Himanshu Shah (Ciena), Himanshu Shah (Alcatel-Lucent). 7. References 7.1. Normative References [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [PIM-DM] Deering, S., et al. "Protocol Independent Multicast Version 2 - Dense Mode Specification", RFC 3973, January 2005. [PIM-SM] Fenner, W, et al. "Protocol Independent Multicast- Sparse Mode (PIM-SM): Protocol Specification (Revised)", RFC 4601, August 2006. [PIM-SSM] Holbrook, H., et al. "Source-Specific Multicast for IP", RFC 4607, August 2006 [PIM-BIDIR] Handley, M., et al. "Bi-directional Protocol Independent Multicast (BIDIR-PIM)", work in progress [RPF-VECTOR] IJ Wijnands, et al, "The RPF Vector TLV", draft-ietf-pim-rpf-vector-03, Work in progress 7.2. Informative References [VPLS-LDP] Lasserre, M, et al. "Virtual Private LAN Services using LDP Signaling", RFC 4762, January 2007 [VPLS-BGP] Kompella, K, et al. "Virtual Private LAN Service using BGP for Auto-Discovery and Signaling", RFC 4761, January 2007 [IGMP-SNOOP] Christensen, M., et al. "Considerations for IGMP and MLD Snooping Switches", RFC 4541, May 2006 [VPLS-MCAST-REQ] Kamite, Y, et al, "Requirements for Multicast Support in Virtual Private LAN Services", draft-ietf-l2vpn-vpls-mcast-reqts-03, Work in Progress [VPLS-MCAST-LDP] Qui, R, Serbest, Y, et al, "Using LDP for VPLS Multicast", draft-qiu-serbest-vpls-mcast-ldp-00.txt, Work in progress [Page 29] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 [VPLS-MCAST-BGP] Aggarwal, R, et al, "Propagation of VPLS IP Multicast Group Membership Information", draft- raggarwa-l2vpn-vpls-mcast-ctrl-00.txt, Work in progress [VPLS-MCAST-TREES] Aggarwal, R, et al. "Multicast in VPLS", draft-raggarwa-l2vpn-vpls-mcast-01.txt, Work in progress. [Page 30] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 Appendix A. Example Network Scenario Let us consider the scenario in Figure 3. Figure 3: An Example Scenario for Triggering Assert +------+ AC3 +------+ | PE2 |-----| CE3 | /| | | | / +------+ +------+ / | | / | | /PW12 | | / | +-----+ / |PW23 | S | / | +-----+ / | | / | | / | | +------+ +------+ / +------+ +------+ | CE1 | | PE1 |/ PW13 | PE3 | | CE4 | | |-----| |-------------| |-----| | +------+ AC1 +------+ +------+ AC4 +------+ | |AC2 +------+ | CE2 | | | +------+ In the scenario depicted in Figure 3, S is the source of a multicast stream (S,G). CE1 and CE2 both have two ECMP routes to reach the source. In the examples below, JT(Port,S,G,N) is the downstream Join Expiry Timer on the specified Port for the (S,G) with upstream neighbor N. Appendix A.1 PIM-Snooping Example 1. CE1 Sends a Join(S,G) with Upstream Neighbor(S,G) = CE3. 2. PE1 snoops on the Join(S,G) while flooding it in the VPLS. PE2 and PE3 also snoop on the Join(S,G) while flooding it in the VPLS. The resulting states at the PEs is as follows: [Page 31] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 At PE1: JT(AC1,S,G,CE3) = JP_HoldTime UpstreamNeighbors(S,G) = { CE3 } UpstreamPorts(S,G) = { PW12 } OutgoingPortList(S,G) = { AC1, PW12 } At PE2: JT(PW12,S,G,CE3) = JP_HoldTime UpstreamNeighbors(S,G) = { CE3 } UpstreamPorts(S,G) = { AC3 } OutgoingPortList(S,G) = { PW12, AC3 } At PE3: PE3 ignores the Join(S,G) for the following reasons: . It does not already have (S,G) state. . The Join(S,G) was received on a PW and the Upstream RPort is also a PW. 3. The multicast stream (S,G) flows along CE3 -> PE2 -> PE1 -> CE1 4. Now CE2 sends a Join(S,G) with Upstream Neighbor(S,G) = CE4. 5. All PEs snoop on the Join(S,G). The resulting states at the PEs: At PE1: JT(AC1,S,G,CE3) = active JT(AC2,S,G,CE4) = JP_HoldTime. UpstreamNeighbors(S,G) = { CE3, CE4 } UpstreamPorts(S,G) = { PW12, PW13 } OutgoingPortList(S,G) = { AC1, PW12, AC2, PW13 } At PE2: Note: Since PE2 already has (S,G) state, it does not ignore the Join(S,G) even though it received the Join(S,G) on a PW and the Upstream Rport is a PW. JT(PW12,S,G,CE4) = JP_HoldTime JT(PW12,S,G,CE3) = active UpstreamNeighbors(S,G) = { CE3, CE4 } UpstreamPorts(S,G) = { AC3, PW23 } OutgoingPortList(S,G) = { PW12, AC3, PW23 } At PE3: JT(PW13,S,G,CE4) = JP_HoldTime UpstreamNeighbors(S,G) = { CE4 } UpstreamPorts(S,G) = { AC4 } OutgoingPortList(S,G) = { PW13, AC4 } 6. The multicast stream (S,G) flows into the VPLS from the two CEs CE3 and CE4. PE2 forwards the stream received from CE3 to PW23 and PE3 forwards the stream to AC4. This facilitates the CE routers to trigger assert election. Let us say CE3 becomes the assert winner. [Page 32] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 7. CE3 sends an Assert message to the VPLS. The PEs flood the Assert message without examining it. 8. CE4 stops sending the multicast stream to the VPLS. 9. CE2 notices an RPF change due to Assert and sends a Prune(S,G) with Upstream Neighbor = CE4. CE2 also sends a Join(S,G) with Upstream Neighbor = CE3. 10. All the PEs start a prune-pend timer on the ports on which they received the Prune(S,G). When the prune-pend timer expires, all PEs will remove the downstream (S,G,CE4) states. Resulting states at the PEs: At PE1: JT(AC1,S,G,CE3) = active UpstreamNeighbors(S,G) = { CE3 } UpstreamPorts(S,G) = { PW12 } OutgoingPortList(S,G) = { AC1, AC2, PW12 } At PE2: JT(PW12,S,G,CE3) = active UpstreamNeighbors(S,G) = { CE3 } UpstreamPorts(S,G) = { AC3 } OutgoingPortList(S,G) = { PW12, AC3 } At PE3: no (S,G) state. Note that at the end of the assert election, there should be no duplicate traffic forwarded downstream and traffic should flow only on the desired path. Also note that there are no unnecessary (S,G) states on PE3 after the assert election. Appendix A.2 PIM Proxy Example with (S,G) / (*,G) interaction In the same network, let us assume CE4 is the Upstream Neighbor towards the RP for G. 1. CE1 Sends a Join(S,G) with Upstream Neighbor(S,G) = CE3. 2. PE1 consumes the Join(S,G). PE1 looks up the neighbor database and determines CE3 was learnt on PW12. PE1 sends a Proxy Join(S,G) to the resulting UpstreamPorts(G). i.e. it sends the proxy Join(S,G) on PW12. 3. Likewise, PE2 consumes the Join(S,G) and sends a proxy Join(S,G) on AC3 with Upstream Neighbor = CE3. The resulting states at the PEs is as follows: At PE1: JT(AC1,S,G,CE3) = JP_HoldTime UpstreamNeighbors(S,G) = { CE3 } UpstreamPorts(S,G) = { PW12 } [Page 33] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 OutgoingPortList(S,G) = { AC1, PW12 } At PE2: JT(PW12,S,G,CE3) = JP_HoldTime UpstreamNeighbors(S,G) = { CE3 } UpstreamPorts(S,G) = { AC3 } OutgoingPortList(S,G) = { PW12, AC3 } At PE3: PE3 did not receive any PIM Join(S,G). So it has no (S,G) state. 4. The multicast stream (S,G) flows along CE3 -> PE2 -> PE1 -> CE1. 5. Now let us say CE1 sends a Join(*,G) towards CE4. 6. PE1 consumes the Join(*,G). PE1 sends a Proxy Join(*,G) to the resulting UpstreamPorts(G). Since UpstreamPorts(G) now has both PW12 and PW13, the Join(*,G) gets sent on both PW12 and PW13. Note that the UpstreamPorts(S,G) and OutgoingPortList(S,G) inherit the corresponding (*,G) sets, but not vice versa. 7. PE2 and PE3 perform a similar function. PE2 received the Join(*,G) on a PW and the Upstream Neighbor is also on a PW. Hence PE2 only adds UpstreamPorts(*,G) to OutgoingPortList(*,G) and not the downstream port PW12. At PE1: JT(AC1,S,G,CE3) = active UpstreamNeighbors(S,G) = { CE3 } UpstreamPorts(S,G) = { PW12, PW13 } OutgoingPortList(S,G) = { AC1, PW12, PW13 } JT(AC1,*,G,CE4) = JP_HoldTime. UpstreamNeighbors(*,G) = { CE4 } UpstreamPorts(*,G) = { PW13 } OutgoingPortList(*,G) = { AC1, PW13 } UpstreamPorts(G) = { PW12, PW13 } At PE2: JT(PW12,S,G,CE3) = active UpstreamNeighbors(S,G) = { CE3 } UpstreamPorts(S,G) = { AC3, PW23 } OutgoingPortList(S,G) = { PW12, AC3, PW23 } JT(PW12,*,G,CE4) = JP_HoldTime UpstreamNeighbors(*,G) = { CE4 } UpstreamPorts(G) = { PW23 } OutgoingPortList(*,G) = { PW23 } At PE3: JT(PW13,*,G,CE4) = JP_HoldTime UpstreamNeighbors(*,G) = { CE4 } UpstreamPorts(*,G) = { AC4 } [Page 34] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 OutgoingPortList(*,G) = { PW13, AC4 } 8. The above state results in both (S,G) and (*,G) streams to be forwarded to AC1. The above state also results in the (S,G) stream to be forwarded from CE3 to CE4 resulting in an (S,G) assert election. Following the assert election, CE3 becomes the (S,G) assert winner. CE4 stops sending (S,G) stream down the RPT. 9. CE1 notices an RPF change due to assert. It sends a Prune(S,G,rpt) with Upstream Neighbor = CE4. 10. PE1 consumes the Prune(S,G,rpt) and forwards the Prune(S,G,rpt) to both PW12 and PW13. PE2 consumes the Prune(S,G,rpt) and updates its states. PE3 updates its states and forwards the Prune(S,G,rpt) on AC4. At PE1: JT(AC1,S,G,CE3) = active UpstreamNeighbors(S,G) = { CE3 } UpstreamPorts(S,G) = { PW12 } OutgoingPortList(S,G) = { AC1, PW12 } JT(AC1,*,G,CE4) = active. UpstreamNeighbors(*,G) = { CE4 } UpstreamPorts(*,G) = { PW13 } OutgoingPortList(*,G) = { AC1, PW13 } At PE2: JT(PW12,S,G,CE3) = active UpstreamNeighbors(S,G) = { CE3 } UpstreamPorts(*,G) = { AC3 } OutgoingPortList(S,G) = { PW12, AC3 } JT(PW12,*,G,CE4) = JP_HoldTime UpstreamNeighbors(*,G) = { CE4 } UpstreamPorts(*,G) = { PW23 } OutgoingPortList(*,G) = { PW23 } At PE3: JT(PW13,*,G,CE4) = JP_HoldTime UpstreamNeighbors(*,G) = { CE4 } UpstreamPorts(G) = { AC4 } OutgoingPortList(*,G) = { PW13, AC4 } Even in this example, at the end of the (S,G) / (*,G) assert election, there should be no duplicate traffic forwarded downstream and traffic should flow only to the desired CEs. Other more complex scenarios exist. This draft should addressin PIM- SM and the rules specified in this draft should ensure that assert is triggered among the CEs in all scenarios. [Page 35] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 Authors' Addresses Venu Hemige Alcatel-Lucent 701 East Middlefield Rd. Mountain View, CA 94043 Venu.hemige@alcatel-lucent.com Yetik Serbest AT&T Labs 9505 Arboretum Blvd. Austin, TX 78759 Yetik_serbest@labs.att.com Ray Qiu Alcatel-Lucent 701 East Middlefield Rd. Mountain View, CA 94043 Ray.Qiu@alcatel-lucent.com Suresh Boddapati Alcatel-Lucent 701 East Middlefield Rd. Mountain View, CA 94043 Suresh.boddapati@alcatel-lucent.com Intellectual Property Statement The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf- ipr@ietf.org. [Page 36] draft-ietf-l2vpn-vpls-pim-snooping-01.txt March, 2007 Full copyright statement Copyright (C) The IETF Trust (2007). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. [Page 37]