Network Working Group                                     D. Bryant
Request for Comments: 2166                                3Com Corp
Category: Informational                                 P. Brittain
                                               Data Connection Ltd.
                                                          June 1997

                      APPN Implementer's Workshop
                         Closed Pages Document

                         DLSw v2.0 Enhancements

Status of this Memo

   This memo provides information for the Internet community.  This memo
   does not specify an Internet standard of any kind.  Distribution of
   this memo is unlimited.

Abstract

   This document specifies

   - a set of extensions to RFC 1795 designed to improve the scalability
     of DLSw
   - clarifications to RFC 1795 in the light of the implementation
     experience to date.

   It is assumed that the reader is familiar with DLSw and RFC 1795.  No
   effort has been made to explain these existing protocols or
   associated terminology.

   This document was developed in the DLSw Related Interest Group (RIG)
   of the APPN Implementers Workshop (AIW). If you would like to
   participate in future DLSw discussions, please subscribe to the DLSw
   RIG mailing lists by sending mail to majordomo@raleigh.ibm.com
   specifying 'subscribe aiw-dlsw' as the body of the message.

Table of Contents

   1. INTRODUCTION ................................................    3
   2. HALT REASON CODES............................................    3
   3. SCOPE OF SCALABILITY ENHANCEMENTS............................    4
   4. OVERVIEW OF SCALABILITY ENHANCEMENTS.........................    6
   5. MULTICAST GROUPS AND ADDRESSING..............................    7
   5.1 USING MULTICAST GROUPS......................................    8
   5.2 DLSW MULTICAST ADDRESSES....................................    8
   6. DLSW MESSAGE TRANSPORTS......................................    8
   6.1 TCP/IP CONNECTIONS ON DEMAND................................    9
    6.1.1 TCP CONNECTIONS ON DEMAND RACE CONDITIONS................    9



Bryant & Brittain            Informational                      [Page 1]

RFC 2166              APPN Implementer's Workshop             June 1997


   6.2 SINGLE SESSION TCP/IP CONNECTIONS...........................    9
    6.2.1 EXPEDITED SINGLE SESSION TCP/IP CONNECTIONS..............   10
     6.2.1.1 TCP PORT NUMBERS......................................   10
     6.2.1.2 TCP CONNECTION SETUP..................................   10
     6.2.1.3 SINGLE SESSION SETUP RACE CONDITIONS..................   10
     6.2.1.4 TCP CONNECTIONS WITH NON-MULTICAST CAPABLE DLSW PEERS.   11
   6.3 UDP DATAGRAMS...............................................   12
    6.3.1 VENDOR SPECIFIC FUNCTIONS OVER UDP.......................   12
    6.3.2 UNICAST UDP DATAGRAMS....................................   12
    6.3.3 MULTICAST UDP DATAGRAMS..................................   13
   6.4 UNICAST UDP DATAGRAMS IN LIEU OF IP MULTICAST...............   13
   6.5 TCP TRANSPORT...............................................   14
   7. MIGRATION SUPPORT............................................   14
   7.1 CAPABILITIES EXCHANGE.......................................   14
   7.2 CONNECTING TO NON-MULTICAST CAPABLE NODES...................   15
   7.3 COMMUNICATING WITH MULTICAST CAPABLE NODES..................   15
   8. SNA SUPPORT..................................................   16
   8.1 ADDRESS RESOLUTION..........................................   16
   8.2 EXPLORER FRAMES.............................................   16
   8.3 CIRCUIT SETUP...............................................   17
   8.4 EXAMPLE SNA SSP MESSAGE SEQUENCE............................   17
   8.5 UDP RELIABILITY.............................................   19
    8.5.1 RETRIES..................................................   19
   9. NETBIOS......................................................   20
   9.1 ADDRESS RESOLUTION..........................................   21
   9.2 EXPLORER FRAMES.............................................   21
   9.3 CIRCUIT SETUP...............................................   21
   9.4 EXAMPLE NETBIOS SSP MESSAGE SEQUENCE........................   22
   9.5 MULTICAST RELIABILITY AND RETRIES...........................   24
   10. SEQUENCING..................................................   24
   11. FRAME FORMATS...............................................   25
   11.1 MULTICAST CAPABILITIES CONTROL VECTOR......................   25
    11.1.1 DLSW CAPABILITIES NEGATIVE RESPONSE.....................   26
   11.2 UDP PACKETS................................................   26
   11.3 VENDOR SPECIFIC UDP PACKETS................................   27
   12. COMPLIANCE STATEMENT........................................   28
   13. SECURITY CONSIDERATIONS.....................................   29
   14. ACKNOWLEDGEMENTS............................................   29
   15. AUTHORS' ADDRESSES..........................................   30
   16. APPENDIX - CLARIFICATIONS TO RFC 1795.......................   31

1. Introduction

   This document defines v2.0 of Data Link Switching (DLSw) in the form
   of a set of enhancements to RFC 1795. These enhancements are designed
   to be fully backward compatible with existing RFC 1795
   implementations. As a compatible set of enhancements to RFC 1795,
   this document does not replace or supersede RFC 1795.

   The bulk of these enhancements address scalability issues in DLSw
   v1.0.  Reason codes have also been added to the HALT_DL and
   HALT_DL_NOACK SSP messages in order to improve the diagnostic
   information available.

   Finally, the appendix to this document lists a number of
   clarifications to RFC 1795 where the implementation experience to
   date has shown that the original RFC was ambiguous or unclear. These
   clarifications should be read alongside RFC 1795 to obtain a full
   specification of the base v1.0 DLSw standard.

2. HALT Reason codes

   RFC 1795 provides no mechanism for a DLSw to communicate to its peer
   the reason for dropping a circuit.  DLSw v2.0 adds reason code fields
   to the HALT_DL and HALT_DL_NOACK SSP messages to carry this
   information.

   The reason code is carried as 6 bytes of data after the existing SSP
   header.  The format of these bytes is as shown below.

   Byte       Description
   0-1        Generic HALT reason code in byte normal format

   2-5        Vendor-specific detailed reason code

   The generic HALT reason code takes one of the following decimal
   values (which are chosen to match the disconnect reason codes
   specified in the DLSw MIB).

   1 - Unknown error
   2 - Received DISC from end-station
   3 - Detected DLC error with end-station
   4 - Circuit-level protocol error (e.g., pacing)
   5 - Operator-initiated (mgt station or local console)

   The vendor-specific detailed reason code may take any value.

   All v2.0 DLSws must include this information on all HALT_DL and
   HALT_DL_NOACK messages sent to v2.0 DLSw peers.  For backwards
   compatibility with RFC 1795, DLSw v2.0 implementations must also
   accept a HALT_DL or HALT_DL_NOACK message received from a DLSw peer
   that does not carry this information (i.e., the RFC 1795 format for
   these SSP messages).
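
   The 6-byte reason code layout described in this section can be
   sketched in Python as follows.  This is an illustration only and not
   part of the specification.  It assumes that "byte normal format"
   means network (big-endian) byte order, carries the vendor-specific
   code as four opaque bytes, and uses hypothetical names (HaltReason,
   encode_halt_reason, decode_halt_reason).  Data shorter than 6 bytes
   is treated as an RFC 1795 format message with no reason code.

```python
import struct
from typing import NamedTuple, Optional

# Generic HALT reason codes from section 2 (decimal values).
GENERIC_HALT_REASONS = {
    1: "Unknown error",
    2: "Received DISC from end-station",
    3: "Detected DLC error with end-station",
    4: "Circuit-level protocol error",
    5: "Operator-initiated",
}

# Assumption: the 2-byte generic code is big-endian; the 4-byte
# vendor-specific code is carried as opaque bytes.
_REASON_FORMAT = "!H4s"
_REASON_LENGTH = struct.calcsize(_REASON_FORMAT)  # 6 bytes


class HaltReason(NamedTuple):
    generic: int
    vendor_detail: bytes


def encode_halt_reason(generic: int, vendor_detail: bytes = b"\x00" * 4) -> bytes:
    """Build the 6 bytes that follow the SSP header in a v2.0 HALT."""
    if generic not in GENERIC_HALT_REASONS:
        raise ValueError(f"unknown generic HALT reason code: {generic}")
    if len(vendor_detail) != 4:
        raise ValueError("vendor-specific reason code must be 4 bytes")
    return struct.pack(_REASON_FORMAT, generic, vendor_detail)


def decode_halt_reason(data_after_header: bytes) -> Optional[HaltReason]:
    """Return the reason codes, or None for an RFC 1795 format message."""
    if len(data_after_header) < _REASON_LENGTH:
        return None  # peer did not supply reason codes
    generic, vendor_detail = struct.unpack_from(_REASON_FORMAT, data_after_header)
    return HaltReason(generic, vendor_detail)
```

   Returning None for the short form lets a v2.0 implementation accept
   HALT messages from RFC 1795 peers without special-casing the sender.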

3. Scope of Scalability Enhancements

   The DLSw Scalability group of the AIW identified a number of
   scalability issues associated with existing DLSw protocols as defined
   in RFC 1795:

   - Administration

     RFC 1795 implies the need to define the transport address of all
     DLSw peers at each DLSw.  In highly meshed situations (such as
     those often found in NetBIOS networks), the resultant
     administrative burden is undesirable.

   - Address Resolution

     RFC 1795 defines point to point TCP (or other reliable transport
     protocol) connections between DLSw peers.  When attempting to
     discover the location of an unknown resource, a DLSw sends an
     address resolution packet to each DLSw peer over these connections.
     In highly meshed configurations, this can result in a very large
     number of packets in the transport network.  Although each packet
     is sent individually to each DLSw peer, they are each identical in
     nature.  Thus the transport network is burdened with excessive
     numbers of identical packets.  Since the transport network is most
     commonly a wide area network, where bandwidth is considered a
     precious resource, this packet duplication is undesirable.

   - Broadcast Packets

     In addition to the address resolution packets described above, RFC
     1795 also propagates NetBIOS broadcast packets into the transport
     network.  The UI frames of NetBIOS are sent as LAN broadcast
     packets.  RFC 1795 propagates these packets over the point to point
     transport connections to each DLSw peer.  In the same manner as
     above, this creates a large number of identical packets in the
     transport network, and hence is undesirable.  Since NetBIOS UI
     frames can be sent by applications, it is difficult to predict or
     control the rate and quantity of such traffic.  This compounds the
     undesirability of the existing RFC 1795 propagation method for
     these packets.

   - TCP (transport connection) Overhead

     As defined in RFC 1795, each DLSw maintains a transport connection
     to its DLSw peers.  Each transport connection guarantees in-order
     packet delivery.  This is accomplished using acknowledgment and
     sequencing algorithms which require both CPU and memory at the DLSw
     endpoints in direct proportion to the number of transport
     connections.  The DLSw Scalability group has identified two
     scenarios where the number of transport connections can become
     significant, resulting in excessive overhead and corresponding
     equipment costs (memory and CPU).  The first scenario is found in
     highly meshed DLSw configurations where the number of transport
     connections approximates n^2 (where n is the number of DLSw
     peers).  This is typically found in DLSw networks supporting
     NetBIOS.  The second scenario is found in networks where many
     remote locations communicate with a few central sites.  In this
     case, the central sites must support n transport connections
     (where n is the number of remote sites).  In both scenarios the
     resultant transport connection overhead may be undesirable,
     depending upon the value of n.

   - LLC2 overhead

     RFC 1795 specifies that each DLSw provides local termination for
     the LLC2 (SDLC or other SNA reliable data link protocol) sessions
     traversing the SSP.  Because these reliable data links provide
     guaranteed in-order packet delivery, the memory and CPU overhead of
     maintaining these connections can also become significant.  This
     is particularly undesirable in the second scenario described above,
     because the number of reliable connections maintained at the
     central site is the aggregate of the connections maintained at each
     remote site.
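
   The growth described in the two transport connection scenarios above
   can be made concrete with a little arithmetic.  The following Python
   sketch is illustrative only.  The function names are hypothetical,
   and the default of two TCP connections per peer pair is an assumption
   based on the two unidirectional connections used by RFC 1795.

```python
def full_mesh_tcp_connections(n: int, connections_per_pair: int = 2) -> int:
    """Total TCP connections in a full mesh of n DLSws.

    There are n * (n - 1) / 2 peer pairs, so with two connections per
    pair the total is n * (n - 1), which approximates n^2.
    """
    return connections_per_pair * n * (n - 1) // 2


def central_site_tcp_connections(remote_sites: int,
                                 connections_per_pair: int = 2) -> int:
    """TCP connections terminated at one central site."""
    return connections_per_pair * remote_sites


def explorer_copies(peers: int) -> int:
    """Identical explorer packets one DLSw sends when it must reach
    every peer over point to point connections."""
    return peers
```

   For example, under these assumptions a full mesh of 50 DLSws needs
   2450 TCP connections with two connections per pair, and 1225 with a
   single bi-directional connection per pair.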

   It is not the intent of this document to address all the undesirable
   scalability issues associated with RFC 1795.  This paper identifies
   protocol enhancements to RFC 1795 using the inherent multicast
   capabilities of the underlying transport network to improve the
   scalability of RFC 1795.  It is believed that the enhancements
   defined herein address many of the issues identified above, such as
   administration, address resolution, broadcast packets, and, to a
   lesser extent, transport overhead.  This paper does not address LLC2
   overhead.  Subsequent efforts by the AIW and/or DLSw Scalability
   group may address the unresolved scalability issues.

   While it is the intent of this paper to accommodate all transport
   protocols as far as possible, it is recognized that the multicast
   capabilities of many protocols are not yet well defined, understood,
   or implemented. Since TCP is the most prevalent DLSw transport
   protocol in use today, the DLSw Scalability group has chosen to focus
   its definition around IP based multicast services. This document only
   addresses the implementation detail of IP based multicast services.

   This proposal does not consider the impacts of IPv6 as this was
   considered too far from widespread use at the time of writing.

4. Overview of Scalability Enhancements

   This paper describes the use of multicast services within the
   transport network to improve the scalability of DLSw based
   networking.  There are only a few main components of this proposal:

   - Single session TCP connections

     RFC 1795 defines a negotiation protocol for DLSw peers to choose
     either two unidirectional or one bi-directional TCP connection.
     DLSws implementing the enhancements described in this document must
     support and use (whenever required and possible) a single
     bi-directional TCP connection between DLSw peers. That is to say
     that the single tunnel negotiation support of RFC 1795 is a
     prerequisite function to this set of enhancements. Use of two
     unidirectional TCP connections is only allowed (and required) for
     migration purposes when communicating with DLSw peers that do not
     implement these enhancements.

     This document also specifies a faster method for bringing up a
     single TCP connection between two DLSw peers than the negotiation
     used in RFC 1795.  This faster method, detailed in section 6.2.1,
     must be used where both peers are known to support DLSw v2.0.

   - TCP connections on demand

     Two DLSw peers using these enhancements will only establish a TCP
     connection when necessary.  SSP connections to DLSw peers which do
     not implement these enhancements are assumed to be established by
     the means defined in RFC 1795.  DLSws implementing v2.0 utilize UDP
     based transport services to send address resolution packets
     (CANUREACH_ex, NETBIOS_NQ_ex, etc.).  If a positive response is
     received, then a TCP connection is only established to the
     associated DLSw peer if one does not already exist.
     Correspondingly, TCP connections are brought down when there are no
     circuits to a DLSw peer for an implementation defined period of
     time.

   - Address resolution through UDP

     The main thrust of this paper is to utilize non-reliable transport
     and the inherent efficiencies of multicast protocols whenever
     possible and applicable to reduce network overhead.  Accordingly,
     the address resolution protocols of SNA and NetBIOS are sent over
     the non-reliable transport of IP, namely UDP.  In addition, IP
     multicast/unicast services are used whenever address resolution
     packets must be sent to multiple destinations. This avoids the need
     to maintain TCP SSP connections between two DLSw peers when no
     circuits are active.  CANUREACH_ex and ICANREACH_ex packets can be
     sent to all the appropriate DLSw peers without the need for
     pre-configured peers or pre-established TCP/IP connections.  In
     addition, most multicast services (including IP's MOSPF, DVMRP,
     MIP, etc.) replicate and propagate messages only as necessary to
     deliver to all multicast members.  This avoids duplication and
     excessive bandwidth consumption in the transport network.

     To further optimize the use of WAN resources, address resolution
     responses are sent in a directed fashion (i.e., unicast) via UDP
     transport whenever possible.   This avoids the need to setup or
     maintain TCP connections when they are not required.  It also
     avoids the bandwidth costs associated with broadcasting.

     Note: It is also permitted to send some address resolution traffic
     over existing TCP connections.  The conditions under which this is
     permitted are detailed in section 7.

   - NetBIOS broadcasts over UDP

     In the same manner as above, NetBIOS broadcast packets are sent via
     UDP (unicast and multicast) whenever possible and appropriate. This
     avoids the need to establish TCP connections between DLSw peers
     when there are no circuits required.   In addition, bandwidth in
     the transport network is conserved by utilizing the efficiencies
     inherent to multicast service implementation.  Details covering
     identification of these packets and proper propagation methods are
     described in section 9.

5. Multicast Groups and Addressing

   IP multicast services provide an unreliable datagram oriented
   delivery service to multiple parties. Communication is accomplished
   by sending and/or listening to specific 'multicast' addresses.  When
   a given node sends a packet to a specific address (defined to be
   within the multicast address range), the IP network (unreliably)
   delivers the packet to every node listening on that address.

   Thus, DLSws can make use of this service by simply sending and
   receiving (i.e., listening for) packets on the appropriate multicast
   addresses. With careful planning and implementation, networks can be
   effectively partitioned and network overhead controlled by sending
   and listening on different address groups.  It is not the intent of
   this paper to define or describe the techniques by which this can be
   accomplished.  It is expected that the networking industry (vendors
   and end users alike) will determine the most appropriate ways to make
   use of the functions provided by use of DLSw multicast transport
   services.
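
   Listening on a multicast address, as described above, might look
   like the following Python sketch on a host with a BSD style sockets
   API.  It is illustrative only.  The function names are hypothetical,
   the use of port 2067 follows the DLSw UDP destination port given in
   section 6.3, and binding to all local interfaces is an assumption.

```python
import socket

DLSW_UDP_PORT = 2067  # destination port for DLSw UDP packets (section 6.3)


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq value used with IP_ADD_MEMBERSHIP:
    the group address followed by the local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_multicast_listener(group: str,
                            port: int = DLSW_UDP_PORT) -> socket.socket:
    """Open a UDP socket that receives datagrams sent to 'group'."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Ask the IP stack to join the group on the chosen interface.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

   A DLSw that partitions its network into several groups would call
   open_multicast_listener once per group it wishes to receive.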

5.1 Using Multicast Groups

   The multicast addressing as described above can be effectively used
   to limit the amount of broadcast/multicast traffic in the network.
   It is not the intent of this document to describe how individual
   DLSw/SSP implementations would assign or choose group addresses.  The
   details of how this is done and exposed to the end user are left to
   the specific implementor.  In order to provide for multivendor
   interoperability and simplicity of configuration, however, this paper
   defines a single IP multicast address, 224.0.10.000, to be used as a
   default DLSw multicast address.  If a given implementation chooses to
   provide a default multicast address, it is recommended this address
   be used.  In addition, this address should be used for both
   transmitting and receiving of multicast SSP messages.  Implementation
   of a default multicast address is not, however, required.

5.2 DLSw Multicast Addresses

   For the purpose of long term interoperability, the AIW has secured a
   block of IP multicast addresses to be used with DLSw.  These
   addresses are listed below:

   Address Range        Purpose
   --------------------------------------------------------------------
   224.0.10.000         Default multicast address
   224.0.10.001-191     User defined DLSw multicast groups
   224.0.10.192-255     Reserved for future use by the DLSw RIG in DLSw
                        enhancements
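
   The address table above can be expressed as a small classification
   routine.  The Python sketch below is illustrative only, and the
   function name and returned labels are hypothetical.  Note that the
   addresses are written without leading zeros (224.0.10.0 rather than
   224.0.10.000), since recent versions of Python's ipaddress module
   reject leading zeros in an octet.

```python
import ipaddress

DLSW_BLOCK = ipaddress.ip_network("224.0.10.0/24")


def classify_dlsw_group(address: str) -> str:
    """Map an IP address onto the purposes listed in section 5.2."""
    addr = ipaddress.ip_address(address)
    if addr not in DLSW_BLOCK:
        return "not-dlsw"
    last_octet = int(addr) & 0xFF
    if last_octet == 0:
        return "default"        # 224.0.10.0
    if last_octet <= 191:
        return "user-defined"   # 224.0.10.1 through 224.0.10.191
    return "reserved"           # 224.0.10.192 through 224.0.10.255
```

   A configuration front end could use such a check to warn when an
   operator selects a group address from the reserved range.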

6. DLSw Message Transports

   With the introduction of DLSw Multicast Protocols, SSP messages are
   now sent over two distinct transport mechanisms: TCP/IP connections
   and UDP services.  Furthermore, the UDP datagrams can be sent to two
   different kinds of IP addresses: unique IP addresses (generally
   associated with a specific DLSw), and multicast IP addresses
   (generally associated with a group of DLSw peers).

6.1 TCP/IP Connections on Demand

   As is the case in RFC 1795, TCP/IP connections are established
   between DLSw peers.  Unlike RFC 1795, however, TCP/IP connections are
   only established to carry reliable circuit data (i.e., LLC2 based
   circuits).  Accordingly, a TCP/IP connection is only established to a
   given DLSw peer when the first circuit to that DLSw is required
   (i.e., the origin DLSw must send a CANUREACH_CS to a target DLSw peer
   and there is no existing TCP connection between the two).  In
   addition, the TCP/IP connection is brought down an implementation
   defined amount of time after the last active (not pending) circuit
   has terminated.  In this way, the overhead associated with
   maintaining TCP connections is minimized.
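
   The connection on demand behavior described above can be sketched
   as a small piece of per-peer state.  The Python below is a minimal
   illustration, not a definitive implementation.  The class and method
   names are hypothetical, time is passed in explicitly so the logic is
   easy to test, and the distinction between active and pending
   circuits is not modeled.

```python
from typing import Optional


class PeerTcpState:
    """Tracks whether a TCP connection to one DLSw peer is needed."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout   # implementation defined, seconds
        self.connected = False
        self.active_circuits = 0
        self._idle_since: Optional[float] = None

    def circuit_required(self) -> bool:
        """Call before sending CANUREACH_CS.  Returns True when a TCP
        connection must be established first."""
        must_connect = not self.connected
        self.connected = True
        self.active_circuits += 1
        self._idle_since = None            # cancel any idle timer
        return must_connect

    def circuit_terminated(self, now: float) -> None:
        if self.active_circuits == 0:
            raise RuntimeError("no active circuit to terminate")
        self.active_circuits -= 1
        if self.active_circuits == 0:
            self._idle_since = now         # last circuit gone: start timer

    def should_disconnect(self, now: float) -> bool:
        """True once the connection has carried no circuits for the
        implementation defined period."""
        return (
            self.connected
            and self._idle_since is not None
            and now - self._idle_since >= self.idle_timeout
        )

    def disconnected(self) -> None:
        self.connected = False
        self._idle_since = None
```

   Keeping the timeout as a parameter reflects the fact that the idle
   period is left to each implementation.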

   With the advent of TCP connections on demand, the activation and
   deactivation of TCP connections becomes a normal occurrence as
   opposed to the exception event it constitutes in RFC 1795.  For this
   reason, it is recommended that implementations carefully consider the
   value of SNMP traps for this condition.

6.1.1 TCP Connections on Demand Race Conditions

   Non-circuit based SSP packets (e.g., CANUREACH_ex) may still be
   sent/received over TCP connections after all circuits have been
   terminated.  Taking this into account, implementations should still
   gracefully terminate these TCP connections once the connection is no
   longer supporting circuits.  This may require an implementation to
   retransmit request frames over UDP when no response to a TCP based
   unicast request is received and the TCP connection is brought down.
   This is not required in the case of multicast requests as these are
   received over the multicast transport mechanism.

6.2 Single Session TCP/IP Connections

   RFC 1795 defines the use of two unidirectional TCP/IP sessions
   between any pair of DLSw peers using read port number 2065 and write
   port number 2067.  Additionally, RFC 1795 allows for implementations
   to optionally use only one bi-directional TCP/IP session.  Using one
   TCP/IP session between DLSw peers is believed to significantly
   improve the performance and scalability of DLSw protocols.
   Performance is improved because TCP/IP acknowledgments are much more
   likely to be piggy-backed on real data when TCP/IP sessions are used
   bi-directionally.  Scalability is improved because fewer TCP control
   blocks, state machines, and associated message buffers are required.
   For these reasons, the DLSw enhancements defined in this paper
   REQUIRE the use of single session TCP/IP connections.

   Accordingly, DLSws implementing these enhancements must carry the TCP
   Connections Control Vector in their Capabilities Exchange.  In
   addition, the TCP Connections Control Vector must indicate support
   for 1 connection.

6.2.1 Expedited Single Session TCP/IP Connections

   In RFC 1795, single session TCP/IP connections are accomplished by
   first establishing two uni-directional TCP connections, exchanging
   capabilities, and then bringing down one of the connections.  In
   order to avoid the unnecessary flows and time delays associated with
   this process, a new single session bi-directional TCP/IP connection
   establishment algorithm is defined.

6.2.1.1 TCP Port Numbers

   DLSws implementing these enhancements will use a TCP destination port
   of 2067 (as opposed to RFC 1795 which uses 2065) for single session
   TCP connections.  The source port will be a random port number using
   the established TCP norms which exclude the possibility of either
   2065 or 2067.

6.2.1.2 TCP Connection Setup

   DLSw peers implementing these enhancements will establish a single
   session TCP connection whenever the associated peer is known to
   support this capability.  To do this, the initiating DLSw simply
   sends a TCP setup request to destination port 2067.  The receiving
   DLSw responds accordingly and the TCP three way handshake ensues.
   Once this handshake has completed, each DLSw is notified and the DLSw
   capabilities exchange ensues.  As in RFC 1795, no flows may take
   place until the capabilities exchange completes.

6.2.1.3 Single Session Setup Race Conditions

   The new expedited single session setup procedure described above
   opens up the possibility of a race condition that occurs when two
   DLSw peers attempt to setup single session TCP connections to each
   other at the same time.  To avoid the establishment of two TCP
   connections, the following rules are applied when establishing
   expedited single session TCP connections:

   1. If an inbound TCP connect indication is received on port 2067
      while an outbound TCP connect request (on port 2067) to the same
      DLSw (IP address) is in process or outstanding, the DLSw with the
      higher IP address will close or reject the connection from the
      DLSw with the lower IP address.
   2. To further expedite the process, the DLSw with the lower IP
      address may choose (implementation option) to close its connection
      request to the DLSw with the higher address when this condition is
      detected.
   3. If the DLSw with the lower IP address has already sent its
      capabilities exchange request on its connection to the DLSw with
      the higher IP address, it must resend its capabilities exchange
      request over the remaining TCP connection from its DLSw peer (with
      the higher IP address).
   4. The DLSw with the higher IP address must ignore any capabilities
      exchange request received over the TCP connection to be terminated
      (the one from the DLSw with the lower IP address).
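
   The four rules above can be summarized as a single decision made by
   each DLSw when it detects simultaneous connection attempts.  The
   Python sketch below is illustrative only.  The names are
   hypothetical, and it assumes that "higher IP address" means a
   numeric comparison of the 32-bit IPv4 addresses, which this document
   does not state explicitly.

```python
import ipaddress
from typing import NamedTuple


class RaceDecision(NamedTuple):
    keep: str                     # "outbound" or "inbound", local view
    close: str                    # the connection that goes away
    resend_capex: bool            # rule 3
    ignore_capex_on_closed: bool  # rule 4


def resolve_connect_race(local_ip: str, peer_ip: str,
                         capex_sent_on_outbound: bool = False) -> RaceDecision:
    """Decide which of two simultaneous port 2067 connections survives."""
    local = ipaddress.IPv4Address(local_ip)
    peer = ipaddress.IPv4Address(peer_ip)
    if local == peer:
        raise ValueError("local and peer addresses must differ")
    if local > peer:
        # Rules 1 and 4: the higher address rejects the inbound
        # connection and ignores any capabilities exchange seen on it.
        return RaceDecision("outbound", "inbound", False, True)
    # Rules 2 and 3: the lower address keeps the inbound connection,
    # may close its own request early, and resends its capabilities
    # exchange request if one was already sent.
    return RaceDecision("inbound", "outbound", capex_sent_on_outbound, False)
```

   Because both peers evaluate the same comparison, they reach
   complementary decisions and exactly one connection remains.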

6.2.1.4 TCP Connections with Non-Multicast Capable DLSw peers

   During periods of migration, it is possible that TCP connections
   between multicast capable and non-multicast capable DLSw peers will
   occur.  It is also possible that multicast capable DLSws may attempt
   to establish TCP connections with partners of unknown capabilities
   (e.g., statically defined peers).  To handle these conditions the
   following additional rules apply to expedited single session TCP
   connection setup:

   1. If the capability of a DLSw peer is not known, an implementation
      may choose to send the initial TCP connect request to either port
      2067 (expedited single session setup) or port 2065 (standard RFC
      1795 TCP setup).
   2. If a multicast capable DLSw receives an inbound TCP connect
      request on port 2065 while processing an outbound request on 2067
      to the same DLSw, the sending DLSw will terminate its 2067 request
      and respond as defined in RFC 1795 with an outbound 2065 request
      (standard RFC 1795 TCP setup).
   3. If a multicast capable DLSw receives an indication that the DLSw
      peer is not multicast capable (the port 2067 setup request times
      out or a port not recognized rejection is received), it will send
      another connection request using port 2065 and the standard RFC
      1795 session setup protocol.
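
   The port selection and fallback behavior in rules 1 and 3 above can
   be sketched as follows.  This Python is illustrative only.  The
   function name is hypothetical, and try_connect is an assumed
   callback supplied by the implementation that returns False when the
   setup request times out or is rejected.

```python
from typing import Callable, Optional

EXPEDITED_PORT = 2067  # expedited single session setup
RFC1795_PORT = 2065    # standard RFC 1795 TCP setup


def connect_with_fallback(peer_ip: str,
                          try_connect: Callable[[str, int], bool],
                          first_port: int = EXPEDITED_PORT) -> Optional[int]:
    """Return the port on which TCP setup succeeded, or None.

    Rule 1: when the peer's capability is unknown, either port may be
    tried first.  Rule 3: a failed attempt on port 2067 is followed by
    a standard RFC 1795 attempt on port 2065.
    """
    if first_port not in (EXPEDITED_PORT, RFC1795_PORT):
        raise ValueError("first_port must be 2067 or 2065")
    ports = [first_port]
    if first_port == EXPEDITED_PORT:
        ports.append(RFC1795_PORT)
    for port in ports:
        if try_connect(peer_ip, port):
            return port
    return None
```

   Passing the connect operation in as a callback keeps the policy
   separate from the socket handling and timeout values, which are
   left to the implementation.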

6.3 UDP Datagrams

   As mentioned above, UDP datagrams can be sent in two different ways:
   unicast (i.e., sent to a single unique IP address) or multicast
   (i.e., sent to an IP multicast address).  Throughout this document,
   the term UDP datagram will be used to refer to SSP messages sent over
   UDP, while unicast and multicast SSP messages will refer to the
   specific type/method of UDP packet transport.  In either case,
   standard UDP services are used to transport these packets.  In order
   to properly parse the inbound UDP packets and deliver them to the SSP
   state machines, all DLSw UDP packets will use the destination port of
   2067.
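
   Sending an SSP message as a unicast or multicast UDP datagram, as
   described above, might be sketched in Python as follows.  This is
   illustrative only.  The function name and the multicast TTL default
   are assumptions, and the payload is treated as an already encoded
   SSP message.

```python
import ipaddress
import socket

DLSW_UDP_PORT = 2067  # destination port for all DLSw UDP packets


def send_ssp_datagram(payload: bytes, dest_ip: str,
                      port: int = DLSW_UDP_PORT,
                      multicast_ttl: int = 16) -> int:
    """Send one SSP message over UDP and return the bytes sent.

    The same call serves unicast and multicast; only the destination
    address differs.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if ipaddress.ip_address(dest_ip).is_multicast:
            # Assumed TTL; the default of 1 would keep the datagram on
            # the local subnet.
            sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                            multicast_ttl)
        return sock.sendto(payload, (dest_ip, port))
    finally:
        sock.close()
```

   The port parameter exists only so the sketch can be exercised on a
   test port; DLSw traffic itself uses destination port 2067.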

   In addition, the checksum function of UDP remains optional for DLSw
   SSP messages.  It is believed that the inherent CRC capabilities of
   all data link transports will adequately protect SSP packets during
   transmission, and that the incremental exposure to intermediate nodal
   data corruption is negligible.  For further information on UDP packet