Network Working Group                                         M. Bagnulo
Internet-Draft                                                      UC3M
Intended status: Experimental                                 Y. Nishida
Expires: April 2, 2018                                GE Global Research
                                                      September 29, 2017


               TCP ESN: Extended Sequence Numbers for TCP
                     draft-bagnulo-tcpm-esn-00.txt

Abstract

   This note defines the Extended Sequence Number (ESN) experimental
   modification to TCP, which extends TCP's sequence number space using
   the TimeStamp (TS) option.  It also modifies the Window Scale (WS)
   option to support the larger receiver windows enabled by the extended
   sequence number space.  At this stage, the purpose of this document
   is to present different design choices in order to generate
   discussion about the approach to follow.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 2, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Bagnulo & Nishida         Expires April 2, 2018                 [Page 1]

Internet-Draft                   TCP ESN                  September 2017

Table of Contents

   1.  Overview
   2.  Design rationale
     2.1.  Reduced option space consumption in the SYN and graceful
           fallback
     2.2.  Deployability
   3.  RTTM With Extended Sequence Number Prefix
   4.  Middleboxes Implications
   5.  SACK for Extended Sequence Number
   6.  Impacts On Other TCP Extensions
     6.1.  PAWS
     6.2.  Eifel Detection Algorithm
   7.  Acknowledgments
   8.  Security Considerations
   9.  IANA Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Overview

   The proposed Extended Sequence Number (ESN) mechanism re-purposes the
   TS option [RFC7323] to carry a prefix for the sequence number and a
   prefix for the Acknowledgment number, increasing the size of the
   sequence number space used in TCP connections.

   As currently defined, the TS option contains two 32-bit fields, TSval
   and TSecr.  The current ESN proposal re-defines TSval to carry a
   prefix for the sequence number and TSecr to carry a prefix for the
   Acknowledgment number.
   In this way, the actual sequence number corresponding to the first
   data byte contained in the segment would be the concatenation of the
   value contained in the TSval and the value of the Sequence Number
   field of the TCP header.  The Acknowledgment sequence number would be
   the concatenation of the value contained in the TSecr and the value
   of the Acknowledgment Number field of the TCP header.

   The proposed ESN mechanism also modifies the WS option as follows:
   First, values up to 46 are allowed (enabling a RCV window up to
   2^62).  These are encoded in the 6 least significant bits of the
   shift.count.  Second, the remaining two (most significant) bits are
   turned into flags.  In particular, the most significant bit is used
   as the ESN flag to indicate ESN support in the connection.
   Specifically, when the ESN bit is set to 1 in the WS option carried
   in a SYN or a SYN/ACK, it means that: i) the TS option is being used
   for extended sequence numbers, as defined above, and ii) the sender
   of the WS option with the ESN bit set supports a receiver window of
   up to 2^62 in this connection.

   The ESN flag defined this way allows endpoints to express and
   negotiate ESN support during the TCP 3-way handshake.

   The sequence number of a TCP segment using ESN is the result of
   prepending the prefix carried in the TS Value to the sequence number
   contained in the Sequence Number field of the TCP header.  Similarly,
   the ACK number is the result of prepending the value in the TS Echo
   Reply to the value in the Acknowledgment Number field of the TCP
   header.

   When a client wants to use the extended sequence number for a new
   connection, it sends a SYN with both the TS and the WS options.  In
   the WS option, it sets the ESN flag to signal that it wants to use
   ESN for this connection.
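
   The concatenation rule and the modified WS encoding described above
   can be illustrated with a short Python sketch.  It is not part of the
   proposal: the function and constant names are hypothetical, a full
   32-bit prefix is assumed (the RTTM alternatives in Section 3 would
   shorten it), and the second flag bit of shift.count, whose meaning
   this document does not define, is left at zero.

```python
from typing import Tuple

FIELD_BITS = 32        # width of the Sequence/Acknowledgment Number fields
FIELD_MASK = (1 << FIELD_BITS) - 1
ESN_FLAG = 0x80        # most significant bit of the shift.count octet
SHIFT_MASK = 0x3F      # six least significant bits hold the shift count
MAX_ESN_SHIFT = 46     # largest shift count allowed when ESN is in use
MAX_LEGACY_SHIFT = 14  # largest shift count allowed by RFC 7323


def extended_number(prefix: int, field: int) -> int:
    """Concatenate a TS prefix (TSval or TSecr) with a 32-bit header field."""
    if not (0 <= prefix <= FIELD_MASK and 0 <= field <= FIELD_MASK):
        raise ValueError("prefix and field must each fit in 32 bits")
    return (prefix << FIELD_BITS) | field


def split_extended(number: int) -> Tuple[int, int]:
    """Split an extended number into (prefix, 32-bit header field)."""
    if not 0 <= number < (1 << (2 * FIELD_BITS)):
        raise ValueError("extended number must fit in 64 bits")
    return number >> FIELD_BITS, number & FIELD_MASK


def encode_shift_count(shift: int, esn: bool) -> int:
    """Build the shift.count octet; the ESN flag goes in the top bit."""
    limit = MAX_ESN_SHIFT if esn else MAX_LEGACY_SHIFT
    if not 0 <= shift <= limit:
        raise ValueError(f"shift count must be in 0..{limit}")
    return (ESN_FLAG if esn else 0) | shift


def decode_shift_count(octet: int) -> Tuple[int, bool]:
    """Return (shift count, ESN flag) from a received shift.count octet."""
    return octet & SHIFT_MASK, bool(octet & ESN_FLAG)


if __name__ == "__main__":
    # Prefix 1 with Sequence Number 0 names the byte just past 2^32 - 1.
    print(extended_number(1, 0) == 2**32)
    # A 16-bit window field shifted by 46 stays below 2^62.
    print((0xFFFF << MAX_ESN_SHIFT) < 2**62)
    print(decode_shift_count(encode_shift_count(46, esn=True)))
```

   A receiver that sees the ESN flag cleared in the peer's WS option
   would simply skip extended_number() and use the 32-bit header fields
   as in regular TCP.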

   The client encodes the most significant bits of the sequence number
   in the TS Value and the remaining bits of the extended sequence
   number in the Sequence Number field of the TCP header.  Since the ACK
   flag is not set in the TCP header of the SYN packet, the TS Echo
   Reply is set to zero (as defined in [RFC7323]).

   If the server also supports the extended sequence number mechanism,
   the server replies with a SYN/ACK carrying both the TS and WS
   options.  In the WS option it sets the ESN flag to confirm ESN
   support.  It encodes the prefix of its own extended sequence number
   in the TS Value and the prefix of the ACK in the TS Echo Reply.

   If the server does not support ESN, it will respond with a SYN/ACK
   containing a WS option carrying a value no greater than 14, i.e.,
   with the most significant bit set to 0.  It may also include the TS
   option, indicating its willingness to use timestamps as defined in
   RFC7323 in this connection.  Upon reception of the SYN/ACK, the
   client can gracefully fall back to using TS as defined in RFC7323; in
   particular, PAWS can be used.

2.  Design rationale

   Our proposal is to re-utilize the TCP TS option to carry a sequence
   number offset in addition to the existing 32 bit sequence number.
   This approach is similar to [I-D.looney-tcpm-64-bit-seqnos], with one
   distinct difference: while [I-D.looney-tcpm-64-bit-seqnos] proposes
   to allocate a new TCP option, we propose to use the existing TS
   option instead.  We believe this approach has the following
   advantages.

2.1.  Reduced option space consumption in the SYN and graceful fallback

   The maximum size of the TCP header (including options) is 60 bytes
   (this is because the Data Offset field of the TCP header is 4 bits
   and expresses the offset in 32-bit words).  Since the basic TCP
   header is 20 bytes, a segment can carry 40 bytes of options at most.
   This is particularly pressing for the TCP SYN and TCP SYN/ACK
   packets.
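
   The header-size arithmetic above can be checked with a few lines of
   Python.  This is only an illustrative sketch: the names are
   hypothetical, and padding of the option area to a 32-bit boundary is
   not modelled.

```python
DATA_OFFSET_BITS = 4      # width of the Data Offset field
WORD_BYTES = 4            # Data Offset counts 32-bit words
BASIC_HEADER_BYTES = 20   # TCP header without options

# Largest header the Data Offset field can express: 15 words of 4 bytes.
MAX_HEADER_BYTES = (2**DATA_OFFSET_BITS - 1) * WORD_BYTES
# Space left for options once the basic header is accounted for.
MAX_OPTION_BYTES = MAX_HEADER_BYTES - BASIC_HEADER_BYTES


def remaining_option_space(option_lengths):
    """Return the option bytes still free, or raise if the options do not fit."""
    used = sum(option_lengths)
    if used > MAX_OPTION_BYTES:
        raise ValueError(
            f"options need {used} bytes, only {MAX_OPTION_BYTES} available"
        )
    return MAX_OPTION_BYTES - used


if __name__ == "__main__":
    print(MAX_HEADER_BYTES, MAX_OPTION_BYTES)
    # TS option (10 bytes) plus WS option (3 bytes), as defined in RFC 7323.
    print(remaining_option_space([10, 3]))
```

   Passing the lengths of the options planned for a SYN to
   remaining_option_space() shows at a glance whether a further option
   would still fit.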

   Currently, a fair number of options are frequently carried in SYN
   packets, especially in high performance communications.  In
   particular, the MSS option (4 bytes) [RFC0793], the SACK permitted
   option (2 bytes) [RFC2018], the Window Scale option (3 bytes) and the
   TimeStamp option (used for PAWS) (10 bytes) [RFC7323].  All these
   options account for 19 bytes.  There are other options that are
   becoming increasingly popular.  For instance, the option length of
   TCP Fast Open (TFO) [RFC7413] is 6 bytes or 18 bytes depending on the
   length of the cookie used.  There are other options that require SYN
   and SYN/ACK option space, such as MP_CAPABLE in [RFC6824] or TCP-AO
   [RFC5925].  This means that, for instance, a TCP client that would
   like to initiate a connection including the MSS option, the SACK
   permitted option, the WS and TS options and also an 18-byte TFO
   option would use 37 of the 40 available bytes and would not have room
   to carry an additional 10 byte long option for the extended sequence
   number.  Since our approach uses the TS option, no additional option
   space is needed for the extended sequence number.

   The proposed ESN approach allows for using the extended sequence
   number if both endpoints support it, while enabling graceful
   fallback.  A client supporting ESN would include the TS option and
   set the flag in the WS option indicating ESN support.  If the server
   does not support ESN, the connection can still be established using
   32 bit sequence numbers and the TS and WS options as defined in
   RFC7323 (in particular, PAWS can be used in the connection).

2.2.  Deployability

   [HONDA11] reported that unknown options in the SYN are more likely to
   be removed than known options.  Hence, we believe that using existing
   options will have a better chance of avoiding unwanted middlebox
   interference.  It would nevertheless be useful to perform further
   measurements specifically on how frequently the TS option is removed.

3.  RTTM With Extended Sequence Number Prefix

   [RFC7323] defined two uses for the TS option: PAWS and RTTM.  When
   re-purposing the TS option for ESN, we argue that the use of TS for
   carrying the extended sequence number subsumes the use for PAWS.
   However, this is not the case for RTTM.  We identify the following
   alternatives in order to achieve RTTM when re-purposing the TS option
   for ESN.

   Option 1:

   This approach uses the most significant bit (MSB) of both TSval and
   TSecr as a flag, as depicted in Figure 1.  If the MSB is set to 1,
   the field contains a sequence number prefix.  If it is reset, the
   field contains a timestamp.  This means that we use 31 bits for the
   extended sequence number prefix, resulting in 63 bit long sequence
   numbers.

   The main problem here is that the segments containing the timestamp
   lack the sequence number prefix information.  So, for instance, it is
   not possible to have more than 2^32 bytes in flight if any of the
   segments in flight is carrying an actual timestamp, since there is
   the possibility of confusion (in particular, if the receive window is
   large enough to accommodate two packets with the same 32 bit sequence
   number, then the receiver would not be able to figure out the right
   place for the packet that carries the timestamp and does not carry
   the sequence number prefix).  So, if we want to use this option, the
   receiver window cannot be larger than 2^32.  However, this
   restriction does not address all the problems.  If a duplicated
   packet carrying a timestamp in the TS option is delayed by one RTT or
   more and the 32 bit sequence number wraps around, then the receiver
   can potentially mistake this old duplicated packet for a new packet
   with the same sequence number suffix.  It would be possible to rely
   on PAWS for detecting and eliminating these packets.
   However, in order for PAWS to be used, it is necessary to keep the
   timestamp information stored in TS.recent updated.  This requires
   that at least a few actual timestamps are exchanged every 2^31
   sequence numbers.  Summarizing, the constraints to use this option
   are, first, that the flight-size is less than 2^32 and, second, that
   at least n (n=4?) timestamps are exchanged every 2^32 bytes of data.
   We believe this is a poor alternative, especially due to the
   flight-size constraint.

   +-------+-------+-+---------------------+-+----------------------+
   |Kind=8 |  10   |F|   TSval or Prefix   |F|   TSecr or Prefix    |
   +-------+-------+-+---------------------+-+----------------------+
       8       8    1          31           1           31

            Figure 1: Time Stamp Option format for Option 1

   Option 2:

   This approach uses the TSecr in some packets to exchange timestamps.
   The idea here is that all data segments carry the extended sequence
   number prefix in the TSval but that some packets do not carry ACK
   information, which is acceptable because we use cumulative ACKs, as
   long as this only affects a few packets (e.g. one packet per RTT does
   not carry ACK information).  In order to enable both uses of the
   TSecr (timestamp or sequence number prefix), we need to use 2 bits to
   encode whether the TSecr carries an extended sequence number prefix
   for the ACK, a timestamp or a timestamp echo.  This implies that
   there are 30 bits left in TSecr for the actual value, resulting in 30
   bit timestamps and 62 bit sequence numbers.

   The receiver of a packet whose TS option carries an actual timestamp
   or timestamp echo should discard the ACK information, since it cannot
   know the prefix of the sequence number carried in the ACK field.

   This option seems a reasonable trade-off.  If this option is adopted,
   RTTM could only be used sporadically.
   However, this may not be a concern, since it is likely that it would
   be possible to measure the RTT at least once every RTT, which is
   likely to be enough for estimating the RTT for the RTO calculation
   (see [RFC7323] for further details).

   +-------+-------+--+--------------------+--+---------------------+
   |Kind=8 |  10   |F |  TSval or Prefix   |F |   TSecr or Prefix   |
   +-------+-------+--+--------------------+--+---------------------+
       8       8    2           30          2           30

            Figure 2: Time Stamp Option format for Option 2

   Option 3:

   This approach splits the TSval and the TSecr into two 16-bit fields
   each, resulting in 16 bit timestamps and 48 bit sequence numbers.  48
   bit sequence numbers are a significant improvement over the current
   32 bit sequence numbers, so they are probably enough.  It is possible
   to encode the timestamp information using 16 bits.  For example,
   [I-D.trammell-tcpm-timestamp-interval] proposes to encode timestamp
   information using 16 bits, which could be used in this option.

   +-------+-------+-----------+-----------+------------+-----------+
   |Kind=8 |  10   |  Prefix   |   TSval   |   Prefix   |   TSecr   |
   +-------+-------+-----------+-----------+------------+-----------+
       8       8        16          16           16          16

            Figure 3: Time Stamp Option format for Option 3

   Option 4:

   This approach uses the TS option for a single purpose per connection:
   either the original purpose or ESN.  This is less attractive because
   RTTM cannot be used with ESN in the same connection.

   +-------+-------+-----------------------+------------------------+
   |Kind=8 |  10   |        Prefix         |         Prefix         |
   +-------+-------+-----------------------+------------------------+
       8       8              32                       32

            Figure 4: Time Stamp Option format for Option 4

   Based on the observations above, we believe options 2 and 3 are worth
   further discussion, while options 1 and 4 can be discarded due to
   major drawbacks.

4.  Middleboxes Implications

   It has been observed in [HONDA11] that some middleboxes insert the TS
   Option.
   Also, there may be middleboxes that modify the sequence number while
   not terminating the connection.  In order to detect these cases,
   which would break the proposed mechanism, it would be beneficial to
   add an extra safety measure requiring that the prefix encoded in the
   TS Option replicates the most significant bits of the value included
   in the Sequence Number field.  In this way, a server supporting the
   extended sequence number mechanism can not only verify the flag in
   the WS option, but also check whether the TS value matches the 31
   most significant bits of the Sequence Number field in the TCP header.
   If they do not match, the server should not negotiate the use of the
   extended sequence number mechanism (i.e. it replies with the WS
   option resetting the flag for the extended sequence number
   mechanism).  This is adopted from [I-D.looney-tcpm-64-bit-seqnos].

   If the server is a legacy server, it will reply without the WS option
   or with a WS option with a shift.count value lower than 15.  In this
   case, the client falls back to regular TCP without the extended
   sequence number and with regular timestamps.

5.  SACK for Extended Sequence Number

   In the case of SACK blocks, there are two possible complementary
   approaches:

   1.  We use the currently defined SACK option, identifying blocks
       using 32 bit sequence numbers.  When it is used in a connection
       that has successfully negotiated ESN, the prefix carried in the
       TSecr of the segment also applies to the sequence numbers
       identifying the SACK blocks.  The limitation of such an approach
       is that all SACK blocks in a single SACK option must use the same
       prefix, which prevents SACKing older blocks.  However, it is not
       certain that we really need to report a wide range of SACK blocks
       in a single SACK option.  Another issue would be the case where a
       SACK option is detached from the original packet and attached to
       a different one.
       One possible mitigation for this would be to discard SACK
       information that looks suspicious, since SACK information is
       optional and is usually carried in multiple ACKs.

   2.  Define a new SACK block option for extended sequence numbers, as
       proposed in [I-D.looney-tcpm-64-bit-seqnos].

   There are a couple of observations regarding the last approach, which
   uses the new SACK block option.  First, note that the current SACK
   permitted option could still be used.  Hence, if a connection
   negotiated both SACK and ESN, we may presume that it supports the new
   SACK block option.  If the ESN negotiation fails, it means that
   32-bit SACK blocks are to be used for that connection, providing
   graceful fallback.

6.  Impacts On Other TCP Extensions

   Since this proposal repurposes the existing timestamp option, other
   proposals that use the option will be affected.  We investigated the
   impacts on the following TCP extensions and propose modifications to
   make them work with the proposal.

6.1.  PAWS

   In order to perform PAWS, receivers need to check whether the
   timestamp option in an arriving packet contains a sequence number
   prefix or timestamp info by checking the most significant bit.  If it
   contains timestamp info, the receiver processes the timestamp info as
   described in Section 5.3 of [RFC7323].  If it contains a sequence
   number prefix, the receiver can determine the extended sequence
   number of the packet based on that info.  If the extended sequence
   number is outside of the window, the packet will be discarded, as in
   PAWS.

6.2.  Eifel Detection Algorithm

   If the Eifel detection algorithm [RFC3522] is activated, senders
   perform the logic described in Section 3.2 of [RFC3522] with the
   following two modifications.  First, a TCP sender MUST set timestamp
   info when it retransmits packets.  Second, if a TCP sender receives
   an ACK with a sequence number prefix for the retransmitted packet, it
   should treat it as if the timestamp is smaller than the value of
   RetransmitTS.

7.  Acknowledgments

8.  Security Considerations

9.  IANA Considerations

10.  References

10.1.  Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, DOI 10.17487/RFC0793, September 1981.

   [RFC7323]  Borman, D., Braden, B., Jacobson, V., and R.
              Scheffenegger, Ed., "TCP Extensions for High Performance",
              RFC 7323, DOI 10.17487/RFC7323, September 2014.

10.2.  Informative References

   [HONDA11]  Honda, M., Nishida, Y., Raiciu, C., Greenhalgh, A.,
              Handley, M., and H. Tokuda, "Is it still possible to
              extend TCP?", ACM IMC 2011, 2011.

   [I-D.looney-tcpm-64-bit-seqnos]
              Looney, J., "64-bit Sequence Numbers for TCP", draft-
              looney-tcpm-64-bit-seqnos-00 (work in progress), March
              2017.

   [I-D.trammell-tcpm-timestamp-interval]
              Scheffenegger, R., Kuehlewind, M., and B. Trammell,
              "Encoding of Time Intervals for the TCP Timestamp Option",
              draft-trammell-tcpm-timestamp-interval-01 (work in
              progress), July 2013.

   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", RFC 2018,
              DOI 10.17487/RFC2018, October 1996.

   [RFC3522]  Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm
              for TCP", RFC 3522, DOI 10.17487/RFC3522, April 2003.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
              June 2010.

   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
              "TCP Extensions for Multipath Operation with Multiple
              Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013.

   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014.

Authors' Addresses

   Marcelo Bagnulo
   UC3M

   Email: marcelo@it.uc3m.es


   Yoshifumi Nishida
   GE Global Research
   2623 Camino Ramon
   San Ramon, CA  94583
   USA

   Email: nishida@wide.ad.jp