Network Working Group                                           Q. Xiong
Internet-Draft                                           ZTE Corporation
Intended status: Informational                                    K. Yao
Expires: 8 June 2025                                        China Mobile
                                                                C. Huang
                                                           China Telecom
                                                                  Z. Han
                                                            China Unicom
                                                                 J. Zhao
                                                                   CAICT
                                                         5 December 2024

       Problem Statement for High Performance Wide Area Networks
                 draft-xiong-hpwan-problem-statement-00

Abstract

High Performance Wide Area Network (HP-WAN) is designed for many applications such as scientific research, academia, education and other data-intensive applications which demand large volume data transmission over WANs, and it needs to ensure large-scale data processing and provide efficient transmission services. This document outlines the problems for HP-WANs.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 8 June 2025.

Copyright Notice

Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.

Xiong, et al.              Expires 8 June 2025                  [Page 1]
Internet-Draft   Problems Statement for High Performance   December 2024

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Terminology
   3.  High-performance Goals for HP-WANs
   4.  Problem Statements
     4.1.  Long-distance Delay and Slow Feedback
     4.2.  Coarse-grained Exploitation of Network Capacities
     4.3.  Instantaneous Traffic
     4.4.  Incast Congestion upon Bottleneck Links
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
   Authors' Addresses

1.  Introduction

As described in [I-D.kcrh-hpwan-state-of-art], data is fundamental for research, academia, education, industrial and other data-intensive applications, such as High Performance Computing (HPC) for scientific research, cloud storage and backup of industrial internet data, distributed training of Artificial Intelligence (AI), and so on. These applications may generate huge volumes of data using advanced instruments and high-end computing devices. They need large-scale data transfers to finish within a completion time, with stable and efficient transmission services over non-dedicated Wide Area Networks (WANs).
These WANs need to connect research institutions, universities, and data centers across large geographical areas, and they usually require massive data transmission over long-distance links. For example, data shared between research institutes may have to travel hundreds or thousands of kilometers. Moreover, some applications may demand periodic or on-demand migration with variable transmission frequency, requiring timely data transmission. Large data transfers that co-exist with other services over WANs demand high performance, such as effective high throughput, fairness among multiple services, and high network utilization.

More recently, massive data transmission and long-distance connection over complicated WANs have become a key factor affecting the performance of existing technologies. For example, high-volume data may be transmitted over WANs using transport-layer protocols and technologies such as the Transmission Control Protocol (TCP), QUIC, Remote Direct Memory Access (RDMA) and so on. The traditional congestion control mechanisms, which are typically implemented at the host (sender and receiver) to control or prevent congestion, cannot achieve the required performance. The host may adjust its sending rate based on feedback from the network when packet loss or congestion occurs, but the long feedback loop impacts performance, and the adjustment can be inefficient without fine-grained awareness of network capability. The network, for its part, always forwards packets reactively, which leads to low bandwidth utilization due to bottleneck links and instantaneous congestion.
For example, the network could enhance its capability to regulate traffic so as to avoid incast congestion preemptively, and it could actively collaborate with the host to adjust the rate efficiently and rapidly when congestion occurs. Negotiation between the host and the network is required to assist the network operator's traffic management, bandwidth allocation and utilization optimization, and to help the host adjust its rate based on the network's resource scheduling acknowledgement. So hosts with sophisticated congestion control, combined with more active network coordination, should be considered to improve overall HP-WAN transmission performance.

High Performance Wide Area Network (HP-WAN) is designed specifically to meet the high-speed, low-latency, and high-capacity needs of massive data set applications, which puts forward high performance requirements such as effective high throughput, fairness among multiple services and high bandwidth utilization. This document outlines the problems for HP-WANs.

1.1.  Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2.  Terminology

The terminology is defined as follows.

High Performance Wide Area Networks (HP-WANs): wide area networks designed specifically to meet the high-speed, low-latency, and high-capacity needs of research, academia, education, industrial and other data-intensive applications. The primary goal of an HP-WAN is to achieve massive data transmission within a completion time, which puts forward high performance requirements such as effective high throughput, fairness among multiple services and high bandwidth utilization.
This document also makes use of the following abbreviations and definitions:

DC:    Data Center
DCI:   Data Center Interconnection
HPC:   High Performance Computing
WAN:   Wide Area Network
MAN:   Metropolitan Area Network
PFC:   Priority Flow Control
ECN:   Explicit Congestion Notification
ECMP:  Equal-Cost Multipath
RTT:   Round-Trip Time
TCP:   Transmission Control Protocol
RDMA:  Remote Direct Memory Access
QUIC:  UDP-based multiplexed and secure transport (a name, not an acronym)

3.  High-performance Goals for HP-WANs

The services to be provided in HP-WANs mainly involve massive data with timely transmission, while multiple services may co-exist over long-distance networks, as described below.

* Massive data transmission: bulk or high-volume data transfer, e.g. a flow could carry 2 Gbps to 1 Tbps.

* Timely data transmission: a transfer has a completion time but no strict real-time requirement, e.g. from milliseconds to minutes.

* Predictable transmission: the transmission frequency is variable but predictable, e.g. a periodic or on-demand migration.

* Long-distance transmission over non-dedicated WANs, between one or more sites or DCs, e.g. more than 100 km or 1000 km.

* Multiple services co-exist with concurrent flows.

* Minimize cost.

* Data security and integrity.

From the application perspective, an HP-WAN flow requires effective high-throughput data transmission to meet a completion time. Moreover, it is also crucial to maximize bandwidth utilization while ensuring fairness among multiple services. This document outlines the high-performance requirements for HP-WANs as described below.

* Effective high-throughput: HP-WANs put forward high performance requirements for the throughput of high-volume data transmission within a completion time over WANs.
  Throughput is affected by performance indicators such as bandwidth, packet loss ratio and latency; for example, throughput decreases as packet loss and RTT increase. Ultra-high goodput, ultra-low packet loss ratio, low latency and resilience are required to ensure effective high-throughput transmission in HP-WANs.

* Multiple services fairness: HP-WANs put forward high performance requirements for fairness when multiple services co-exist with concurrent flows. Fairness means that different types of services obtain reasonable resources in network resource allocation and management, so that each meets its quality of service (QoS) requirements while resources are allocated fairly.

* Ultra-high bandwidth utilization: HP-WANs put forward high performance requirements for the bandwidth utilization of the network. The available network capacity needs to be used efficiently to maximize data transfer rates and minimize latency, which keeps cost low in HP-WANs. A bandwidth utilization rate exceeding 90% is required to ensure that network resources are fully utilized.

4.  Problem Statements

It will be challenging to provide effective high-performance transmission in HP-WAN scenarios with massive concurrent services, long-distance delays and packet loss. Long-distance networks may have more uncertainties, such as long Round-Trip Time (RTT), routing changes, network congestion, packet loss and link quality fluctuations, all of which may have a negative impact on the throughput.
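The negative effect of packet loss and RTT on throughput can be illustrated with the well-known Mathis et al. approximation for a Reno-style, loss-based flow. The following is a minimal sketch and not part of this document's requirements: the function name, the 1460-byte MSS and the sample loss ratio are assumptions chosen for illustration only.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state throughput (bits/s) of a Reno-style flow.

    Uses the Mathis et al. model: rate ~ (MSS / RTT) * (C / sqrt(p)),
    with C = sqrt(3/2). It is a rough model, not a prediction for CUBIC or BBR.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3.0 / 2.0)
    return (mss_bytes * 8.0 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Illustrative values only: same loss ratio, increasing RTT.
    for rtt_ms in (1, 10, 100):
        rate = mathis_throughput_bps(1460, rtt_ms / 1000.0, 1e-4)
        print(f"RTT {rtt_ms:>3} ms -> about {rate / 1e6:.1f} Mbit/s")
```

Under this model, throughput is inversely proportional to RTT and to the square root of the loss ratio, which is why a loss ratio that is tolerable inside a data center can be severely limiting over a long-distance path.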
The services are massive and concurrent, with multiple types and different traffic models. Elephant flows, with short intervals, high speed and large data scale, may occupy a large amount of network resources and lead to unfairness among flows, low network utilization and poor cost-effectiveness. The existing network technologies have various problems and cannot meet the performance requirements. This document outlines the problems for HP-WANs.

4.1.  Long-distance Delay and Slow Feedback

Several kinds of congestion control algorithms are deployed, such as loss-based algorithms (e.g. Reno and CUBIC, which depend on congestion notification through packet loss) and congestion-based algorithms (e.g. BBR, which depends on measurement of congestion). Long-distance transmission delays and large RTTs delay the feedback of network state, so the transmission rate cannot be adjusted in a timely manner. This makes it challenging for congestion control in WANs to limit the total amount of data entering the network and keep traffic at an acceptable level. Feedback should be independent of the transmission distance and as timely as possible.

For example, Explicit Congestion Notification (ECN) can be used with Reno and CUBIC to achieve end-to-end congestion notification based on the IP and transport layers. When congestion occurs, the network may signal it by ECN marking or by dropping packets, and the receiver passes this information back to the sender in transport-layer acknowledgements, notifying the source to adjust its transmission rate. Long distance delays the notification and slows the feedback, which results in untimely adjustment and buffer overflow and decreases network performance. Especially for incast congestion, where multiple sources target the same destination, the network needs to send fast feedback based on the offered load.
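To give a sense of scale for the slow feedback loop, the following minimal sketch estimates how much data a sender has already injected before any congestion signal can return. The helper names and the figure of about 5 microseconds of one-way propagation delay per kilometer of fiber are assumptions for illustration; real paths add queuing, processing and routing detours on top of this lower bound.

```python
# Assumption: light in fiber travels at roughly 200,000 km/s,
# i.e. about 5 microseconds of one-way delay per kilometer.
FIBER_DELAY_S_PER_KM = 5e-6


def propagation_rtt_s(distance_km: float) -> float:
    """Lower bound on round-trip time from propagation delay alone."""
    return 2.0 * distance_km * FIBER_DELAY_S_PER_KM


def bytes_sent_before_feedback(rate_bps: float, distance_km: float) -> float:
    """Bytes a sender emits during one RTT, before feedback can reach it."""
    return rate_bps * propagation_rtt_s(distance_km) / 8.0


if __name__ == "__main__":
    for distance_km in (100, 1000, 5000):
        rtt_ms = propagation_rtt_s(distance_km) * 1e3
        in_flight_mb = bytes_sent_before_feedback(100e9, distance_km) / 1e6
        print(f"{distance_km:>5} km: RTT >= {rtt_ms:.0f} ms, "
              f"{in_flight_mb:.1f} MB sent at 100 Gbit/s before feedback")
```

With these assumptions, a 100 Gbit/s flow over 1000 km has about 125 MB in flight before the first ECN mark or loss indication can influence its rate, which is typically far more than a single switch port can buffer.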
BBR actively measures the bottleneck bandwidth (BtlBw) and the round-trip propagation time (RTprop), uses its model to calculate the bandwidth-delay product (BDP), and then adjusts the transmission rate to maximize throughput and minimize latency. But BBR relies on real-time measurement of these parameters, which may vary greatly and are fed back slowly, thereby affecting BBR's control precision in long-distance networks. Moreover, other congestion control algorithms such as Data Center Quantized Congestion Notification (DCQCN) and High Precision Congestion Control (HPCC++) would not tolerate the slow feedback loop over WANs.

4.2.  Coarse-grained Exploitation of Network Capacities

The existing congestion control mechanisms focus on rate adjustment: they control the sending rate of data flows at the source, thereby avoiding or reducing network congestion. It is challenging for the host to adjust its sending rate efficiently without awareness of network capacity.

For example, as per [RFC9438], when CUBIC detects congestion through packet loss or a classic ECN mark, it reduces the congestion window by its multiplicative decrease factor, which gives the sending rate a sawtooth pattern. L4S, as per [RFC9330], uses more frequent ECN marking to provide low latency and scalable throughput, reduce the convergence time and eliminate the sawtooth effect. However, ECN congestion feedback and frequent rate adjustment can cause significant changes in throughput, which affects bandwidth utilization and transmission efficiency. The host still lacks accurate network information, which is critical because there can be a significant gap between the appropriate sending rate and the available network capacity, especially when transmitting high-volume data over WANs.
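The sawtooth behavior of CUBIC on a long, fat path can be sketched from the window growth function in [RFC9438], W_cubic(t) = C * (t - K)^3 + W_max, with K = cbrt(W_max * (1 - beta) / C). The following is an illustrative sketch only: it uses the constants C = 0.4 and beta = 0.7 from that RFC, ignores the Reno-friendly region and slow start, and assumes a single congestion event followed by a loss-free recovery. The function names and the example path (10 Gbit/s, 100 ms RTT, 1460-byte segments) are assumptions.

```python
CUBIC_C = 0.4      # scaling constant from RFC 9438 (segments per second cubed)
CUBIC_BETA = 0.7   # multiplicative decrease factor from RFC 9438


def cubic_k(w_max_segments: float) -> float:
    """Seconds needed to grow back to w_max after one window reduction."""
    return (w_max_segments * (1.0 - CUBIC_BETA) / CUBIC_C) ** (1.0 / 3.0)


def cubic_window(t_s: float, w_max_segments: float) -> float:
    """Congestion window (segments) t_s seconds after the reduction."""
    return CUBIC_C * (t_s - cubic_k(w_max_segments)) ** 3 + w_max_segments


if __name__ == "__main__":
    # Assumed path: 10 Gbit/s bottleneck, 100 ms RTT, 1460-byte segments.
    bdp_segments = (10e9 * 0.1 / 8.0) / 1460.0
    k = cubic_k(bdp_segments)
    print(f"window at the loss event: {bdp_segments:.0f} segments")
    print(f"window right after:       {cubic_window(0.0, bdp_segments):.0f} segments")
    print(f"time to regain the window: {k:.1f} s")
```

For this assumed path the window drops to 70% of the BDP and takes roughly 40 seconds to return to its previous size, so a single congestion event leaves the bottleneck under-used for a large part of a bulk transfer.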
Moreover, the sending rate of the host becomes inconsistent with the transmission capability of the network, which makes accurate rate adjustment difficult. For example, when determining the starting rate of data transmission, slow start leads to an overall throughput bottleneck with insufficient bandwidth utilization and fails to exploit the full potential of the network capacity. A fast start, on the other hand, cannot adapt to the buffer capacity of network devices, especially when multiple flows are transmitted over the same link, and so causes network congestion, packet loss and transmission delay. For HP-WANs, fine-grained network-aware negotiation of the sending rate needs to consider factors such as predictable network bandwidth, latency and packet loss rate, while balancing bandwidth utilization and congestion avoidance in WANs.

4.3.  Instantaneous Traffic

From the network perspective, the network can only forward high-volume data reactively; it does not schedule the predictable traffic and network resources so as to anticipate congestion. Without awareness of instantaneous traffic, which occupies a large amount of network resources, the network allocates resources unevenly and achieves low bandwidth utilization.

For example, in HP-WAN applications a large amount of data will be transmitted, e.g. the data volume of a single flow may be from 10G to 1TB. Massive data transfers with large bursts may cause instantaneous congestion, packet loss, and queuing delay within network devices in WANs. Traffic aggregates more at the edge of WANs, and bursts may accumulate as flows traverse, join, and separate over hops. Congestion caused by such bursty traffic is difficult for congestion control to manage.
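The effect of a burst on a network device can be approximated with a simple fluid model. The sketch below is an assumption-laden illustration, not a description of any particular device: it treats arrival and drain rates as constant during the burst, uses a single tail-drop buffer that starts empty, and all names and numbers are hypothetical.

```python
def queue_after_burst_bytes(arrival_bps: float, drain_bps: float,
                            burst_s: float, buffer_bytes: float):
    """Return (queued_bytes, dropped_bytes) at the end of a constant-rate burst."""
    # Bytes arriving faster than the link can drain them during the burst.
    excess_bytes = max(0.0, (arrival_bps - drain_bps) * burst_s / 8.0)
    queued = min(excess_bytes, buffer_bytes)
    dropped = excess_bytes - queued
    return queued, dropped


def queuing_delay_s(queued_bytes: float, drain_bps: float) -> float:
    """Time needed to drain the standing queue at the link rate."""
    return queued_bytes * 8.0 / drain_bps


if __name__ == "__main__":
    # Assumed case: two 100 Gbit/s bursts meet on one 100 Gbit/s link
    # for 10 ms, in front of a 64 MB buffer.
    queued, dropped = queue_after_burst_bytes(200e9, 100e9, 0.010, 64e6)
    delay_ms = queuing_delay_s(queued, 100e9) * 1e3
    print(f"queued {queued / 1e6:.0f} MB, dropped {dropped / 1e6:.0f} MB, "
          f"added delay {delay_ms:.2f} ms")
```

In this assumed case the burst offers 125 MB more than the link can carry, so the buffer fills completely, the remainder is dropped, and packets behind the queue see several milliseconds of extra delay. A host reacting only after one long RTT cannot prevent this.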
Moreover, goodput bottlenecks, together with constraints on transmission completion time and duration, make traffic scheduling challenging. The applications may have multiple concurrent services that co-exist with existing dynamic flows. Considering the multiple services with various types and different traffic requirements, the traffic needs to be scheduled onto multiple paths and fine-grained network resources to achieve high utilization and guaranteed QoS.

4.4.  Incast Congestion upon Bottleneck Links

Incast congestion caused by limited bottleneck link bandwidth is challenging in long-distance, multi-hop networks. It is difficult to control packet loss, queuing latency and jitter, which leads to a decrease in throughput. Incast traffic from greedy senders is a primary cause of congestion, and the network may regulate it to avoid congestion preemptively. The network may proactively avoid path-level congestion by actively reserving and allocating bandwidth through a scheduler so that the offered load matches the bottleneck link bandwidth as closely as possible, thus fully utilizing bandwidth and preventing packet loss.

Moreover, congestion in the network, and the packet loss caused by buffer overflow, can be reduced through effective flow control. Flow control here refers to a method of ensuring that data is transmitted efficiently and reliably by controlling the transmission rate, so that a fast sender does not overwhelm a slow receiver and packets are not lost in congested situations. But it will be challenging to ensure fairness among multiple services over different distances, because network resources are allocated unequally among flows with different RTTs. For example, some flows may occupy more bandwidth due to the use of large window sizes, smaller RTTs, or larger packets.

5.  Security Considerations

This document covers several representative applications and network scenarios that are expected to make use of HP-WAN technologies. The potential use cases do not themselves raise any security concerns or issues, but each may have security considerations from both the use-specific perspective and the technology-specific perspective.

6.  IANA Considerations

This document makes no requests for IANA action.

7.  Acknowledgements

The authors would like to acknowledge Guangping Huang, Yao Liu and Zheng Zhang for their thorough review and very helpful comments.

8.  References

8.1.  Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, September 2001.

[RFC7424]  Krishnan, R., Yong, L., Ghanwani, A., So, N., and B. Khasnabish, "Mechanisms for Optimizing Link Aggregation Group (LAG) and Equal-Cost Multipath (ECMP) Component Link Utilization in Networks", RFC 7424, DOI 10.17487/RFC7424, January 2015.

[RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

[RFC8664]  Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W., and J. Hardwick, "Path Computation Element Communication Protocol (PCEP) Extensions for Segment Routing", RFC 8664, DOI 10.17487/RFC8664, December 2019.

[RFC9232]  Song, H., Qin, F., Martinez-Julia, P., Ciavaglia, L., and A. Wang, "Network Telemetry Framework", RFC 9232, DOI 10.17487/RFC9232, May 2022.

[RFC9330]  Briscoe, B., Ed., De Schepper, K., Bagnulo, M., and G. White, "Low Latency, Low Loss, and Scalable Throughput (L4S) Internet Service: Architecture", RFC 9330, DOI 10.17487/RFC9330, January 2023.

[RFC9438]  Xu, L., Ha, S., Rhee, I., Goel, V., and L. Eggert, Ed., "CUBIC for Fast and Long-Distance Networks", RFC 9438, DOI 10.17487/RFC9438, August 2023.

Authors' Addresses

Quan Xiong
ZTE Corporation
China
Email: xiong.quan@zte.com.cn

Kehan Yao
China Mobile
China
Email: yaokehan@chinamobile.com

Cancan Huang
China Telecom
China
Email: huangcanc@chinatelecom.cn

Zhengxin Han
China Unicom
China
Email: hanzx21@chinaunicom.cn

Junfeng Zhao
CAICT
Beijing
China
Email: zhaojunfeng@caict.ac.cn