T-RACKs: A Faster Recovery Mechanism for TCP in Data Center Networks

Ahmed M. Abdelmoniem, Brahim Bensaou

Research output: Contribution to journal › Article › peer-review

Abstract

Cloud interactive data-driven applications generate swarms of small TCP flows that compete for the small switch buffer space in data centers. Such applications require a small flow completion time (FCT) to be effective. Unfortunately, TCP is myopic with respect to the composite nature of application data. In addition, it tends to artificially inflate the FCT of individual flows by several orders of magnitude because of its Internet-centric design, which fixes the retransmission timeout (RTO) to be at least hundreds of milliseconds. To better understand this problem, we use empirical measurements in a small data center testbed to study, at a microscopic level, the effects of various types of packet losses on TCP's performance. In particular, we single out packet losses that impact the tail end of small flows, as well as bursty losses that span a significant fraction of small TCP congestion windows, and show that such losses have a non-negligible effect on the FCT. Based on this, we propose timely-retransmitted ACKs (T-RACKs), a simple loss recovery mechanism that conceals the drawbacks of the long RTO even in the presence of heavy packet losses. T-RACKs achieves this transparently to TCP itself, as it does not require any change to TCP in the tenant's virtual machine (VM) or container. T-RACKs can be implemented as a software shim layer in the hypervisor between the VMs and the server's NIC, or in hardware as a networking function in a SmartNIC. Simulation and real testbed results show remarkable performance improvements.
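
To make the FCT inflation claim concrete, the following back-of-the-envelope sketch compares a small flow that completes without loss against the same flow after a single retransmission timeout. The numbers are illustrative assumptions, not measurements from the paper: a 100-microsecond round-trip time, a 200 ms minimum RTO (the common Linux default), and a flow that needs three round trips. The helper `flow_completion_time` is hypothetical and ignores queueing and transmission delay.

```python
# Idealized model: FCT = (round trips * RTT) + (timeouts * RTO).
# All constants below are assumptions chosen for illustration only.

RTT_S = 100e-6      # assumed intra-data-center round-trip time: 100 microseconds
MIN_RTO_S = 0.2     # assumed minimum RTO: 200 ms (common Linux default)
ROUND_TRIPS = 3     # assumed number of round trips a small flow needs


def flow_completion_time(round_trips, rtt_s, timeouts=0, rto_s=MIN_RTO_S):
    """Return an idealized FCT in seconds, charging one full RTO per timeout."""
    return round_trips * rtt_s + timeouts * rto_s


# Loss-free case: the flow finishes in a few round trips.
baseline = flow_completion_time(ROUND_TRIPS, RTT_S)

# Same flow, but one loss (for example, at the tail) is recovered only by RTO.
with_timeout = flow_completion_time(ROUND_TRIPS, RTT_S, timeouts=1)

inflation = with_timeout / baseline

print(f"baseline FCT:     {baseline * 1e3:.3f} ms")
print(f"FCT with one RTO: {with_timeout * 1e3:.3f} ms")
print(f"inflation factor: {inflation:.0f}x")
```

Under these assumptions a single timeout dominates the completion time by a factor of several hundred, which is why a recovery path that avoids waiting for the full RTO matters most for short flows.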
Original language: English (US)
Pages (from-to): 1-14
Number of pages: 14
Journal: IEEE/ACM Transactions on Networking
State: Published - 2021
