[2021.09.30] Swift: Delay is Simple and Effective for Congestion Control in the Datacenter

(@jhsong)
Posts: 23
Member
Topic starter
 

Title: Swift: Delay is Simple and Effective for Congestion Control in the Datacenter

 

Abstract: We report on experiences with Swift congestion control in Google datacenters. Swift targets an end-to-end delay by using AIMD control, with pacing under extreme congestion. With accurate RTT measurement and care in reasoning about delay targets, we find this design is a foundation for excellent performance when network distances are well-known. Importantly, its simplicity helps us to meet operational challenges. Delay is easy to decompose into fabric and host components to separate concerns, and effortless to deploy and maintain as a congestion signal while the datacenter evolves. In large-scale testbed experiments, Swift delivers a tail latency of <50µs for short RPCs, with near-zero packet drops, while sustaining ∼100Gbps throughput per server. This is a tail of <3× the minimal latency at a load close to 100%. In production use in many different clusters, Swift achieves consistently low tail completion times for short RPCs, while providing high throughput for long RPCs. It has loss rates that are at least 10× lower than a DCTCP protocol, and handles O(10k) incasts that sharply degrade with DCTCP.
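For readers new to delay-based control, here is a rough sketch of the idea the abstract describes: AIMD steered by an end-to-end delay target, falling back to pacing when the window drops below one packet. It is not the paper's implementation. The class name `DelayAimd`, its methods, and every constant (`target_delay_us`, `additive_step`, `beta`, `max_decrease`, `min_cwnd`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DelayAimd:
    """Toy delay-targeting AIMD window. All defaults are illustrative assumptions."""

    target_delay_us: float = 50.0   # assumed end-to-end delay target
    additive_step: float = 1.0      # assumed additive increase per RTT, in packets
    beta: float = 0.8               # assumed scaling of the multiplicative decrease
    max_decrease: float = 0.5       # assumed cap: never cut more than 50% at once
    min_cwnd: float = 0.001         # assumed floor so the window never reaches zero
    cwnd: float = 10.0              # congestion window in packets (may be fractional)

    def on_ack(self, rtt_us: float, can_decrease: bool = True) -> float:
        """Update the window from one RTT sample and return the new value.

        `can_decrease` lets a caller limit decreases to once per RTT.
        """
        if rtt_us < self.target_delay_us:
            # Additive increase: about `additive_step` packets per RTT in total.
            if self.cwnd >= 1.0:
                self.cwnd += self.additive_step / self.cwnd
            else:
                self.cwnd += self.additive_step
        elif can_decrease:
            # Multiplicative decrease, scaled by how far the delay overshoots.
            overshoot = (rtt_us - self.target_delay_us) / rtt_us
            factor = max(1.0 - self.beta * overshoot, 1.0 - self.max_decrease)
            self.cwnd *= factor
        self.cwnd = max(self.cwnd, self.min_cwnd)
        return self.cwnd

    def pacing_gap_us(self, rtt_us: float) -> float:
        """Gap between packets when the window is below one packet, else 0."""
        return rtt_us / self.cwnd if self.cwnd < 1.0 else 0.0


if __name__ == "__main__":
    ctl = DelayAimd()
    for sample in (25.0, 30.0, 100.0, 120.0):
        print(sample, round(ctl.on_ack(sample), 3))
```

A fractional window is what makes the pacing fallback meaningful: with `cwnd = 0.5` and a 40 µs RTT, the sketch spaces packets 80 µs apart instead of sending one per RTT, which is one way to keep very large incasts from overrunning the fabric.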

 

Paper: swift_paper.pdf

Material: Swift_slide.pdf

 

Thank you.

 

by jhsong


 
Posted: September 30, 2021, 12:23 AM