13 Dec 2022
I have been working on the Picoquic implementation of QUIC since 2017. Picoquic has distinguished itself by performing very well on GEO satellite links. The main reason is that 40 years ago, I was studying protocols for the transport of data over satellite links for my PhD, so of course I wanted my implementation of QUIC to support that scenario well. Which explains why, this morning, someone asked me about the ACK rate tuning work in Picoquic and why it gets good performance results over GEO. It turns out that I never wrote that down, so here it is.
Sending fewer ACKs reduces transmission overhead and message processing load, which is a good thing. Historically, ACKs were also used for ACK Clocking: if ACKs are sent very often, each one acknowledges few packets, and thus opens the congestion window just enough to allow a few more packets to be sent. If ACKs were too sparse, each would provide many credits, causing implementations to send packets in large bursts, possibly causing congestion on the path. But most implementations today implement some form of pacing, so ACK Clocking is no longer necessary to prevent such packet bursts. Of course, while sending fewer ACKs reduces overhead, it also affects RTT measurements and packet loss detection, so there is a limit to how few ACKs a transport implementation should send.
This was discussed in the QUIC Working Group. The discussions resulted in the publication of the QUIC Acknowledgement Frequency draft. The draft defines a QUIC control frame by which the sender of packets can tell the receiver how many packets or how much time it should wait before sending an ACK. However, that draft only provides generic guidance on how these parameters should be set. Picoquic implements the draft, and sets the packet threshold and ACK delay as follows:
The coefficients above were set empirically, based on simulations of a variety of network configurations. Each of these simulations is actually a test case in the Picoquic test suite, which would detect whether a code change caused a performance regression in one of the configurations. These simulations include several GEO configurations, including for example a simulation of a high bandwidth data path and a low bandwidth return path. In that asymmetric configuration, having too many ACKs would cause congestion on the return path, but the chosen tunings avoid that.
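To make the role of the two parameters concrete, here is a minimal sketch of how a receiver might apply a packet threshold and an ACK delay once the sender has requested them. This is not Picoquic's code: the `AckPolicy` class, its method names, and the values in the usage note are hypothetical, chosen only to show the "whichever comes first" logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AckPolicy:
    """Receiver-side delayed-ACK decision (illustrative, not Picoquic's code)."""
    packet_threshold: int   # send an ACK after this many ack-eliciting packets...
    max_ack_delay: float    # ...or after this many seconds, whichever comes first
    pending: int = 0        # ack-eliciting packets received since the last ACK
    first_pending_time: Optional[float] = None  # arrival time of the oldest pending packet

    def on_packet(self, now: float) -> bool:
        """Record an ack-eliciting packet; return True if an ACK is due now."""
        if self.pending == 0:
            self.first_pending_time = now
        self.pending += 1
        return self._ack_if_due(now)

    def on_timer(self, now: float) -> bool:
        """Called when the delay timer fires; return True if an ACK is due now."""
        return self._ack_if_due(now)

    def _ack_if_due(self, now: float) -> bool:
        if self.pending == 0:
            return False  # nothing to acknowledge
        due = (self.pending >= self.packet_threshold
               or now - self.first_pending_time >= self.max_ack_delay)
        if due:
            # The caller sends the ACK; reset the counters for the next round.
            self.pending = 0
            self.first_pending_time = None
        return due
```

With a hypothetical `AckPolicy(packet_threshold=4, max_ack_delay=0.05)`, the fourth packet in a row triggers an ACK immediately, while a single isolated packet is acknowledged only once the 50 ms delay has elapsed. A larger threshold means fewer ACKs on the return path, at the cost of coarser feedback to the sender.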
In that asymmetric configuration, limiting the number of ACKs is not enough. QUIC ACK frames could grow very large if they are allowed to carry a large number of "ACK ranges". If the ACKs were too large, that too could saturate a narrow return path. Picoquic limits the number of ACK ranges to 32, and further limits the size of ACKs by not including ranges that are too old, were already acknowledged, or were already announced in 4 previous ACKs. And with all that, yes, we end up with good ACK behavior on GEO satellite links. And on other links too.
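The range limits can be sketched as follows. This is an illustration under stated assumptions, not Picoquic's implementation: the `AckRange` type and `build_ack_ranges` function are hypothetical, only the limit of 32 ranges and the limit of 4 announcements come from the text, and the "too old" and "already acknowledged" filters are left out. The sketch also assumes that the newest range is always reported, since it is the one that keeps growing.

```python
from dataclasses import dataclass

MAX_ACK_RANGES = 32  # limit on ranges per ACK frame (from the text)
MAX_REPEATS = 4      # a range is announced in at most this many ACKs (from the text)


@dataclass
class AckRange:
    first: int           # lowest packet number in the range
    last: int            # highest packet number in the range
    times_sent: int = 0  # how many ACK frames have already carried this range


def build_ack_ranges(ranges, max_ranges=MAX_ACK_RANGES, max_repeats=MAX_REPEATS):
    """Pick the ranges to put in the next ACK frame, newest first."""
    selected = []
    newest_first = sorted(ranges, key=lambda r: r.last, reverse=True)
    for index, r in enumerate(newest_first):
        if len(selected) >= max_ranges:
            break  # the frame is full
        # Assumption: the newest range (index 0) is always reported.
        if index > 0 and r.times_sent >= max_repeats:
            continue  # already announced often enough, skip it
        r.times_sent += 1
        selected.append((r.first, r.last))
    return selected
```

With 40 disjoint ranges, the first four calls each return the 32 newest ranges. On the fifth call those have been announced 4 times, so the frame carries only the newest range plus the 8 older ranges that had not fit before, which keeps the ACK small on a narrow return path.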
If you want to start or join a discussion on this post, the simplest way is to send a toot on the Fediverse/Mastodon to @huitema@social.secret-wg.org.