QUIC Misconfigured

This was a long conversation! Seems like it wrapped but I just wanted to offer some additional assurance that I think the QUIC thing is a good plan. I’ll start by reiterating some things @BrightSilence said:

  • As we roll out QUIC, we intend to keep TCP support. To add to that, we have no current plans to remove it. That of course may change based on how the QUIC rollout goes, but before we can even investigate whether QUIC seems reasonable, we need nodes to enable it, hence the new and improved indicator.
  • We want to do these tests to understand exactly how widespread problems like ISP throttling of UDP are. If ISPs are horrible and by and large break QUIC, obviously we won’t continue! That said, the existing adoption of QUIC in web browsers gives me some comfort that the situation might not be so bad these days in practice.
  • We are not going to switch to something that seems worse for SNOs or clients.

Thanks @BrightSilence!

To add a bit more color, in the last conversation about QUIC with @anon27637763, I discussed congestion control. I want to bring that up again because it’s worth pointing out that our explicit interest in QUIC is to be able to deviate from what’s out of the box. We do not intend to use the standard QUIC congestion controller Google provides in the long term (though we do in the short term). The reason we’re interested in QUIC is precisely because the application code will own details about flow control and we want to tune it. We simply cannot do that with TCP, as the operating system owns that logic. As @BrightSilence said, QUIC will help us reduce round trip latency by eliminating some TCP+SSL handshake back and forth, but we could probably get pretty far by tuning TCP and SSL if that’s all we cared about. What we’re interested in with QUIC is that it (or any other session oriented protocol on top of UDP) allows us to break away from what QUIC or TCP do by default. We get to change the default behavior!

If this went over anyone’s head, here’s a quick background summary. The internet functions on “IP packets”, which are basically post cards that have a source IP and a destination IP on them. When you deposit an IP packet onto your network card, the network card sends it towards the next hop in its routing table and then forgets about it. That’s the end of its job! Maybe that packet will get there, maybe not, and who knows in what order it will arrive relative to other IP packets.

That’s why TCP was developed. TCP adds packet acknowledgement, ordering, port numbers to identify which process a packet is for, and so on, on top of IP packets. TCP is a pretty cool protocol that changes IP’s post card-like behavior into more reliable telephone conversations. Unfortunately, it is baked into your operating system, and so if there are any behaviors you want to change about it in any way (as a multiplatform software developer for users who don’t want to install kernel modules), you’re out of luck.

Understanding that not every process might want TCP forced on it, UDP was added to the TCP/IP suite [1], which, to be honest, isn’t really anything more than an escape hatch. UDP is just IP packets with a port number. UDP is a way to give a specific program, instead of your operating system, the ability to directly send IP packets (but with a port number).
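To make the “escape hatch” idea concrete, here is a minimal sketch of sending a single UDP datagram over loopback in Python. The function name `udp_roundtrip` is just something I made up for illustration. Notice there is no connection setup: one `sendto()` call is one packet, and nothing here would notice if it got lost.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one UDP datagram to ourselves over loopback and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # do not block forever if the packet is lost
        # No connect(), no handshake, no acknowledgement: just address and send.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

Everything TCP normally does for you (retries, ordering, pacing) is absent here, which is exactly why a protocol built on UDP gets to decide those things for itself.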

You could absolutely build TCP on top of UDP… which is what QUIC is. QUIC is a little more complex in that it interweaves TLS/SSL as well and chooses different default behaviors than TCP, but that’s the high level picture. QUIC is like an attempt to build TCP again, but based on things we’ve learned since the 1970s, and the application implements the TCP-like algorithm/protocol instead of your operating system.
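As a toy illustration of what “building TCP on top of UDP” means, here is a sketch of just one piece: turning packets that arrive in any order (or twice) back into an ordered stream using sequence numbers. This is not QUIC’s actual wire format or algorithm, and `deliver_in_order` is a hypothetical name; it only shows the kind of logic that moves from the operating system into the application.

```python
def deliver_in_order(packets):
    """Reassemble (seq, payload) pairs that may arrive out of order or duplicated.

    Returns (delivered, next_seq): the payloads released in order so far, and
    the next sequence number still missing (what a receiver would acknowledge).
    """
    buffered = {}    # packets that arrived early, keyed by sequence number
    next_seq = 0     # lowest sequence number not yet delivered
    delivered = []
    for seq, payload in packets:
        if seq < next_seq or seq in buffered:
            continue  # duplicate of something already seen; drop it
        buffered[seq] = payload
        # Release every packet that is now contiguous with what was delivered.
        while next_seq in buffered:
            delivered.append(buffered.pop(next_seq))
            next_seq += 1
    return delivered, next_seq
```

For example, feeding it `[(1, "b"), (0, "a"), (0, "a"), (3, "d"), (2, "c")]` releases `"a"` and `"b"` once packet 0 shows up, ignores the duplicate, and holds `"d"` back until `"c"` arrives.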

Once we at Storj get QUIC working at all, I believe our intention is to evaluate swapping QUIC’s congestion controller out for one based on LEDBAT (what BitTorrent uses) or something similar.
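For anyone curious what a LEDBAT-style controller looks like, here is a rough sketch that loosely follows the window update described in RFC 6817. To be clear, this is not Storj code and not a complete implementation: the constants (`TARGET_DELAY`, `GAIN`, `MSS`, `MIN_CWND`) and the function name are assumptions for illustration, and a real controller also has to handle loss, base-delay tracking, and more.

```python
TARGET_DELAY = 0.100   # seconds of queuing delay we are willing to cause (assumed)
GAIN = 1.0             # how aggressively to react (assumed)
MSS = 1200             # bytes per packet (assumed)
MIN_CWND = 2 * MSS     # never shrink the window below two packets (assumed)


def ledbat_update(cwnd, base_delay, current_delay, bytes_acked):
    """Return the new congestion window in bytes after an acknowledgement.

    base_delay is the lowest one-way delay seen so far; current_delay is the
    latest measurement. Their difference estimates the queuing we are causing.
    """
    queuing_delay = current_delay - base_delay
    # Positive while under the target (grow), negative when over it (back off).
    off_target = (TARGET_DELAY - queuing_delay) / TARGET_DELAY
    cwnd += GAIN * off_target * bytes_acked * MSS / cwnd
    return max(cwnd, MIN_CWND)
```

The idea is that the window grows while measured delay stays near the baseline and shrinks as soon as delay builds up, so the sender yields before queues fill rather than waiting for packet loss.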

Ultimately, Storj is in a relatively unique position. TCP (and default QUIC) are optimized for at most a handful of streams competing for bandwidth (and I agree with @anon27637763 that they don’t always compete fairly), whereas that’s not even the position we’re in. We commonly multiplex thousands of them concurrently from a single client. There are performance and efficiency gains we are leaving on the table by using congestion controllers that aren’t aware of this. The standard QUIC implementations won’t solve this problem for us, but using an application-layer protocol such as QUIC opens the door to making and tuning these application-side optimizations down the road. Yes, we do intend to make fundamental changes to what you get out of the box with QUIC.

For SNOs, I expect the initial QUIC rollout won’t do much. For clients, I hope it cuts down on first byte latency. I believe our initial tests will be to have clients simply try dialing QUIC and TCP concurrently and choose whichever is faster. If QUIC does start doing well there, SNOs that enable QUIC might win more uploads in the upload long-tail cancellation race. But I expect the real gains to come once we start getting to the new opportunities opened up by being able to tune more things.
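The “dial both and take whichever is faster” idea can be sketched like this. This is not how the Storj client is written (I am using Python’s `asyncio` and a made-up `fake_dial` that just sleeps instead of opening a connection); it only shows the shape of the race: start both attempts, keep the first one that succeeds, cancel the rest.

```python
import asyncio


async def fake_dial(transport, delay, fail=False):
    """Hypothetical stand-in for a real dial: wait, then succeed or fail."""
    await asyncio.sleep(delay)
    if fail:
        raise OSError(transport + " dial failed")
    return transport


async def dial_fastest(dials):
    """Run all dial attempts concurrently and return the first success."""
    tasks = [asyncio.ensure_future(d) for d in dials]
    try:
        for finished in asyncio.as_completed(tasks):
            try:
                return await finished
            except OSError:
                continue  # this transport failed; keep waiting for the others
        raise OSError("all dials failed")
    finally:
        for task in tasks:
            task.cancel()  # no-op for finished tasks, stops the slower ones
```

Something like `asyncio.run(dial_fastest([fake_dial("quic", 0.01), fake_dial("tcp", 0.2)]))` picks the QUIC attempt, and if the QUIC attempt fails (say UDP is blocked), the TCP attempt still wins, so the client never ends up worse off than TCP alone.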

Ultimately we’ll most certainly always need to support TCP. But, despite points raised in this thread, we do continue to believe that adding support for something besides TCP (maybe that starts with QUIC) will bring a better network experience to everyone. Because of industry backing, QUIC is an easy next step, and likely to be better treated by ISPs than if we rolled our own thing. :slight_smile:

@anon27637763 - in the previous thread you turned down a request to help debug why you were concerned about UDP. That’s fine and you’re welcome to take that position of course. Thanks for everything else you do to be a dedicated SNO! That said, I don’t want to discourage others from helping us make sure this works well (if it’s possible for this to work well). It will require testing, and so we appreciate everyone’s willingness to enable QUIC!

[1] Does anyone here remember fumbling with IPX/SPX or NetBIOS? That’s the alternate history short story I would write. Everything is the same except IPX/SPX won and TCP/IP is the thing you fondly remember trying to figure out for LAN parties. And maybe we’re still crimping coax.
