Question about UDP

So I left this running for a while (1h+):

tcpdump -i enp9s0 udp port 28967 -XAvvv &>/tmp/udp.dump

and here is what I got:

[root@localhost ~]# grep -Po "(?<=\s)\S+(?=\.\d+\s\>)" /tmp/udp.dump | grep -v localhost | sort | uniq -c | sort -k1nr
     64 277185.simplecloud.ru
     11 170.242.194.35.bc.googleusercontent.com
     10 215.39.75.34.bc.googleusercontent.com
      9 107.120.235.35.bc.googleusercontent.com
      9 141.139.23.34.bc.googleusercontent.com
      9 2.202.88.34.bc.googleusercontent.com
      9 95.99.198.35.bc.googleusercontent.com
[root@localhost ~]#

Just a few sources, and not that much traffic. What does this mean, and how/when exactly is UDP involved in the upload/download process?

Depending on settings, any connections between satellites, uplinks, and nodes may be attempted with QUIC (over UDP) and TLS (over TCP) at the same time. If a connection is fully established with QUIC/UDP first, then we use that connection and the TLS/TCP connection attempt is aborted. If instead the TLS/TCP connection is fully established first, then we use that connection and the QUIC/UDP connection attempt is aborted.

So, if UDP packets can be exchanged more quickly between the remote host and your node, you will see many more UDP packets. If UDP packets are generally exchanged more slowly with your node, you will see many more TCP packets. And of course, if UDP packets are blocked somewhere between the remote host and your node, or if the client is configured not to try UDP connections, you won’t see those packets at all.
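
To make the mechanics concrete, here is a minimal Go sketch of that “race both transports, keep the winner, cancel the loser” pattern. It is not Storj’s actual dialer: raceDial, the dial callbacks, and the example.com address are hypothetical, and the plain TCP/UDP dials in main are only stand-ins for the real TLS and QUIC handshakes (a bare UDP “dial” completes instantly, unlike QUIC).

package main

import (
	"context"
	"fmt"
	"net"
)

// raceDial starts both dial attempts concurrently, returns whichever
// connection is fully established first, and aborts or closes the loser.
func raceDial(ctx context.Context, dialQUIC, dialTCP func(context.Context) (net.Conn, error)) (net.Conn, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // cancels whichever attempt has not finished yet

	type result struct {
		conn net.Conn
		err  error
	}
	results := make(chan result) // unbuffered: the loser blocks here until canceled

	start := func(dial func(context.Context) (net.Conn, error)) {
		conn, err := dial(ctx)
		select {
		case results <- result{conn, err}:
		case <-ctx.Done():
			if conn != nil {
				conn.Close() // lost the race; drop the spare connection
			}
		}
	}
	go start(dialQUIC)
	go start(dialTCP)

	var lastErr error
	for i := 0; i < 2; i++ {
		r := <-results
		if r.err == nil {
			return r.conn, nil // winner; the deferred cancel aborts the other attempt
		}
		lastErr = r.err
	}
	return nil, lastErr // both transports failed
}

func main() {
	dialer := &net.Dialer{}
	// Stand-ins only: real code would perform a QUIC handshake and a TLS handshake.
	dialUDP := func(ctx context.Context) (net.Conn, error) {
		return dialer.DialContext(ctx, "udp", "example.com:443")
	}
	dialTCP := func(ctx context.Context) (net.Conn, error) {
		return dialer.DialContext(ctx, "tcp", "example.com:443")
	}

	conn, err := raceDial(context.Background(), dialUDP, dialTCP)
	if err != nil {
		fmt.Println("both attempts failed:", err)
		return
	}
	fmt.Println("winning transport:", conn.RemoteAddr().Network())
	conn.Close()
}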

What are those settings, can someone please point me to the docs, in the context of uplink talking to nodes and satellites?

Is it possible to force UDP/QUIC in spite of TCP being “faster”?

Case in point: one of the integrations (the duplicacy backup program), when downloading from Storj using the native Go library with 40 concurrent downloads of relatively small files, ends up creating over 60k (60 thousand) TCP connections per second while achieving throughput of just under 45MBps. While the maximum number of “established” connections rarely exceeds 2000, that is enough to knock my cable modem down within seconds. (2000 concurrent established connections is still too much; the default limit is 1000 on some OSes.)

I’m wondering if the modem may handle stateless protocols better, and therefore whether there is a way to have uplink avoid TCP or at least “strongly prefer” UDP.

In these scenarios latency does not matter, so a global “do QUIC or fail” option would probably address the modem-dying scenario.

If there is already a configuration option that addresses this, I apologize for not finding it; please point me to it.

I did not find any tcp/udp-related option in their documentation either: Advanced Options - Duplicati 2 User's Manual

It’s duplicacy, not duplicati (and not duplicity either :)). This one: GitHub - gilbertchen/duplicacy: A new generation cloud backup tool

I was asking not so much about configuring a specific app, but rather about the uplink library the app is using: whether it is even possible to control uplink’s behavior programmatically. If so, I would then reach out to the app’s forum :slight_smile:

I did read it as having libuplink under the hood. Many applications have options for passing additional flags/options to the tools they use; I didn’t find a way to pass additional options to the storage backend, though.

It’s possible, but it’s not documented as it’s not ready for production. (Mainly because of the reasons you mentioned: QUIC is not supported everywhere, and you may get too many nodes without QUIC support → increased latency or even failed transfers.)

You can try setting the environment variable STORJ_QUIC_ROLLOUT_PERCENT=100 and testing it, but use it at your own risk ;-) (worst case, downloads will be slower or fail).
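
Note that the variable is read by the uplink library inside the application’s own process, so it has to be present in that process’s environment. Exporting it in the shell that launches the app (as shown further down) works; if you control the app’s source, the hypothetical snippet below (a plain os.Setenv before the first Storj operation) should have the same effect. This is an assumption about when the library reads the variable, not documented behavior.

package main

import (
	"log"
	"os"
)

func main() {
	// Assumption: uplink reads STORJ_QUIC_ROLLOUT_PERCENT when it dials, so
	// setting it here, before any Storj traffic, is equivalent to exporting
	// it in the shell that starts the application.
	if err := os.Setenv("STORJ_QUIC_ROLLOUT_PERCENT", "100"); err != nil {
		log.Fatal(err)
	}

	// ... open the uplink project and start transfers as usual ...
}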

This is strange. We don’t have 60k nodes, so it means you have more than one connection to each storage node.

A single download requires a connection to the satellite plus connections to 39 storage nodes (after the data has been downloaded from the fastest 29, the remaining connections are closed).

So 60k connections means at least 60000 / 39 ≈ 1538 segment downloads (for small files, one segment = one file).

I guess this tool tries to download data in a massively parallel way.

One option is increasing the connection pool with this function: uplink/private/transport/transport.go at 72bcffbeac33146027b0f70fc6fcd703fdf8bfb0 · storj/uplink · GitHub

The default size is 100, which is very low. You can even use 5000-10000, which will increase performance but also increase memory usage…
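
For anyone embedding libuplink, a rough sketch of what wiring in a larger pool might look like is below. Treat it as an assumption to verify against the linked transport.go at that commit: the names transport.SetConnectionPool, rpcpool.New, and the rpcpool.Options fields are recalled from that private, undocumented API and may differ or change, and the access grant string is just a placeholder.

package main

import (
	"context"
	"log"
	"time"

	"storj.io/common/rpc/rpcpool"
	"storj.io/uplink"
	"storj.io/uplink/private/transport"
)

// openProjectWithBigPool opens a project whose dialer reuses connections from
// a much larger pool, so pieces fetched from the same node can share a
// connection instead of re-dialing every time.
func openProjectWithBigPool(ctx context.Context, accessGrant string) (*uplink.Project, error) {
	access, err := uplink.ParseAccess(accessGrant)
	if err != nil {
		return nil, err
	}

	config := uplink.Config{}

	// Assumed option names, mirroring the defaults in rpc/dial.go; only
	// Capacity (default 100) is raised here.
	pool := rpcpool.New(rpcpool.Options{
		Capacity:       5000,
		KeyCapacity:    5,
		IdleExpiration: 2 * time.Minute,
	})
	if err := transport.SetConnectionPool(ctx, &config, pool); err != nil {
		return nil, err
	}

	return config.OpenProject(ctx, access)
}

func main() {
	project, err := openProjectWithBigPool(context.Background(), "1ABc...placeholder-access-grant")
	if err != nil {
		log.Fatal(err)
	}
	defer project.Close()
}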

Thank you for quick and detailed response!

I did that, and it made no difference; everything still goes through TCP. Could there be something else that needs to be done? (Running strings on the app does find STORJ_QUIC_ROLLOUT_PERCENT, so whatever version of uplink it’s using should support it.)

% env | grep STORJ_QUIC_ROLLOUT_PERCENT
STORJ_QUIC_ROLLOUT_PERCENT=100
How I check TCP/UDP sends/receives per second, bytes transferred, and total established connections:
#!/usr/sbin/dtrace -qs
BEGIN
{

	printf("Printing packets and bytes sent and received per second\n");

	udp_count_sent = 0;
	udp_count_rcvd = 0;
	udp_bytes_sent = 0;
	udp_bytes_rcvd = 0;
	tcp_count_sent = 0;
	tcp_count_rcvd = 0;
	tcp_bytes_sent = 0;
	tcp_bytes_rcvd = 0;
	tcp_connect_est = 0;
	tcp_accept_est = 0;
	tcp_established = 0;
}

udp:::send
{
   udp_count_sent += 1;
   udp_bytes_sent += args[4]->udp_length;
}

udp:::receive
{
   udp_count_rcvd += 1;
   udp_bytes_rcvd += args[4]->udp_length;
}


tcp:::send
{
   tcp_count_sent += 1;
   tcp_bytes_sent += args[2]->ip_plength;
}

tcp:::receive
{
   tcp_count_rcvd += 1;
   tcp_bytes_rcvd += args[2]->ip_plength;
}

tcp:::connect-established
{
   tcp_connect_est += 1;
}


tcp:::accept-established
{
   tcp_accept_est += 1;
}

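/*
 * tcp:::state-change fires on every TCP state transition; args[5]
 * (tcplsinfo_t) carries the previous state and args[3] (tcpsinfo_t) the new
 * one, so the two checks below keep a running count of connections that are
 * currently ESTABLISHED.
 */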
tcp:::state-change
{
  if (args[5]->tcps_state == TCP_STATE_ESTABLISHED)
  {
    tcp_established -=1;
  }

  if (args[3]->tcps_state == TCP_STATE_ESTABLISHED)
  {
    tcp_established +=1;
  }
}


tick-1sec
{
	printf("udp: >> %6d (%10d bytes), << %6d (%10d bytes) | ",
		udp_count_sent,
		udp_bytes_sent,
		udp_count_rcvd,
		udp_bytes_rcvd);

	printf("tcp: >> %6d (%10d bytes), << %6d (%10d bytes) | tcp_conn_est :%4d total: %4d\n",
		tcp_count_sent,
		tcp_bytes_sent,
		tcp_count_rcvd,
		tcp_bytes_rcvd,
		tcp_connect_est + tcp_accept_est,
		tcp_established);

	udp_count_sent = 0;
	udp_count_rcvd = 0;
	udp_bytes_sent = 0;
	udp_bytes_rcvd = 0;
	tcp_count_sent = 0;
	tcp_count_rcvd = 0;
	tcp_bytes_sent = 0;
	tcp_bytes_rcvd = 0;
	tcp_connect_est = 0;
	tcp_accept_est = 0;
}
What I see printed when the app is downloading:

Idle (background traffic):

udp: >>    127 (     54664 bytes), <<    198 (    208186 bytes) | tcp: >>   2652 (   1166231 bytes), <<   3596 (   3513197 bytes) | tcp_conn_est :   1 total:    0
udp: >>    131 (     75595 bytes), <<    172 (    161275 bytes) | tcp: >>   3539 (   1680162 bytes), <<   4452 (   4597223 bytes) | tcp_conn_est :   2 total:    1
udp: >>    191 (    119875 bytes), <<    190 (    163823 bytes) | tcp: >>   2708 (   1609739 bytes), <<   3294 (   3244730 bytes) | tcp_conn_est :   8 total:    5
udp: >>    239 (    255940 bytes), <<    229 (     98624 bytes) | tcp: >>   2996 (   1978123 bytes), <<   3409 (   3122358 bytes) | tcp_conn_est :   1 total:    4
udp: >>     96 (     99272 bytes), <<     90 (     25718 bytes) | tcp: >>   2245 (    775657 bytes), <<   3159 (   3169612 bytes) | tcp_conn_est :   3 total:    5
udp: >>    228 (    259987 bytes), <<    183 (     91507 bytes) | tcp: >>   2187 (    773510 bytes), <<   3134 (   3122000 bytes) | tcp_conn_est :   1 total:    4
udp: >>    257 (    292355 bytes), <<    209 (    125700 bytes) | tcp: >>   1561 (    563170 bytes), <<   2348 (   2449980 bytes) | tcp_conn_est :   0 total:    3
udp: >>    130 (     65386 bytes), <<    126 (    103852 bytes) | tcp: >>   2273 (    534998 bytes), <<   3432 (   3411775 bytes) | tcp_conn_est :   1 total:    2
udp: >>     78 (     28736 bytes), <<    110 (    103837 bytes) | tcp: >>   1541 (    378968 bytes), <<   2376 (   2415836 bytes) | tcp_conn_est :   2 total:    2
udp: >>    102 (     32356 bytes), <<    158 (    171203 bytes) | tcp: >>   1473 (    497230 bytes), <<   2278 (   2398904 bytes) | tcp_conn_est :   3 total:    4

Downloading:

udp: >>    107 (     93566 bytes), <<     95 (     19662 bytes) | tcp: >>  12650 (   2639323 bytes), <<  10093 (   8164445 bytes) | tcp_conn_est :1534 total: 1500
udp: >>     60 (     17754 bytes), <<     83 (     77241 bytes) | tcp: >>  13272 (   2974369 bytes), <<  20022 (  21294655 bytes) | tcp_conn_est :  40 total: 1537
udp: >>    123 (     29462 bytes), <<     93 (     26874 bytes) | tcp: >>  36909 (   1940520 bytes), <<  67390 (  89643142 bytes) | tcp_conn_est :  11 total: 1523
udp: >>    122 (     41355 bytes), <<    162 (    164638 bytes) | tcp: >>  43025 (   1909269 bytes), <<  76678 (  98852640 bytes) | tcp_conn_est : 147 total:  972
udp: >>     56 (     25337 bytes), <<     47 (      9692 bytes) | tcp: >>  28303 (   2428324 bytes), <<  45393 (  55572094 bytes) | tcp_conn_est : 587 total: 1206
udp: >>     80 (     37945 bytes), <<     89 (     82204 bytes) | tcp: >>  28434 (   2581370 bytes), <<  45765 (  55890175 bytes) | tcp_conn_est : 557 total: 1370
udp: >>    149 (     30585 bytes), <<    204 (    222293 bytes) | tcp: >>  32942 (   2452684 bytes), <<  57112 (  72140648 bytes) | tcp_conn_est : 230 total: 1253
udp: >>     86 (     24508 bytes), <<     58 (     12881 bytes) | tcp: >>  33397 (   2313519 bytes), <<  57612 (  73481513 bytes) | tcp_conn_est : 341 total: 1059
udp: >>    122 (     50901 bytes), <<    127 (    116636 bytes) | tcp: >>  25795 (   2280253 bytes), <<  41908 (  49971683 bytes) | tcp_conn_est : 557 total: 1174
udp: >>     87 (     21342 bytes), <<    126 (    141607 bytes) | tcp: >>  24198 (   2013820 bytes), <<  40738 (  50045653 bytes) | tcp_conn_est : 417 total: 1463
udp: >>     93 (     23134 bytes), <<    126 (    139536 bytes) | tcp: >>  40831 (   2582379 bytes), <<  70052 (  91028777 bytes) | tcp_conn_est : 232 total: 1317
udp: >>     67 (     20227 bytes), <<     72 (     54007 bytes) | tcp: >>  39475 (   2332715 bytes), <<  68819 (  88815774 bytes) | tcp_conn_est : 267 total: 1198
udp: >>     74 (     14607 bytes), <<     70 (     29302 bytes) | tcp: >>  25524 (   1794879 bytes), <<  43039 (  54750429 bytes) | tcp_conn_est : 197 total: 1040
udp: >>     65 (     23632 bytes), <<    109 (     82562 bytes) | tcp: >>  31035 (   2435545 bytes), <<  49978 (  61148653 bytes) | tcp_conn_est : 701 total: 1208
udp: >>     57 (     12159 bytes), <<     86 (     23097 bytes) | tcp: >>  25650 (   2205685 bytes), <<  41548 (  50542182 bytes) | tcp_conn_est : 390 total: 1190
udp: >>     93 (     18400 bytes), <<    113 (    103447 bytes) | tcp: >>  29619 (   2308441 bytes), <<  48896 (  60578498 bytes) | tcp_conn_est : 484 total: 1362
udp: >>    145 (     19282 bytes), <<    260 (    330754 bytes) | tcp: >>  31868 (   2200594 bytes), <<  55130 (  69999336 bytes) | tcp_conn_est : 343 total: 1200

I misspoke; to clarify, I’m measuring the number of tcp:::send/receive events per second and the total number of established TCP connections: it stayed under 1-2k, but when it approached 2k the modem seemed to be having trouble.

I’m trying to figure out whether the problem is the large number of established connections or something else (TCP traffic vs. UDP).

Yes; it supports concurrent downloads for every backend, and in these tests I’m using 40 threads. A single-threaded download yields a measly 2MBps, likely because time-to-first-byte dominates the transfer time: the files the app creates are around 4MB by default. It is possible to configure the chunk size to be closer to 64 MB during backup, and this helps with performance, but it makes storage utilization far less efficient (for other backends; perhaps for Storj it would be the right thing to do), so it’s likely not a viable workaround.

This is interesting, I’ll play with it.

I don’t think we have an option for “use QUIC only and don’t allow TCP” at this point, unfortunately. STORJ_QUIC_ROLLOUT_PERCENT can be used to disable QUIC entirely (by setting it to 0), but 100 means “always try QUIC in addition to TCP”.

I like @elek’s suggestion of increasing the connection pool size so that fewer connections will be established.

I could not figure out how to do it properly, so I cheated, and changed the default in rpc/dial.go instead:

-		Capacity:       100,
+		Capacity:       5000,

and rebuilt the whole thing. I now see 6000+ active connections, up from the 1000+ before the change:

before: 
tcp_conn_est : 232 total: 1317
tcp_conn_est : 267 total: 1198
tcp_conn_est : 197 total: 1040
tcp_conn_est : 701 total: 1208
tcp_conn_est : 390 total: 1190
tcp_conn_est : 484 total: 1362

after:
tcp_conn_est : 217 total: 6109
tcp_conn_est : 242 total: 6058
tcp_conn_est : 372 total: 6088
tcp_conn_est : 205 total: 6193

But the connection churn remained the same: 200-400 new connections are created and closed every second, albeit without the occasional spikes to 700-800 I was seeing before, so unless it’s a fluke, that seems to be an improvement too. Download performance improved from 45MBps to 52MBps (I ran each test a few times).

The app, however, failed to download some files with “uplink: metaclient: context canceled” after about 20 minutes of running.

I’m not sure what to make of it. Perhaps I shall increase the pool size to 20000 and see what happens.