TCP out of memory

And it shouldn't. The code did not change in this regard, so it's likely some slowness in the disk subsystem, or simply more requests from customers than your network is able to handle.

P.S. It's a surprise to me that someone could run out of memory for TCP connections/sockets.
By the way, try increasing it to what my node has (the kernel set it automatically; I did not specify this value anywhere).
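
You can check what a node currently uses like this (just a sketch of how to read it; the three numbers are min, pressure and max, counted in memory pages, not bytes):

sysctl net.ipv4.tcp_mem
# or read it directly
cat /proc/sys/net/ipv4/tcp_mem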

fsck completed without finding any errors.

This node has been online since the beta around mid-2017 without issue during all the test traffic. I don't have the stats to compare traffic to this node over the last couple of weeks against that period, so I don't know how many more or fewer TCP connections there are.

I have only seen this type of TCP out-of-memory error on the highest-volume scaler nodes in data centers, where either the kernel doesn't automatically select an ideal tcp_mem setting on its own or there is a TCP socket leak.
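
If you want to check for the latter, this is roughly what I would look at (assuming a standard Linux node; the "mem" counter in sockstat is measured in pages against the tcp_mem limits):

# TCP socket counts and the memory currently charged to TCP (in pages)
cat /proc/net/sockstat
# socket summary by state; counts that only ever grow while traffic stays flat suggest a leak
ss -s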

I have increased this node's tcp_mem to your settings. I'm guessing your system has a lot more RAM than this node (2 GB), so the kernel automatically set those larger values proportionally for your system.
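
In case it helps anyone else, this is roughly how such a change can be applied and made persistent (the numbers below are placeholders, not the actual settings from this thread; min, pressure and max are in pages):

# apply immediately (run as root; replace the placeholder numbers with values sized for your RAM)
sysctl -w net.ipv4.tcp_mem="191700 255600 383400"
# persist across reboots
echo 'net.ipv4.tcp_mem = 191700 255600 383400' | tee /etc/sysctl.d/90-tcp-mem.conf
sysctl --system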

Customer behavior is changing, though. Sometimes I see tons of small files being uploaded, sometimes evidently some large backups, and so on. Sometimes almost all connections come only from the MT gateways, sometimes it's traffic from Hetzner (repairs), sometimes from many different IPs. A node setup should handle all sorts of traffic, and by a node setup I mean everything from the node's configuration to your ISP's last-mile hardware; I've seen the latter fail just downloading a popular torrent file, let alone hosting a node.


You are correct. My system has 24GB free…

CONTAINER ID   NAME           CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O   PIDS
79496c65dd50   storagenode5   0.04%     93.2MiB / 24.81GiB    0.37%     13.5GB / 7.66GB   0B / 0B     44
f91d72108e6d   storagenode2   5.33%     135.1MiB / 24.81GiB   0.53%     17.7GB / 9.1GB    0B / 0B     95

TL;DR: tcp_mem is set at boot by the kernel, and is wholly dependent upon your system's ram. increase the ram and it will probably go away. then begins the wumpus hunt. here be dragons.

i went through this a few months ago, and just found this thread whilst randomly searching for 'consider tuning tcp_mem'.

i work for uc berkeley and run a ~14k users/semester jupyterhub deployment (~20 hubs), so we get a loooot of network traffic. we started seeing this message after scaling up to this number of users, and it would end up taking the entire system down. this is BAD.

after a little re-architecting (aka separating the pods by workload, which effectively isolated the issue), we decided to address this directly rather than track down the port leak in some nodejs package that was a dep of a dep of our ingress controllers. anyways.

the article below is from red hat, but i did the math on both ubuntu 22.04 LTS and gcp's container-optimized os, and my results matched (almost perfectly) what was expected.

the link will probably take you to their search page, but just query 'tcp_mem' and it's the first result. the article is called 'How net.ipv4.tcp_mem value calculated?': How net.ipv4.tcp_mem value calculated? - Red Hat Customer Portal
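
in short, the calculation goes roughly like this (my own sketch of what the article and the kernel's tcp_init_mem() describe; the kernel uses the free low-memory pages at boot rather than MemTotal, so treat it as an approximation):

# approximate the default tcp_mem from total ram (assumes 4 KiB pages)
pages=$(( $(awk '/MemTotal/ {print $2}' /proc/meminfo) / 4 ))   # kB -> pages
limit=$(( pages / 16 ))
echo "min=$((limit * 3 / 4)) pressure=$limit max=$((limit * 3 / 2))"
# compare against what the kernel actually chose at boot
cat /proc/sys/net/ipv4/tcp_mem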

secondly, i just discovered this post, which offers some good insights into the sysctl settings used for high-usage prod services (RIP lastfm): Linux Kernel Tuning — Russ Garrett

this is almost 15 years old but still remains pertinent.

if folks are curious, i can share the settings that we have deployed. after reading russ garrett's lastfm blog, i think i need to rethink what we're doing. there was a LOT of hand-waving when we came up with these numbers… and i also hadn't figured out exactly how tcp_mem was set until after committing them! 🙂

(edit: i can’t post 3 links as a new user but i’d be happy to share my incorrect settings in a comment)

anyways, we were running our jupyterhub core pods and the corresponding (1:1) configurable-http-proxy pods on the same GCP node (user pods were located elsewhere). moving from a GCP n2-standard-8 to an n2-highmem-8 (32 GB to 64 GB) doubled our tcp_mem and put an end to mid-semester outages. omg. so happy.

it didn't fix the underlying problem, but at least the smaller outages mean more time to debug. 🙂

hope this helps!
