I noticed my 10-node Pi 4 setup is building up trash. That’s the first thing that has come close to being a problem with it.
I don’t think the trash is a bad thing; it just means that some files that weren’t needed were finally caught by the garbage collection script (at least I think that’s how it works).
I see no reason why a SNO cannot operate hundreds of nodes; the point is decentralization, but at the end of the day it is a free market.
Trash shouldn’t pile up forever, as trashed pieces get deleted 7 days later.
So wait and see, it should decrease eventually.
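Just to illustrate the 7-day rule, here’s a rough Go sketch of what a trash cleanup could look like (not the actual storagenode code; the path and the use of file modification times as the trash timestamp are assumptions):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

func main() {
	// Hypothetical trash path, used here only for illustration.
	trashDir := "/mnt/storagenode/storage/trash"
	cutoff := time.Now().Add(-7 * 24 * time.Hour)

	// Remove anything that has been sitting in the trash for more than 7 days,
	// using the file's modification time as a stand-in for when it was trashed.
	filepath.Walk(trashDir, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		if info.ModTime().Before(cutoff) {
			fmt.Println("purging", path)
			return os.Remove(path)
		}
		return nil
	})
}
```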
My largest node has collected about 40GB of trash over the last few days as well. This one is pretty much always online, so it might be some cleaning up on the satellite end. No big deal though.
Interestingly, on my oldest, full 4TB node, I have only 250 MB of trash. But on my newer node with about 1.5TB of 2TB used, I have 12 GB. It must be recent data that is being purged.
You are right. Amazing observation.
Yep, I noticed the same thing.
If you’re talking about the amount of trash, I think previous posts confirmed that there have been some deletions lately, yes.
My newer nodes have several GB of trash too (not as much as @BrightSilence, but up to 4GB nevertheless). Nothing to worry about if you ask me though…
Well, my 7TB (full) node has only 86MB of trash. My 2 nodes with ~2TB each have 1.5GB of trash… So my nodes can’t confirm any heavy deletions (at least none that couldn’t be handled by the nodes live).
My nodes have a low trash level too, so I assume customers’ behavior is normal.
They can delete or upload anything at any time.
I read somewhere (or at least I understood at some point) that normal deletions do not create trash on Nodes, and that trash grows only if the node was offline when deletions happened.
In other words, we shouldn’t have any trash as long as Nodes are near 100% online, because trash comes from the garbage collector.
Is that true? I guess not, because I have GBs of trash on my nodes despite all of them having online scores between 99.80% and 100%. I must have misunderstood something.
If your node can keep up with the delete requests, then your trash will be empty. But if your node can’t keep up with the delete requests or is offline during the time, the next bloom filter will catch most of those and move them to trash.
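To make that a bit more concrete, here’s a rough Go sketch of the bloom filter idea (a toy filter, not the real satellite/storagenode code): the satellite builds a filter of the pieces it still knows about, and the node moves anything not in the filter to trash. A bloom filter can give false positives (a deleted piece that gets kept for now) but never false negatives, which is why it catches “most” of the leftover pieces without ever trashing live data.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// A minimal bloom filter over piece IDs, purely for illustration.
type bloomFilter struct {
	bits []bool
	k    int // number of hash functions
}

func newBloomFilter(size, hashes int) *bloomFilter {
	return &bloomFilter{bits: make([]bool, size), k: hashes}
}

func (b *bloomFilter) positions(pieceID string) []int {
	pos := make([]int, b.k)
	for i := 0; i < b.k; i++ {
		h := fnv.New64a()
		fmt.Fprintf(h, "%d:%s", i, pieceID)
		pos[i] = int(h.Sum64() % uint64(len(b.bits)))
	}
	return pos
}

func (b *bloomFilter) Add(pieceID string) {
	for _, p := range b.positions(pieceID) {
		b.bits[p] = true
	}
}

// Contains may return false positives, but never false negatives,
// so a live piece is never moved to trash.
func (b *bloomFilter) Contains(pieceID string) bool {
	for _, p := range b.positions(pieceID) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	// Satellite side: build a filter of the pieces it still knows about.
	live := newBloomFilter(1<<16, 3)
	live.Add("piece-a")
	live.Add("piece-b")

	// Node side: anything on disk that is not in the filter goes to trash.
	for _, stored := range []string{"piece-a", "piece-b", "piece-deleted"} {
		if !live.Contains(stored) {
			fmt.Println("moving to trash:", stored)
		}
	}
}
```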
I also remember something about “ghost segments”, remnants of failed uploads that have to be cleaned up (e.g. if a client canceled the upload at 50%, then all already uploaded segments need to be removed eventually).
How could a node not keep up with deletes?
Are deletions particularly heavy for a reason I’m not aware of?
Besides, how could a node give up gracefully on deletions? I mean, usually when a node can’t keep up with ingress it simply crashes, gets killed or something nasty like that… Why would it be different for deletions?
Because if deletions are properly handled when a node can’t keep up, then I WANT the same mechanism implemented for ingress, so we don’t have to worry about SMR drives being a problem anymore!
There is a timeout for deletions. When the node fails to delete a piece in that time it gets picked up by the garbage collector.
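Conceptually it’s something like this (just a sketch; the timeout value and function names are made up): the deletion gets a bounded amount of time, and if it doesn’t finish, the piece is simply left on disk for the next garbage collection run to pick up.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// deletePiece stands in for the node's piece deletion; here it just
// pretends the disk is slow so the timeout path is exercised.
func deletePiece(ctx context.Context, pieceID string) error {
	select {
	case <-time.After(2 * time.Second): // simulated slow disk
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func handleDelete(pieceID string) {
	// Give the deletion a bounded amount of time (value chosen arbitrarily).
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	if err := deletePiece(ctx, pieceID); errors.Is(err, context.DeadlineExceeded) {
		// The piece stays on disk for now; the next garbage collection
		// bloom filter will move it to trash instead.
		fmt.Println("delete timed out, leaving", pieceID, "for garbage collection")
		return
	}
	fmt.Println("deleted", pieceID, "immediately")
}

func main() {
	handleDelete("piece-x")
}
```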
It’s generally true. But I don’t think that was the case with this large move to trash. Likewise, my nodes are pretty much always online. The one with the most trash is at 99.966% uptime this past month. There is no way that this amount of trash could collect in the short time it was down during an OS update.
There haven’t been large amounts of deletes lately either, so I highly doubt it’s deletes that timed out. My bet would be on some zombie segments being cleaned up on the satellite end.
Though, if someone reads this later and has the same question, you shouldn’t just assume this is the case. It’s much more common that down time or node slowness is the cause. This seems to be an exception though.
You have 10 nodes running from a Pi 4?
Yes, but it is not as bad as it sounds. Because they are all on the same IP address they only do the work of one node.
top - 08:07:40 up 95 days, 15:04, 2 users, load average: 1.62, 1.59, 1.53
Tasks: 188 total, 2 running, 186 sleeping, 0 stopped, 0 zombie
%Cpu(s): 7.5 us, 9.2 sy, 0.0 ni, 82.4 id, 0.6 wa, 0.0 hi, 0.3 si, 0.0 st
MiB Mem : 3826.9 total, 894.4 free, 781.3 used, 2151.2 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 2830.8 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
489 root 20 0 1095756 71804 30068 S 9.6 1.8 12346:26 dockerd
3265 root 20 0 819632 31620 9592 S 5.6 0.8 699:07.14 storagenode
1 root 20 0 33948 7684 5824 S 4.3 0.2 822:50.72 systemd
2876 root 20 0 820252 37408 11596 S 3.3 1.0 2375:36 storagenode
2339 root 20 0 14832 6512 5336 S 2.6 0.2 563:19.55 systemd
424 root 20 0 437772 381628 5260 S 2.3 9.7 170:24.17 udisksd
487 root 20 0 1058476 37204 16628 S 2.0 0.9 507:48.94 containerd
3167 root 20 0 819172 32928 11228 S 2.0 0.8 707:21.30 storagenode
18207 root 20 0 805268 7596 5780 R 1.7 0.2 0:00.05 watchtower
1845 root 20 0 0 0 0 I 1.3 0.0 0:03.50 kworker/u8:0-netns
119 root 20 0 37720 8460 7524 S 0.7 0.2 178:05.77 systemd-journal
9400 root 20 0 10432 2540 1884 S 0.7 0.1 1374:35 top
18157 root 20 0 10428 2808 2424 R 0.7 0.1 0:00.09 top
18176 root 20 0 18512 2636 1712 S 0.7 0.1 0:00.02 systemd-udevd
10 root 20 0 0 0 0 I 0.3 0.0 144:12.47 rcu_sched
20 root 20 0 0 0 0 S 0.3 0.0 75:04.52 ksoftirqd/2
25 root 20 0 0 0 0 S 0.3 0.0 68:02.23 ksoftirqd/3
85 root 20 0 0 0 0 S 0.3 0.0 405:22.20 usb-storage
155 root 20 0 18512 4096 3176 S 0.3 0.1 58:49.44 systemd-udevd
398 avahi 20 0 6116 2932 2344 S 0.3 0.1 53:19.81 avahi-daemon
416 message+ 20 0 6688 2872 2336 S 0.3 0.1 86:57.41 dbus-daemon
3370 root 20 0 819284 31824 9540 S 0.3 0.8 706:58.68 storagenode
13990 root 20 0 0 0 0 I 0.3 0.0 0:00.36 kworker/2:3-events_po+
root@raspberrypi1:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b29b26a226c6 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20009->14002/tcp, 0.0.0.0:10009->28967/tcp storagenode9
4fece093de82 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20008->14002/tcp, 0.0.0.0:10008->28967/tcp storagenode8
2ff8155ecd3c storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20007->14002/tcp, 0.0.0.0:10007->28967/tcp storagenode7
f3af299a0cd5 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20006->14002/tcp, 0.0.0.0:10006->28967/tcp storagenode6
74a195caaf4e storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20005->14002/tcp, 0.0.0.0:10005->28967/tcp storagenode5
745384efa721 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20002->14002/tcp, 0.0.0.0:10002->28967/tcp storagenode2
18ef828c4b03 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20004->14002/tcp, 0.0.0.0:10004->28967/tcp storagenode4
c7821a666e93 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20001->14002/tcp, 0.0.0.0:10001->28967/tcp storagenode1
684a230d9a05 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20003->14002/tcp, 0.0.0.0:10003->28967/tcp storagenode3
723167dcc279 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20010->14002/tcp, 0.0.0.0:10010->28967/tcp storagenode10
66236cddd795 storjlabs/watchtower "/watchtower storage…" 3 months ago Up 5 seconds watchtower
root@raspberrypi1:~#
How did you connect those 10 nodes to your Pi? A USB hub? The bandwidth of those nodes is really low… not to mention the latency… but most importantly, I think 10 Docker nodes on an ARM device is way too much… but that is just my opinion, and I think the problem is as described… too much load for the poor Pi.
I wonder if you have a problem with your Raspberry Pi 4.
I got 300MB/s in this post: Post pictures of your storagenode rig(s)
Your 600Mb/s seems low in comparison.
Perhaps check that the firmware is updated?
Is your cable damaged?
It is not a huge issue since the network load is low anyway.