I noticed my 10-node Pi 4 setup is building up trash. That’s the first thing that has come close to being a problem with it.
I don’t think the trash is a bad thing; it just means that some files that weren’t needed were finally caught by the garbage collection script (at least I think that’s how it works).
I see no reason why a SNO cannot operate hundreds of nodes; the point is decentralization, but at the end of the day it is a free market.
Trash shouldn’t pile up forever, as trashed pieces get deleted 7 days later.
So wait and see, it should decrease eventually.
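Just to illustrate the 7-day rule, here’s a rough Go sketch of what a trash cleanup could look like (not the actual storagenode code; the path and the use of file modification times as the trash timestamp are assumptions):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

func main() {
	// Hypothetical trash path, used here only for illustration.
	trashDir := "/mnt/storagenode/storage/trash"
	cutoff := time.Now().Add(-7 * 24 * time.Hour)

	// Remove anything that has been sitting in the trash for more than 7 days,
	// using the file's modification time as a stand-in for when it was trashed.
	filepath.Walk(trashDir, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		if info.ModTime().Before(cutoff) {
			fmt.Println("purging", path)
			return os.Remove(path)
		}
		return nil
	})
}
```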
My largest node has collected about 40GB of trash over the last few days as well. This one is pretty much always online, so it might be some cleaning up on the satellite end. No big deal though.
Interestingly, on my oldest, full 4TB node, I have only 250 MB of trash. But on my newer node with about 1.5TB of 2TB used, I have 12 GB. It must be recent data that is being purged.
You are right. Amazing observation.
Yep, I noticed the same thing.
If you’re talking about the amount of trash, I think previous posts confirmed that there have been some deletions lately, yes.
My newer nodes have several GB of trash too (not as much as @BrightSilence, but up to 4GB nevertheless). Nothing to worry about if you ask me though…
Well, my 7TB (full) node has only 86MB of trash. My 2 nodes with ~2TB each have 1.5GB of trash… So my nodes can’t confirm any heavy deletions (at least none that couldn’t be handled by the nodes live).
My nodes have a low trash level too, so I assume customers’ behavior is normal.
They can delete or upload anything at any time.
I read somewhere (or at least I understood at some point) that normal deletions do not create trash on Nodes, and that trash grows only if the node was offline when deletions happened.
In other words, we shouldn’t have any trash as long as Nodes are near 100% online, because trash comes from the garbage collector.
Is that true? I guess not, because I have GBs of trash on my nodes despite all of them having online scores between 99.80% and 100%. I must have misunderstood something.
If your node can keep up with the delete requests, then your trash will be empty. But if your node can’t keep up with the delete requests or is offline during the time, the next bloom filter will catch most of those and move them to trash.
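To make that a bit more concrete, here’s a rough Go sketch of the bloom filter idea (a toy filter, not the real satellite/storagenode code): the satellite builds a filter of the pieces it still knows about, and the node moves anything not in the filter to trash. A bloom filter can give false positives (a deleted piece that gets kept for now) but never false negatives, which is why it catches “most” of the leftover pieces without ever trashing live data.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// A minimal bloom filter over piece IDs, purely for illustration.
type bloomFilter struct {
	bits []bool
	k    int // number of hash functions
}

func newBloomFilter(size, hashes int) *bloomFilter {
	return &bloomFilter{bits: make([]bool, size), k: hashes}
}

func (b *bloomFilter) positions(pieceID string) []int {
	pos := make([]int, b.k)
	for i := 0; i < b.k; i++ {
		h := fnv.New64a()
		fmt.Fprintf(h, "%d:%s", i, pieceID)
		pos[i] = int(h.Sum64() % uint64(len(b.bits)))
	}
	return pos
}

func (b *bloomFilter) Add(pieceID string) {
	for _, p := range b.positions(pieceID) {
		b.bits[p] = true
	}
}

// Contains may return false positives, but never false negatives,
// so a live piece is never moved to trash.
func (b *bloomFilter) Contains(pieceID string) bool {
	for _, p := range b.positions(pieceID) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	// Satellite side: build a filter of the pieces it still knows about.
	live := newBloomFilter(1<<16, 3)
	live.Add("piece-a")
	live.Add("piece-b")

	// Node side: anything on disk that is not in the filter goes to trash.
	for _, stored := range []string{"piece-a", "piece-b", "piece-deleted"} {
		if !live.Contains(stored) {
			fmt.Println("moving to trash:", stored)
		}
	}
}
```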
I also remember something about “ghost segments”, remnants of failed uploads that have to be cleaned up (e.g. if a client canceled the upload at 50%, then all already uploaded segments need to be removed eventually).
How could a node not keep up with deletes?
Are deletions particularly heavy for a reason I’m not aware of?
Besides, how could a node give up gracefully on deletions? I mean, usually when a node can’t keep up with ingress it simply crashes, gets killed or something nasty like that… Why would it be different for deletions?
Because if deletions are properly handled when a node can’t keep up, then I WANT the same mechanism implemented for ingress, so we don’t have to worry about SMR drives being a problem anymore!
There is a timeout for deletions. When the node fails to delete a piece in that time it gets picked up by the garbage collector.
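Conceptually it’s something like this (just a sketch; the timeout value and function names are made up): the deletion gets a bounded amount of time, and if it doesn’t finish, the piece is simply left on disk for the next garbage collection run to pick up.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// deletePiece stands in for the node's piece deletion; here it just
// pretends the disk is slow so the timeout path is exercised.
func deletePiece(ctx context.Context, pieceID string) error {
	select {
	case <-time.After(2 * time.Second): // simulated slow disk
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func handleDelete(pieceID string) {
	// Give the deletion a bounded amount of time (value chosen arbitrarily).
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	if err := deletePiece(ctx, pieceID); errors.Is(err, context.DeadlineExceeded) {
		// The piece stays on disk for now; the next garbage collection
		// bloom filter will move it to trash instead.
		fmt.Println("delete timed out, leaving", pieceID, "for garbage collection")
		return
	}
	fmt.Println("deleted", pieceID, "immediately")
}

func main() {
	handleDelete("piece-x")
}
```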
It’s generally true. But I don’t think that was the case with this large move to trash. Likewise, my nodes are pretty much always online. The one with the most trash is at 99.966% uptime this past month. There is no way that this amount of trash could collect in the short time it was down during an OS update.
There haven’t been large amounts of deletes lately either, so I highly doubt it’s deletes that timed out. My bet would be on some zombie segments being cleaned up on the satellite end.
Though, if someone reads this later and has the same question, you shouldn’t just assume this is the case. It’s much more common that down time or node slowness is the cause. This seems to be an exception though.
You have 10 nodes running from a Pi 4?
Yes, but it is not as bad as it sounds. Because they are all on the same IP address they only do the work of one node.
top - 08:07:40 up 95 days, 15:04, 2 users, load average: 1.62, 1.59, 1.53
Tasks: 188 total, 2 running, 186 sleeping, 0 stopped, 0 zombie
%Cpu(s): 7.5 us, 9.2 sy, 0.0 ni, 82.4 id, 0.6 wa, 0.0 hi, 0.3 si, 0.0 st
MiB Mem : 3826.9 total, 894.4 free, 781.3 used, 2151.2 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 2830.8 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
489 root 20 0 1095756 71804 30068 S 9.6 1.8 12346:26 dockerd
3265 root 20 0 819632 31620 9592 S 5.6 0.8 699:07.14 storagenode
1 root 20 0 33948 7684 5824 S 4.3 0.2 822:50.72 systemd
2876 root 20 0 820252 37408 11596 S 3.3 1.0 2375:36 storagenode
2339 root 20 0 14832 6512 5336 S 2.6 0.2 563:19.55 systemd
424 root 20 0 437772 381628 5260 S 2.3 9.7 170:24.17 udisksd
487 root 20 0 1058476 37204 16628 S 2.0 0.9 507:48.94 containerd
3167 root 20 0 819172 32928 11228 S 2.0 0.8 707:21.30 storagenode
18207 root 20 0 805268 7596 5780 R 1.7 0.2 0:00.05 watchtower
1845 root 20 0 0 0 0 I 1.3 0.0 0:03.50 kworker/u8:0-netns
119 root 20 0 37720 8460 7524 S 0.7 0.2 178:05.77 systemd-journal
9400 root 20 0 10432 2540 1884 S 0.7 0.1 1374:35 top
18157 root 20 0 10428 2808 2424 R 0.7 0.1 0:00.09 top
18176 root 20 0 18512 2636 1712 S 0.7 0.1 0:00.02 systemd-udevd
10 root 20 0 0 0 0 I 0.3 0.0 144:12.47 rcu_sched
20 root 20 0 0 0 0 S 0.3 0.0 75:04.52 ksoftirqd/2
25 root 20 0 0 0 0 S 0.3 0.0 68:02.23 ksoftirqd/3
85 root 20 0 0 0 0 S 0.3 0.0 405:22.20 usb-storage
155 root 20 0 18512 4096 3176 S 0.3 0.1 58:49.44 systemd-udevd
398 avahi 20 0 6116 2932 2344 S 0.3 0.1 53:19.81 avahi-daemon
416 message+ 20 0 6688 2872 2336 S 0.3 0.1 86:57.41 dbus-daemon
3370 root 20 0 819284 31824 9540 S 0.3 0.8 706:58.68 storagenode
13990 root 20 0 0 0 0 I 0.3 0.0 0:00.36 kworker/2:3-events_po+
root@raspberrypi1:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b29b26a226c6 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20009->14002/tcp, 0.0.0.0:10009->28967/tcp storagenode9
4fece093de82 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20008->14002/tcp, 0.0.0.0:10008->28967/tcp storagenode8
2ff8155ecd3c storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20007->14002/tcp, 0.0.0.0:10007->28967/tcp storagenode7
f3af299a0cd5 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20006->14002/tcp, 0.0.0.0:10006->28967/tcp storagenode6
74a195caaf4e storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20005->14002/tcp, 0.0.0.0:10005->28967/tcp storagenode5
745384efa721 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20002->14002/tcp, 0.0.0.0:10002->28967/tcp storagenode2
18ef828c4b03 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20004->14002/tcp, 0.0.0.0:10004->28967/tcp storagenode4
c7821a666e93 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20001->14002/tcp, 0.0.0.0:10001->28967/tcp storagenode1
684a230d9a05 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20003->14002/tcp, 0.0.0.0:10003->28967/tcp storagenode3
723167dcc279 storjlabs/storagenode:latest "/entrypoint" 2 weeks ago Up 2 weeks 0.0.0.0:20010->14002/tcp, 0.0.0.0:10010->28967/tcp storagenode10
66236cddd795 storjlabs/watchtower "/watchtower storage…" 3 months ago Up 5 seconds watchtower
root@raspberrypi1:~#
How did you connect those 10 nodes to your Pi? A USB hub? The bandwidth of those nodes is really low… not to mention the latency… but most importantly, I think 10 Docker nodes on an ARM device is way too much… but that is just my opinion, and I think the problem is as described… too much load for the poor Pi.
I wonder if you have a problem with your Raspberry Pi 4.
I got 300MB/s in this post: Post pictures of your storagenode rig(s)
Your 600Mb/s seems low in comparison.
Perhaps check that the firmware is updated?
Is your cable damaged?
It is not a huge issue since the network load is low anyway.