New Garbage Collection enabled on almost all satellites

None of that should be the case, but I will investigate further. Thank you for pointing me in the right direction! I suspect another service is killing the IO, but I need to wait for that situation to happen again (before the OOM).

Short update from the developer team:

AP1 is running the ranged loop with the audit observer, so far no visible problems :tada:
For now I uploaded fresh objects manually into EUN1 and SLC to work around the problem with GC bloom filter creation time
EUN1 GC was triggered a few hours ago
SLC GC will be triggered today
I plan to bump the ranged loop config for the GC process (EUN1, SLC) to 4 ranges instead of the default 2, to see how it behaves
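For reference, the bump described above would be a one-line change in the satellite config. This is only a sketch: the `ranged-loop.parallelism` key name is an assumption based on the flag naming convention, not a confirmed option.

```yaml
# Hypothetical satellite config fragment — key name assumed, not confirmed.
# Process the segment loop in 4 parallel ranges instead of the default 2,
# to speed up GC bloom filter creation.
ranged-loop.parallelism: 4
```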

I just set “maximum uploads allowed” to 10 in config.yaml.
Don’t know what happened with the new release, but it was killing my node frequently.
DNS is OK, nothing changed, just the new release.
Monitoring now with this setting.
I’ve used it successfully in the past to limit the load on an RPi.
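The setting described above most likely corresponds to the storagenode's concurrent-request limit. A sketch of what that would look like in config.yaml; the exact key name is an assumption and may differ between versions:

```yaml
# Assumed storagenode config key — verify against your node's config.yaml.
# Rejects uploads beyond this limit so a slow disk (e.g. on an RPi)
# is not overwhelmed by simultaneous writes.
storage2.max-concurrent-requests: 10
```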

Sounds like an SMR disk used for storage.

Hell no :slight_smile: Come on @Alexey, you should’ve known me better by now :stuck_out_tongue:

Maybe you started another node; how would I know? :slight_smile:
A Raspi3 was able to handle the load, so I do not know why yours cannot.

Last month's load is 2-3x the old one, so maybe it can't catch up any more. It could also be a USB problem: lots of small reads and writes, and then the Raspberry runs out of RAM.

For the out-of-RAM problem we recommend specifying a RAM limit for the storagenode, see
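For a Docker-based node, a RAM limit can be set with standard Docker flags. This is a generic sketch, not an official recommendation; the 800m value is illustrative only.

```shell
# Cap the storagenode container's RAM using standard Docker limits.
# --memory-swap equal to --memory disables additional swap usage.
# 800m is an example value, not an official recommendation.
docker run -d \
  --memory=800m --memory-swap=800m \
  --name storagenode \
  storjlabs/storagenode:latest
```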

And what will happen to the node when the limit is reached? A restart? Then the filewalker will again kill half of the performance, or even more. Maybe it would be better to limit performance a little instead.

It respects the limit. However, if the disk is too slow, it could try to take more RAM and will be killed, yes. But at least it will not hang the device.

Another update: we plan to switch AP1 to the updated GC process today or tomorrow. We will trigger GC right away after the change is applied, to verify the process.


This time my storage nodes deleted some garbage from AP1 121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6:

2023-04-04T04:49:31.579+0200   INFO    retain  Prepared to run a Retain request.       {"Process": "storagenode", "Created Before": "2023-03-31T11:59:59.679Z", "Filter Size": 239182, "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2023-04-04T04:49:41.365+0200   INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "num deleted": 1265, "Retain Status": "enabled"}
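The "Filter Size" and "num deleted" fields in those log lines fit the bloom-filter GC design: the satellite sends a filter covering every piece it still knows about, and the node moves to trash any piece created before the cutoff that is not in the filter. A minimal sketch of that decision in Go; this is an illustration of the technique, not Storj's actual implementation, and all names here (`bloom`, `countTrash`) are made up for the example:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a toy Bloom filter: a bit set with k derived hash positions.
type bloom struct {
	bits []bool
	k    int
}

func newBloom(m, k int) *bloom { return &bloom{bits: make([]bool, m), k: k} }

// pos derives the i-th hash position for a piece ID using FNV-1a.
func (b *bloom) pos(id string, i int) int {
	h := fnv.New64a()
	fmt.Fprintf(h, "%d:%s", i, id)
	return int(h.Sum64() % uint64(len(b.bits)))
}

// Add records a piece the satellite still knows about.
func (b *bloom) Add(id string) {
	for i := 0; i < b.k; i++ {
		b.bits[b.pos(id, i)] = true
	}
}

// MightContain reports whether id could be in the set. False positives are
// possible (garbage survives one cycle); false negatives are not, so a
// live piece is never trashed.
func (b *bloom) MightContain(id string) bool {
	for i := 0; i < b.k; i++ {
		if !b.bits[b.pos(id, i)] {
			return false
		}
	}
	return true
}

// countTrash counts stored pieces that are absent from the filter,
// i.e. the "num deleted" a retain run would report.
func countTrash(f *bloom, stored []string) int {
	n := 0
	for _, id := range stored {
		if !f.MightContain(id) {
			n++
		}
	}
	return n
}

func main() {
	// Satellite side: build a filter over the pieces it still tracks.
	f := newBloom(1<<16, 4)
	f.Add("piece-A")
	f.Add("piece-B")

	// Node side: "piece-old" is absent from the filter, so it is garbage.
	stored := []string{"piece-A", "piece-B", "piece-old"}
	fmt.Println("num deleted:", countTrash(f, stored))
}
```

The one-sided error is the key design choice: a too-small filter only means some garbage is kept for another cycle, never that live data is deleted, which is why the satellite can trade filter size against precision.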

So far no "piece not found" download errors. It looks like the new garbage collection is running fine. Tomorrow we want to enable it on EU1. If you have a chance, please take a look at your log files in the next couple of hours.
