Wait, does that log mean that garbage collection for 1 satellite took 25 hours?
Yup… so if you're keeping score, in total that's about 58 hours, 58 minutes to move garbage, which will then need another run to delete it in 7 days. And it's probably only about 116GB of data at that. The retain rate is about 47GB per day, so it'd take about 1 month to clear ~1.4TB. Good thing TTL (SLC) is a thing, or it'd never even cycle before it all expired.
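The back-of-envelope math checks out; a quick sketch (numbers rounded from the figures above):

```shell
# Rough retain-rate math from the numbers above (all values approximate).
moved_gb=116        # garbage moved in one retain cycle
hours=59            # ~58h58m, rounded
backlog_gb=1400     # ~1.4TB waiting to be cleared

rate_gb_per_day=$(( moved_gb * 24 / hours ))
days_to_clear=$(( backlog_gb / rate_gb_per_day ))

echo "rate: ${rate_gb_per_day} GB/day, days to clear: ${days_to_clear}"
```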
2 potatoes
Retain is limited to 1 concurrent process: https://review.dev.storj.io/c/storj/storj/+/13081. It might be that with the badger cache on, more than one retain could run in parallel and speed things up.
And also there is this setting in config.yaml:
# how many piece delete workers
# storage2.delete-workers: 1
It sounds like this setting can change the number of parallel deletions. This might increase the speed of deletion too.
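For reference, a minimal config.yaml fragment with that setting uncommented might look like this (the value 5 is only an illustration; more workers mean more parallel IO, which helps only up to what the disk can absorb):

```yaml
# config.yaml (sketch): uncomment and raise the worker count
# how many piece delete workers (default is 1)
storage2.delete-workers: 5
```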
Yeah, it sure might increase speed, and/or put more parallel IO pressure on the disk, making it more efficient… to a point, then less.
I thought I would share my numbers. I set up badger on my most problematic, sad node.
Node setup: single core, only 1GB RAM (and 1GB swap), and the storage is mounted over NFS. Yes, Alexey has already scolded me for this. This node had occasional problems with filewalkers not finishing due to "context canceled", or just taking days and days to run and not finishing before a restart.
I used Docker and set up the badger cache on SSD (just like the Storj DB files already were). The data size for Storj is about 11TB.
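For anyone wanting to reproduce this: as far as I know, the cache is switched on via the pieces.file-stat-cache option in config.yaml (treat the exact option name as an assumption and check the config reference for your storagenode version):

```yaml
# config.yaml (sketch; option name as I understand it, verify for your version)
pieces.file-stat-cache: badger
```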
The node seemed to struggle under the increased load. It sometimes restarts (the "online" time in the dashboard resets, and the filewalkers, if previously running, start again without a completion or failure message in the log). I'm not sure what to look for to see why the restarts are happening.
But while the node is still operating, with the filewalkers running, the badger cache being built, and uploads still coming in… the node seems to get RAM constrained, in that the swap file fills up.
At the moment the node is in more of a steady state. There is still a used-space filewalker running, but the badger cache is mostly built and the disk is full, so there are no more uploads. The system now shows about 500MB of swap occupied but also 500MB of RAM free. Oh, and docker stats shows 330MB used by storagenode, but I've seen over 500MB and maybe 800MB when things were busier.
Size-wise, my badger directory is just over 900MB. That's with 11TB of Storj data, although a used-space filewalker is still running for US1. (Correction: 1.7GB after all filewalkers were done.)
Reliability-wise, I still have the spontaneous reboot issue, which may or may not be new, and I also still have filewalkers fail with "context canceled", which is definitely an old issue.
Subjectively, I definitely see more activity on the SSD hosting the badger cache: from virtually none to some pretty significant usage, often in short spikes.
Also, when the filewalker is running and using badger-cached info, it seems to actually work the storage array harder. The hard drive with the data and the cache SSD with the metadata show more transactions and a higher busy %. It's almost like running four non-badger filewalkers.
Oh, and time, the most important part.
Before, running the used-space filewalker for SLC took… really long? I wasn't even able to get a single run to finish. But at least 16 hours. At least.
After the badger cache is built, a used-space filewalker for SLC takes 1 hour.
So TL;DR: the cache hasn't really made the node more reliable, but the filewalkers are now so much faster that they actually have a chance to finish before an error.
Compared to sometimes running for days, or not completing at all: thatās a huge improvement!
If it's working in your configuration, I have no objections. It's just that 1GB of RAM is too small for any network filesystem plus storagenode, because storagenode will buffer all writes in RAM if the disk subsystem cannot keep up, and NFS unfortunately provokes this. For NFS you need at least 4GB of RAM to be stable. It also depends on the NFS configuration on both the server and the client. But it seems you already configured it correctly, since your node is still alive, just with too little RAM.
I guess it's either because of OOM (you may search journalctl for OOM) or because of a failed readability/writeability check (you may search for Unrecoverable and/or FATAL in your logs).
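In shell form, the two searches might look like this (the sample log line at the end is made up purely to demonstrate the filter):

```shell
# Check the kernel log for OOM kills (requires systemd/journalctl):
#   journalctl -k | grep -i "oom-kill"
# Check the node log for failed readability/writeability checks:
#   grep -E "Unrecoverable|FATAL" /path/to/storagenode.log
# The same filter, demonstrated on a hypothetical sample line:
echo "2024-08-23T20:00:00Z FATAL piecestore writeability check failed" \
  | grep -cE "Unrecoverable|FATAL"
```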
This is a good result, thank you! I didn't expect that with NFS it could use such a low amount of RAM.
Ah, that explains what I've seen when the NFS connection goes down: the storagenode starts buffering, the Docker container runs out of RAM, and then the whole node sort of runs out of RAM and becomes almost unusable. It often requires a full reboot. Of course, if the NFS connection goes down, then no amount of RAM will save the node. If only someone had warned against using NFS mounts…
Yes, but in most setups NFS provokes higher RAM usage: Search results for 'memory usage #nfs order:latest' - Storj Community Forum (official)
So your setup seems a little better (perhaps you used a separate network for the NFS shares).
here we go: Step 1. Understand Prerequisites - Storj Docs
Good guess, journalctl shows this OOM thingy:
kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=init.scope,mems_allowed=0,global_oom,task_memcg=/system.slice/docker-fbf712c8388ece9816713d7808d15995045edb6b5e38da9512b68092b819493f.scope,task=storagenode,pid=38792,uid=0
Do you know whether this is the whole OS running out of RAM, or is it controlled by the limit I set in Docker? (I had set an 800M limit.) It feels like it's OS level, but I'm pretty ignorant on this stuff.
try this config option:
filestore.write-buffer-size: 256.0 KiB
…or 128 KiB, whatever.
And/Or:
storage2.max-concurrent-requests: 5
… or 50, whatever.
default: 0 (infinite, or 1000 probably)
So your node isnāt forced to max out and go Boom!
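Combined in config.yaml, those two knobs might look like this (the values are illustrative starting points, not recommendations):

```yaml
# config.yaml (sketch): reduce memory pressure on a constrained node
# smaller per-upload write buffer means less RAM per concurrent upload
filestore.write-buffer-size: 128.0 KiB
# reject new uploads beyond this many concurrent requests (0 = unlimited)
storage2.max-concurrent-requests: 5
```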
2 cents
This is on a 19.5TB node. The initial filewalk with badger took 5+ days; it had multiple restarts due to updates, so I don't have the initial filewalk total time. This is after the badger cache was built, and it took a little less than 2 hours to finish:
2024-08-23T20:49:18Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-08-23T20:52:42Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Lazy File Walker": false, "Total Pieces Size": 979614147840, "Total Pieces Content Size": 978551361792}
2024-08-23T20:52:42Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-08-23T20:53:36Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Lazy File Walker": false, "Total Pieces Size": 192093495552, "Total Pieces Content Size": 191733043456}
2024-08-23T20:53:36Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-08-23T22:10:19Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Lazy File Walker": false, "Total Pieces Size": 11891298817280, "Total Pieces Content Size": 11869725311744}
2024-08-23T22:10:19Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-08-23T22:42:52Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Lazy File Walker": false, "Total Pieces Size": 5919990784810, "Total Pieces Content Size": 5902267177258}
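A quick way to turn those log timestamps into durations (GNU date; for example, the first satellite above took 3m24s):

```shell
# Subtract two log timestamps (GNU date) to get a filewalker's runtime.
start="2024-08-23T20:49:18Z"
end="2024-08-23T20:52:42Z"
secs=$(( $(date -ud "$end" +%s) - $(date -ud "$start" +%s) ))
echo "${secs}s"
```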
It could be both. But if you have OOM (which I would expect with NFS and only 1GB of RAM), then the OS perhaps does not even have 800MB free (not all OSes are equal…).
Theoretically, the limit should advertise only that amount of available RAM to the container, and the application could take it into account (run GC more often when it's close to the limit, evict some buffers, etc.), so the application could manage how much RAM it actually takes… Not sure that this is implemented in storagenode…
Itās infinite by default (0).
Thanks, you just confirmed my assumption that even if the node cannot manage to finish the filewalker in one go, with a badger cache it would actually be able to finish after several restarts (because each time it would get further).
One thing I see is that after activating the badger cache, Windows uses a big amount of RAM; I think it is caching these files, so it's indirect usage.
It is a server with 17 nodes.
It also does the same for the docker node:
comment CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
badger=true, lazy=false d8aab63f299d storagenode2 10.15% 1.135GiB / 24.81GiB 4.57% 251GB / 102GB 0B / 0B 93
badger=false, lazy=true 3d20fef76e67 storagenode5 7.08% 136MiB / 24.81GiB 0.54% 54.9GB / 21.9GB 0B / 0B 86
$ free
total used free shared buff/cache available
Mem: 26010664 2481748 12575016 11488 10953900 23109976
Swap: 7340032 0 7340032
However, it seems it's not like that for memory-constrained systems, like @snorkel's or @jammerdan's. But I want to re-verify when their systems finish the used-space-filewalker with the badger cache enabled. It would be interesting to see the memory and CPU footprint. I would expect it to return to normal when all scans are completed.
Correcting myself: my badger cache for 11TB of data is about 1.7GB, now that all filewalkers have definitely finished.
The badger cache seems to be running but I cannot spot a log entry associated with it.
Am I missing something?
Log level Info? There is not a lot that goes to the log about badger itself.