GC successful, bloomfilter not deleted, new bloomfilter not processed

While a (resumed) bloomfilter for 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs was being processed, a new bloomfilter for that satellite was received.

config.yaml
# how many concurrent retain requests can be processed at the same time.
# retain.concurrency: 5

Processing has finished:
2024-05-12T11:14:12Z INFO lazyfilewalker.gc-filewalker.subprocess gc-filewalker completed {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"

But:

/config/retain
May  6 10:56 ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa-1714499999998475000
May  6 17:03 v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa-1714759199976870000
May 11 14:59 v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa-1715363999948993000

The bloomfilter for 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs dated the 6th has been processed successfully but has not been deleted from the retain folder.
There is no indication that the new bloomfilter dated the 11th is being processed. There is no log entry about a new GC process for that satellite and there is no new date folder in the trash for that bloomfilter:

ls /storage/trash/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/
2024-05-06  2024-05-07

AFAIK a new GC start would create a new date folder named with the start date, so I would expect to see a date folder for the 11th or later.


Do you also have a succeeded retain for this satellite and this BF?
Is the lazy mode on or off?
Do the permissions match the other data in the config folder?


Do you mean this?
2024-05-12T11:14:27Z INFO retain Moved pieces to trash during retain {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 980751, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 2589893, "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Duration": "121h34m16.533753733s", "Retain Status": "enabled"}

It looks finished to me, I don’t see an error.
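As a side note, the Duration field in that log line works out to just over five days, which helps explain how a new bloomfilter could arrive while the old one was still being processed. A quick check in plain Go (nothing storagenode-specific, just parsing the value quoted above):

package main

import (
	"fmt"
	"time"
)

func main() {
	// Duration taken verbatim from the retain log line above.
	d, err := time.ParseDuration("121h34m16.533753733s")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.2f days\n", d.Hours()/24) // prints roughly 5.07 days
}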

--storage2.piece-scan-on-startup=false \
--pieces.enable-lazy-filewalker=true \
/config/retain
-rw-r--r-- 1 root root 2445764 May  6 10:56 ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa-1714499999998475000
-rw-r--r-- 1 root root 1224954 May  6 17:03 v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa-1714759199976870000
-rw-r--r-- 1 root root 1167648 May 11 14:59 v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa-1715363999948993000
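As an aside, the numeric suffix on those retain file names looks like a Unix timestamp in nanoseconds. That is only my assumption (I have not checked the storagenode source), but if it holds, the three suffixes decode to roughly 2024-04-30, 2024-05-03 and 2024-05-10 (UTC), which fits two separate bloomfilters for the same satellite. A small Go sketch of that decoding:

package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// Assumption: the part after the last '-' in a retain cache file name is a
// Unix timestamp in nanoseconds. The real naming scheme may differ.
func decodeRetainName(name string) (time.Time, error) {
	i := strings.LastIndex(name, "-")
	if i < 0 {
		return time.Time{}, fmt.Errorf("no '-' in %q", name)
	}
	ns, err := strconv.ParseInt(name[i+1:], 10, 64)
	if err != nil {
		return time.Time{}, err
	}
	return time.Unix(0, ns).UTC(), nil
}

func main() {
	names := []string{
		"ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa-1714499999998475000",
		"v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa-1714759199976870000",
		"v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa-1715363999948993000",
	}
	for _, n := range names {
		t, err := decodeRetainName(n)
		if err != nil {
			fmt.Println(n, err)
			continue
		}
		fmt.Printf("%s...  %s\n", n[:12], t.Format(time.RFC3339))
	}
}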

The whole process seemed to be working, as I think I even saw a bloomfilter for the Saltlake satellite that was received, processed, and deleted. I see a date folder there for the 9th, so that one had no issue in being deleted after processing.

It appears as if the problem exists because there are 2 files for the same satellite: it looks like the node does not know which one to delete and which one to process now.

Does all other data have the same permissions (root:root)?
Do you use --user option in your docker run command?

Which data do you mean?

No.


Everything else in the /app/config folder inside the container: the storage subfolder and its databases, blobs, trash, etc.

Yes. I mean I have GC runs that completed successfully for Saltlake and AP-1.
US-1 is currently running, so I cannot tell the outcome yet.
But this satellite, where I have 2 bloomfilters, is the only one that shows this issue.

Looks similar to this one:

I don’t see especially high CPU usage though.

What’s the version of the storagenode?

It is version 1.102.3

Why not set retain.concurrency to 1 by default?

I believe the default value has changed. I am not sure in which version it was. The generated config file would still contain the old default value commented out.


It seems that this would only fix the CPU issue, which I cannot see on my hardware.

High CPU usage by node caused by receiving a new bloom filter while still processing the old one. Also new gc won't start after processing old bloom filter · Issue #6946 · storj/storj · GitHub

As a temporary workaround you can try reducing --retain.concurrency from the default 5 to something lower to reduce CPU load.

Even in version 1.104.1, the default value remains the same: 5. Release versions don’t have the fix yet.

I’m pretty sure you also have a CPU load problem (like in High CPU usage by nodes caused by receiving a new bloom filter while still processing the old one), just don’t expect a single node to load all cores on your CPU. In this case, the node would load one core to 100%. If you use Docker, please check the output of “sudo docker stats”; you’ll see the CPU load caused by every node on the server.

And I believe that setting retain.concurrency to 1 should fix both problems: the CPU load and the processing of the new bloomfilter.
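To illustrate what a limit like retain.concurrency conceptually does (this is only a generic bounded-worker sketch, not the actual storagenode code): with a limit of 1, a newly received bloomfilter simply waits in line instead of being processed in parallel with the one already running, so only one core is kept busy by this work.

package main

import (
	"fmt"
	"sync"
	"time"
)

// Generic bounded-worker sketch; NOT the storagenode implementation.
// `concurrency` plays the role of a setting like retain.concurrency.
func processRetain(requests []string, concurrency int) {
	sem := make(chan struct{}, concurrency) // at most `concurrency` jobs at once
	var wg sync.WaitGroup
	for _, r := range requests {
		wg.Add(1)
		sem <- struct{}{} // blocks when the limit is reached
		go func(req string) {
			defer wg.Done()
			defer func() { <-sem }()
			fmt.Println("processing bloomfilter", req)
			time.Sleep(100 * time.Millisecond) // stand-in for the real piece walk
		}(r)
	}
	wg.Wait()
}

func main() {
	// With concurrency = 1 the second bloomfilter is only started after the
	// first one has finished.
	processRetain([]string{"us1-received-2024-05-06", "us1-received-2024-05-11"}, 1)
}

In practice that would mean adding --retain.concurrency=1 to the run flags or uncommenting retain.concurrency in config.yaml, as shown earlier in this thread.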

Too late. The node had restarted.
But I can confirm this issue now again:

The node is still processing the bloomfilter from the 6th for US-1. It has now been interrupted and resumed for the second time, meaning it created a new folder with today's date as the name.
Still the same old bloomfilter.
So this data could be up to 3 weeks old when it finally gets deleted. If it had been collected in the initial date folder, it would get deleted with the very next trash chore run.
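Rough arithmetic, assuming trash is kept for 7 days counted from the date in the folder name (that retention period is my assumption; the exact trash chore rules may differ): a sketch that turns a trash date folder name into the earliest date its contents could be permanently removed.

package main

import (
	"fmt"
	"time"
)

// Assumption: trash is retained for 7 days after the date encoded in the
// folder name. The real trash chore may apply different rules.
const trashRetention = 7 * 24 * time.Hour

func earliestRemoval(folder string) (time.Time, error) {
	day, err := time.Parse("2006-01-02", folder)
	if err != nil {
		return time.Time{}, err
	}
	return day.Add(trashRetention), nil
}

func main() {
	// A folder created on a later restart/resume can only be removed
	// correspondingly later than the original date folder.
	for _, f := range []string{"2024-05-06", "2024-05-12"} {
		t, err := earliestRemoval(f)
		if err != nil {
			fmt.Println(f, err)
			continue
		}
		fmt.Printf("%s -> not removed before %s\n", f, t.Format("2006-01-02"))
	}
}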

That doesn’t matter in my opinion. As far as I know, the trash-filewalker wouldn’t start cleaning a directory until the garbage collector has finished filling it. So it’s not guaranteed in which case the trash will be stored longer: if the garbage collector uses a single directory for all trash collected by one bloomfilter, or if it creates a few directories with different dates on every restart. I would prefer that it create a new directory every day, because the gc-lazyfilewalker may run for 7 days or more on a single bloomfilter. I still have a few directories from April 22 (which are being cleaned right now by the trash filewalker) because the garbage collecting took a very long time on those nodes.

Ok. Then this could take a looooooong time until the data finally gets deleted from the node. And from the date name you could never be sure when this would be.

I don’t remember any code that would prevent both types of file walkers from running together. Too tired to check that now, sorry, but I think I would notice a guard like that…


On my nodes I’ve definitely observed gc-filewalker and trash-cleanup-filewalker running at the same time.


For the same satellite and the same date folder?
So a case where GC takes very long and the trash deletion kicks in: the GC would still be moving pieces into the date folder while trash collection tries to delete the pieces and the folder.
I don’t know how that would work.
