Massive reads on startup

jammerdan · May 26, 2023, 3:09am

Even with storage2.piece-scan-on-startup: false we are seeing crazy massive reads on node startup. What is this?

Alexey · May 26, 2023, 5:13am

More like a checking/cleaning trash, but you may use a debug endpoint to see what’s this.
See

jammerdan · May 26, 2023, 7:44am

It’s causing
rMB/s: 7.51, rrqm/s: 1724 and a utilization up to 100% for minutes on every startup.

If that’s required just for cleaning trash, that sounds insane.

Alexey · May 26, 2023, 8:17am

Only trace from the debug endpoint can say, what is exactly it doing.

jammerdan · May 26, 2023, 1:17pm

Is it this one?

[7530092614813486020,3333302239938971669] storj.io/storj/storagenode/pieces.(*TrashChore).Run() (elapsed: 6m54.901722685s, orphaned)
 [1942410985489664106,3333302239938971669] storj.io/storj/storagenode/pieces.(*Store).EmptyTrash() (elapsed: 41.157451163s)
  [5081579554688100975,3333302239938971669] storj.io/storj/storagenode/blobstore/filestore.(*blobStore).EmptyTrash() (elapsed: 41.157451433s)
   [8220748123886537845,3333302239938971669] storj.io/storj/storagenode/blobstore/filestore.(*Dir).EmptyTrash() (elapsed: 41.157452684s)
    [2136544656230198906,3333302239938971669] storj.io/storj/storagenode/blobstore/filestore.(*Dir).walkNamespaceInPath() (elapsed: 41.157453838s)

It looks like something about the trash.
I have just checked the size of the trash folder, it is 175GB on this node?
What exactly is this chore doing on every startup, no matter how recent the last start was?

Alexey · May 27, 2023, 1:58am

I think so. It also doing that every hour:

docker exec -it storagenode2 ./storagenode setup --help | grep "collector.interval"
      --collector.interval duration                              how frequently expired pieces are collected (default 1h0m0s)

jammerdan · May 27, 2023, 2:54am

But what is this process doing? Is it going through every single file in the trash? And for what? And why on every startup even if the last start was just a couple of minutes ago? This does not make any sense.

I haven’t seen the same load like on startup every hour. So I am not sure if this is the same thing.

Alexey · May 27, 2023, 3:02am

It checks last modified time to remove expired pieces and refreshes the cache (DB). I think it also checks all pieces in blobs, registered with expiration date, it’s stored inside pieces as a metadata.

jammerdan · May 27, 2023, 3:30am

Maybe this design needs a rework like the lazy-filewalker.
It does not sound to me like task that is required on every startup nor to run with high or even normal priority.
Also going through every file on startup again and again sounds like strange design. Modification times should not change I guess at least once a file has been moved to the trash. Therefore reading it from a file once should be sufficient.

Alexey · May 27, 2023, 4:44am

And how it supposedly should figure out, which files are needed to remove from the trash without checking the modification time and comparing it with the current date?
If does not check dates, files could be here more than for 7 days, and we likely see complains about the next “bug” - why is it not removed asap when 7 days are passed.

jammerdan · May 27, 2023, 5:53am

Don’t we have an expiration database? I mean the moment we move a file into the trash we know that in +7 days it has to get removed. So this information could be stored in the database?
And then a low priority background task could query the database over the day whenever the node is not very busy and delete pieces.
But in any case, if the day is the relevant factor, checking the dates would be sufficient once a day.

striker43 · May 27, 2023, 6:39am

I would love to be able to set a time as a config parameter at which these processes start every day. Right now it looks like it’s running on startup and then every day again around the container start-time.
That means that on my raspberry pi that is connected to multiple HDD‘s and has multiple nodes, the process usually runs for all storagenodes at the same time, because after a reboot all of them are started together.

Would be really nice to set different times for different nodes on the same machine to avoid overloading my little raspberry for one-two hours every day. Rest of the day he is more or less relaxed.

jammerdan · May 27, 2023, 6:42am

Fixed times are always bad as you really don’t know when the node gets hammered by up- or downloads or whatever is required.
In any case it would be required to define those tasks as low priority so that would not interfere. I think some resource optimization of the node software would be good.

Alexey · May 27, 2023, 7:16am

This database is basically cache and it’s updated when scanned pieces.
You may post it as an idea in the Storage Node feature requests - voting - Storj Community Forum (official) category or as an issue on our GitHub.

Alexey · May 27, 2023, 7:20am

you may start nodes with a delay, see

or write a batch script:

#!/bin/bash
docker compose up -d storagenode1
sleep 3600
docker compose up -d storagenode2
sleep 3600
docker compose up -d storagenode3

However, if your nodes on a separate disks, the parallel execution should not be a big problem.

jammerdan · May 29, 2023, 9:12am

And as these massive reads seem to be repeating I would like to know what is the current interval for the TrashChore as there seems to be no config setting to influence it. So what makes it start apart from node startup and at which times?

jammerdan · May 29, 2023, 2:15pm

I think you are right with that observation. On one node it has just started again causing massive reads:

[8451136184016877735,3844120539144758204] storj.io/storj/storagenode/pieces.(*TrashChore).Run() (elapsed: 24h2m55.26998826s, orphaned)
 [6348737124026641307,3844120539144758204] storj.io/storj/storagenode/pieces.(*Store).EmptyTrash() (elapsed: 2m50.150998187s)
  [7309767258812830858,3844120539144758204] storj.io/storj/storagenode/blobstore/filestore.(*blobStore).EmptyTrash() (elapsed: 2m50.150997772s)
   [8270797393599020410,3844120539144758204] storj.io/storj/storagenode/blobstore/filestore.(*Dir).EmptyTrash() (elapsed: 2m50.150998758s)
    [4948374601870305559,3844120539144758204] storj.io/storj/storagenode/blobstore/filestore.(*Dir).deleteWithStorageFormatInPath() (elapsed: 76.419µs)
    [8455491530434153,3844120539144758204] storj.io/storj/storagenode/blobstore/filestore.(*Dir).walkNamespaceInPath() (elapsed: 2m50.151005695s)

And there is no config.yaml setting for this.

Alexey · May 30, 2023, 3:17am

It’s hardcoded

github.com

storj/storj/blob/4791a6422dd5b5a29f6fb036b73f3717671519aa/storagenode/peer.go#L494


      
          
          
		peer.Storage2.PieceDeleter = pieces.NewDeleter(log.Named("piecedeleter"), peer.Storage2.Store, config.Storage2.DeleteWorkers, config.Storage2.DeleteQueueSize)
          		peer.Services.Add(lifecycle.Item{
          			Name:  "PieceDeleter",
          			Run:   peer.Storage2.PieceDeleter.Run,
          			Close: peer.Storage2.PieceDeleter.Close,
          		})
          
          
		peer.Storage2.TrashChore = pieces.NewTrashChore(
          			log.Named("pieces:trash"),
          			24*time.Hour,   // choreInterval: how often to run the chore
          			7*24*time.Hour, // trashExpiryInterval: when items in the trash should be deleted
          			peer.Storage2.Trust,
          			peer.Storage2.Store,
          		)
          		peer.Services.Add(lifecycle.Item{
          			Name:  "pieces:trash",
          			Run:   peer.Storage2.TrashChore.Run,
          			Close: peer.Storage2.TrashChore.Close,
          		})

jammerdan · May 30, 2023, 4:00am

OMG!

MattJE96011 · May 30, 2023, 4:05am

Something really needs to change with this process or my Storj days might be coming to an end very quickly… which I really don’t want to happen. My server keeps crashing now due to this. Server has been running fine all month but had to reboot for other reasons. Reboot and minutes later the whole system locks up over and over again. Filewalker is off on all of them, yet 17 drives all pegged at 100%. Databases all run on an SSD pool but that literally makes no differene anyway. If I don’t start the nodes it runs fine. With them running CPU is only ~10% and RAM is 33/128 so no issues there. But again, all drives used for Storj are pegged. I don’t have any other tasks accessing the drives, it’s all Storj. There’s no way this is a sustainable if this is how it confinues to work.