Upcoming storage node improvements including benchmark tool

However, NFS is not supported either, so you would likely have another set of problems, not only a disabled cache (and thus sync writes).
Unfortunately it will affect not only SQLite, as mentioned by @Pentium100, but also regular blobs. Sure, it can work until it doesn’t… The locking mechanism is unfortunately not the same as on local filesystems.
They solved this problem only for a Windows host with a Windows guest, using SMB v3, but not for mixed setups.

We are not against network filesystems; they just do not always work, and we are not willing to support them: it is too much work, and we would likely always have issues.
However, sometimes it can work stably, see


Yes, I know, NFS should be fine. But sometimes it isn’t: Topics tagged nfs

Yes, but nobody has shared what “properly” actually means.

I am more interested in the answer to how to configure it properly, not for VMs, but for a storagenode which uses NFS to mount its data.
I guess that localhost NFS should work properly by default, if the server and the client use the same version. This is also true for SMB/CIFS (Windows Hyper-V uses SMB to share drives with the VMs). And with such setups storagenodes should work, even with SQLite databases.
But if the server and the client are on different PCs, it becomes much less compatible, and that is where we saw the issues linked under the tag above.
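
For the localhost case, a minimal sketch of what such an export could look like on Linux (the paths and the NFS version are assumptions, not a tested recommendation):

# /etc/exports: export the node's data directory to localhost only (example path)
/srv/storj  localhost(rw,sync,no_subtree_check)

# reload the export table and mount it back over the loopback interface
sudo exportfs -ra
sudo mount -t nfs -o vers=4.2,hard localhost:/srv/storj /mnt/storj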

However, we do not test network filesystems for storagenode, thus we cannot say that these setups are supported. You may use them, of course, but it usually ends with “it was working before!” once the node grows large enough.

Interesting: at least for me, my node appears to be CPU limited. I noticed that during high traffic the CPU load is almost 400% (all 4 cores used), so I decided to give the VM 8 cores instead.
I also ran the benchmark before and after (with the node stopped this time).
4 cores (4 queues on virtio-scsi controller):

uploaded 100000 62.07 KB pieces in 51.325401008s (115.33 MiB/s, 1948.35 pieces/s)
collected 100000 pieces in 9.184188323s (644.51 MiB/s)

8 cores just after a VM reboot (also 8 queues):

uploaded 100000 62.07 KB pieces in 34.251325496s (172.82 MiB/s, 2919.60 pieces/s)
collected 100000 pieces in 9.604400021s (616.31 MiB/s)

I did not run the test on the host this time; I will probably run it when I need to expand the node.


Perhaps it could be related to how your hypervisor presents these disks to the VMs.
Do you use NFS too?

No, I use a zvol and attach it to the VM as a block device (virtio-scsi).
Then again, the X5687 is not a super fast CPU.
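
For reference, a rough sketch of that kind of setup with ZFS and libvirt (the pool, volume name, size and target are made up; the exact attach step depends on the hypervisor):

# create a zvol to use as the VM's data disk
zfs create -V 2T -o volblocksize=16K tank/storj-node

# attach it to the guest as a SCSI disk; the VM needs a virtio-scsi controller
virsh attach-disk storj-vm /dev/zvol/tank/storj-node sdb --targetbus scsi --persistent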

One annoying thing I noticed is that after rebooting the VM (so its cache is empty) and starting the node, the filewalker initially creates a high IO load that may even cause the node to restart. I updated my script to limit the filewalker process even more, so I guess I’ll see what happens the next time.
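
For reference, this is the kind of thing such a script can do (a sketch only; the process name pattern and priorities are assumptions and depend on whether the lazy filewalker runs as a separate process on your node):

#!/bin/sh
# lower the CPU and IO priority of any filewalker subprocesses
for pid in $(pgrep -f 'storagenode.*filewalker'); do
    renice -n 19 -p "$pid"   # lowest CPU priority
    ionice -c 3 -p "$pid"    # idle IO class: only touch the disk when it is otherwise idle
done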

If you’re fine with having your node offline for a bit, on ext4 I found forcing an fsck an effective way to load up all metadata into kernel caches. Effective as in, quite a bit faster than a du or a file walker. This is because fsck on ext4 can do sequential reads when loading up inodes, as opposed to cherry-picking the ones that happen to be in a given subdirectory.
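
If anyone wants to try that, a sketch of the sequence (device and mount point are assumptions; the filesystem must be unmounted, or mounted read-only, for fsck to run safely):

# stop the node and unmount the data disk first
sudo umount /mnt/storj
# force a full check; this reads the inode tables sequentially and warms the page cache
sudo e2fsck -f /dev/sdb1
# remount and start the node again
sudo mount /dev/sdb1 /mnt/storj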

This is not true.

NFS: Storj developers do not care. Maybe it can be fast enough, maybe it can’t, maybe it’s reliable, maybe it will lose data, but SNOs have no leverage at all and you are on your own.

ext4/NTFS: Storj developers claim this should work, so if it doesn’t, this is a hint they will look into fixing performance/durability issues.

This is what supported means. It doesn’t really have anything to do with actual reliability or performance. Even if NFS were always 100× faster and more durable, stating it’s not supported means Storj does not care about it.

And even in this hypothetical case where NFS were always 100× faster and more durable, Storj stating they don’t support it may still make sense. It is one more configuration to test against, on top of the many others that are already supported, and a configuration that is not as popular, making the effort spent testing against it less valuable. So it’s more like a game developer deciding not to support Linux/macOS despite these OSes being better by some metrics: most gamers are on Windows.


This is an official recommendation by Storj:

Please consider backing up your data and reformat the disk to a native filesystem for your OS (NTFS for Windows or ext4 for Linux) and then perform a restore to place your data back on the drive.

(Step 1. Understand Prerequisites - Storj Docs)
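
A minimal sketch of that back-up / reformat / restore cycle on Linux (device names, mount points and the use of rsync are assumptions; stop the node first):

# 1. back up the node data to another disk
rsync -aHAX /mnt/storj/ /mnt/backup/storj/
# 2. reformat the drive with a native filesystem (this destroys everything on it)
umount /mnt/storj
mkfs.ext4 -m 0 /dev/sdb1
mount /dev/sdb1 /mnt/storj
# 3. restore the data and start the node again
rsync -aHAX /mnt/backup/storj/ /mnt/storj/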

The prerequisites are part of the node T&C via this clause:

  • 4.1.4.2.5. meet all performance requirements referred to in this Agreement, as well as any performance requirements set forth in the Documentation or in other instructions made available with the Storage Node Software or otherwise hereunder;

I just found something interesting (I do not know if I should post it to this thread or the one about test data, but this is about performance, so…).

I upgraded my node VM to Debian 12 (it was Debian 10). Apparently something changed in the newer version, because the VM started experiencing load spikes every 20 minutes, causing the incoming traffic to almost stop.

[graph: incoming traffic with 20-minute load spikes; note the log scale]

I tried a bunch of things, like reducing the queue or core count, but it did not help. The load would shoot up for a short time with not much indicating what was causing it.
And then I found it (I think): apparently Debian 12 handles async writes differently. I guess it accumulates a bunch of dirty data and then tries to write it all at the same time, causing the load spike.

I restarted my node, adding --filestore.force-sync=true to the command line, and this fixed the problem:

[graph: the same metric after the change]

No more load spikes. I went back to 8 queues/8 cores at 13:00 and my node seems to work well, even getting slightly more traffic (I have no idea if this is the max or if my node is still limited).
I probably should change the pool to mirrors, but cannot do so at the moment.
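
For a docker-based node, the same filestore.force-sync option can be passed as a run argument after the image name or set in config.yaml (a sketch; keep in mind it turns every piece write into a synced write):

# as a run-time flag appended after the image name
docker run -d ... storjlabs/storagenode:latest --filestore.force-sync=true

# or the equivalent line in config.yaml
filestore.force-sync: true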


Now I finally found the time to do that.
Virtual disk (the file is located on the Y: disk):

PS e:\> ~\piecestore-benchmark-4.exe -pieces-to-upload 100000
uploaded 100000 pieces in 11m28.9337046s (8.59 MiB/s)
collected 100000 pieces in 6m7.3411956s (16.11 MiB/s)

Physical disk Y:

PS Y:\benchmark> ~\piecestore-benchmark-4.exe -pieces-to-upload 100000
uploaded 100000 pieces in 17m46.8140263s (5.55 MiB/s)
collected 100000 pieces in 7m3.3476814s (13.98 MiB/s)

I’m trying to run it on Ubuntu 22 LTS; any ideas what I am doing wrong here?

~/go/bin/piecestore-benchmark -pieces-to-upload 1000
panic: main.go:76: migrate: database: info opening file "storage/info.db" failed: database is locked

goroutine 1 [running]:
github.com/dsnet/try.e({0x1be2220?, 0xc000300060?})
        /home/storj/go/pkg/mod/github.com/dsnet/try@v0.0.3/try.go:206 +0x65
github.com/dsnet/try.E(...)
        /home/storj/go/pkg/mod/github.com/dsnet/try@v0.0.3/try.go:212
main.CreateEndpoint({0x1c07d80, 0x2abd860}, 0xc000119da0, 0xc00023c0c0)
        /home/storj/storj/cmd/tools/piecestore-benchmark/main.go:76 +0x20f
main.main()
        /home/storj/storj/cmd/tools/piecestore-benchmark/main.go:199 +0x2a7

And if I run it again:

panic: main.go:75: open storage/trash/.trash-uses-day-dirs-indicator: file exists

goroutine 1 [running]:
github.com/dsnet/try.e({0x1be0880?, 0xc000186540?})
        /home/storj/go/pkg/mod/github.com/dsnet/try@v0.0.3/try.go:206 +0x65
github.com/dsnet/try.E1[...](...)
        /home/storj/go/pkg/mod/github.com/dsnet/try@v0.0.3/try.go:220
main.CreateEndpoint({0x1c07d80, 0x2abd860}, 0xc000113ce0, 0xc00023a000)
        /home/storj/storj/cmd/tools/piecestore-benchmark/main.go:75 +0x1e8
main.main()
        /home/storj/storj/cmd/tools/piecestore-benchmark/main.go:199 +0x2a7

You need to remove that file

I’m back to error 1 when I remove storage/*

Edit: although my storage on this test array is very busy with other tasks right now; I’ll try again later when it’s quiet.

You also need to delete all the databases in the benchmark folder to allow the tool to recreate them.
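
Something like this, run from the benchmark working directory, should give the tool a clean slate (the file names are the ones from the errors above; make sure you are in the benchmark folder, not in a real node’s data directory):

# remove the generated piece storage and the SQLite databases so the tool recreates them
rm -rf storage/
rm -f *.db *.db-wal *.db-shm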

Are the instructions in post 1 still valid? I’m having issues with the go install %var%:

downloading github.com/census-instrumentation/opencensus-proto v0.4.1
# storj.io/storj/shared/dbutil/sqliteutil
shared/dbutil/sqliteutil/db.go:86:28: undefined: sqlite3.Error
shared/dbutil/sqliteutil/db.go:87:25: undefined: sqlite3.ErrConstraint
shared/dbutil/sqliteutil/migrator.go:104:24: destDB.Backup undefined (type *sqlite3.SQLiteConn has no field or method Backup)
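
(For what it’s worth, undefined sqlite3.Error / missing Backup symbols like these usually mean the go-sqlite3 driver was compiled without cgo; a guess at a fix, assuming a C toolchain is installed:)

# go-sqlite3 needs cgo, so make sure it is enabled when building the tools
git clone https://github.com/storj/storj.git && cd storj
CGO_ENABLED=1 go install ./cmd/tools/piecestore-benchmark ./cmd/tools/filewalker-benchmark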

I used this Dockerfile to build them for my nodes:

# build stage: clone the storj repository and compile both benchmark tools
FROM golang as build
RUN git clone https://github.com/storj/storj.git && \
    cd storj && \
    go install ./cmd/tools/piecestore-benchmark && \
    go install ./cmd/tools/filewalker-benchmark

# runtime stage: copy only the compiled binaries into a plain ubuntu image
FROM ubuntu
WORKDIR /benchmark
COPY --from=build go/bin/piecestore-benchmark /usr/bin/
COPY --from=build go/bin/filewalker-benchmark /usr/bin/

Then build it:

docker build . -t storj-benchmarks

Now you may test it:

docker run -it --rm -v x:\benchmark\:/benchmark storj-benchmarks piecestore-benchmark
docker run -it --rm -v x:\benchmark\:/benchmark storj-benchmarks filewalker-benchmark

You can see additional options with the -h argument.

Example:

docker run -it --rm -v x:\benchmark\:/benchmark storj-benchmarks piecestore-benchmark
uploaded 10000 62.07 KB pieces in 6m1.0884103s (1.64 MiB/s, 27.69 pieces/s)
downloaded 10000 62.07 KB pieces in 16.2602328s (36.40 MiB/s, 615.00 pieces/s)
collected 10000 pieces in 1m38.1718544s (6.03 MiB/s)

I was thinking about doing this myself next. Saved me some time, thank you!


We got a built-in benchmark tool in 113?
storagenode-benchmark
Can we also get any instructions on how to use it, or is it still a pre-release version? :)

I did notice it too and am trying to figure out how to run it.
What I have figured out so far: you can request help with the -h option and see that it has a required command parameter:

> docker run -it --rm -v x:\benchmark\:/benchmark storj-benchmarks storagenode-benchmark -h
Usage:
  /usr/bin/storagenode-benchmark [command]

Available Commands:
  benchmark   upload/download benchmark against storage node
  completion  Generate the autocompletion script for the specified shell
  help        Help about any command
  satellite   starts fake satellite
  version     output the version's build information, if any

But there is no explanation of how to run a fake satellite (and it is needed), or how to provide the address of that satellite to the benchmark subcommand. Also, both require their own identity.

Unfortunately the description of the commit didn’t help much. I will try to find the PR in Gerrit; maybe there are comments with examples.

Found it: https://review.dev.storj.io/c/storj/storj/+/13105
