I believe the team won’t estimate this improvement until there is an issue for it on GitHub. Or, maybe even better, the Community can submit a PR with the needed change; it shouldn’t be too complicated, I think.
Docker Desktop with the WSL2 engine works much better than with a Hyper-V VM.
You may also consider using Podman.
However, Docker runs inside a Linux VM anyway, either Hyper-V or WSL2.
This might be a stupid question… Instead of having a copy of the file metadata in a key/value store, wouldn’t it be possible to keep everything in such a store?
Wasn’t that part of the problem with the old V2 network? Sticking too much filesystem-like data into database-like structures?
We had this in the past; we stored data in KFS.
And we are investigating the possibility of storing pieces in alternative systems:
I’ve changed over to badger cache. No problems so far
Same - except I had to disable lazy filewalker to get past a startup error. But I could swear I read somewhere in the forums that they could both be on.
That would be wonderful news, but you probably misread…
I’ve changed over to badger cache. No problems so far
What platform (docker/windows/linux/etc.)? Is the new cache located on the same drive as the data?
linux/docker
du --si -s /node?/node?/storage/filestatcache/
966M /node1/node1/storage/filestatcache/
942M /node2/node2/storage/filestatcache/
706M /node3/node3/storage/filestatcache/
210M /node4/node4/storage/filestatcache/
173M /node5/node5/storage/filestatcache/
151M /node6/node6/storage/filestatcache/
df -H --total /node*
Filesystem Size Used Avail Use% Mounted on
/dev/sdb 18T 2.8T 16T 16% /node1
/dev/sdc 18T 3.5T 15T 19% /node2
/dev/sdd 18T 2.3T 16T 13% /node3
/dev/sde 18T 1.7T 17T 10% /node4
/dev/sdf 18T 1.5T 17T 9% /node5
/dev/sdg 18T 1.5T 17T 8% /node6
total 108T 13T 95T 13% -
I just restarted node6 and it looks like the dashboard updated in less than 10 minutes
So if I read that right, the disk with 3.5 TB used has a badger cache of almost 1 GB?
And you are storing the badger cache on the same spinner disk as the data?
Would you see any reason to try to migrate the DB to an SSD?
would you see any reason to try to migrate the DB to an SSD
Not yet, but my nodes are only small
Here is a possible problem:
docker exec -it storagenode1 /app/storagenode issue-apikey --config-dir config --identity-dir identity
Error: Error starting master database on storage node: Cannot acquire directory lock on “config/storage/filestatcache”. Another process is using this Badger database. error: resource temporarily unavailable
I had this error when I wanted to start my 2nd node with badger… the cache was placed on the same SSD but in a different folder… couldn’t solve it yet.
It’s not solvable by moving the badger cache to the SSD. This is a fundamental limitation of the current implementation of the badger cache: only one process can use it at the same time.
@andrew2.hart
When you issue the command
docker exec -it storagenode1 /app/storagenode issue-apikey --config-dir config --identity-dir identity
it starts a second process, which also wants an exclusive lock on the badger cache…
The workaround could be to stop the node and then issue this command; however, I do not know how to do that when the container is stopped. It seems to be possible only with a binary installation.
The other way is to disable the badger cache and restart the node, then issue the command, then re-enable the badger cache and restart the node again (a rough sketch of these steps follows).
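A minimal sketch of that second workaround on docker, assuming the cache is toggled with the pieces.file-stat-cache run-time flag (please double-check the option name against your node version) and that the “…” stands for your usual docker run options:

# 1. Recreate the node without the badger cache
docker stop -t 300 storagenode1
docker rm storagenode1
docker run -d --name storagenode1 … storjlabs/storagenode:latest   # without --pieces.file-stat-cache=badger

# 2. Issue the API key while no process holds the badger lock
docker exec -it storagenode1 /app/storagenode issue-apikey --config-dir config --identity-dir identity

# 3. Recreate the node with the badger cache enabled again
docker stop -t 300 storagenode1
docker rm storagenode1
docker run -d --name storagenode1 … storjlabs/storagenode:latest --pieces.file-stat-cache=badger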
P.S. I enabled the badger cache on my docker node and I can issue the API key either by issue-apikey or by info.
The badger cache is not compatible with the lazy file walker (and it doesn’t need the lazy file walker anyway…).
In terms of “is it ready or not?”: I would say it’s not yet. I am testing it right now (with 100M pieces), but testing walkers with 100M segments is very slow…
But the risk is very low if you would like to test it as well (but please, turn off the lazy file walker…).
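If you want to try it, here is a minimal sketch of how the relevant options might be passed on docker, assuming the option names pieces.file-stat-cache and pieces.enable-lazy-filewalker (please verify both against the config.yaml generated by your node version); the “…” stands for your usual mounts, ports and environment options:

# enable the badger cache and turn off the lazy file walker (not compatible with badger)
docker run -d --name storagenode1 … storjlabs/storagenode:latest \
    --pieces.file-stat-cache=badger \
    --pieces.enable-lazy-filewalker=false

For a binary installation, the same two options can instead be set in config.yaml.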
My test results do not look like an improvement. My test setup is a bit different: I am avoiding running the used-space file walker and am testing the performance of the TTL cleanups and garbage collection. Technically I should also test trash cleanup, but I haven’t captured it yet.
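For context, the numbers below are monkit function statistics. On a node with the debug endpoint enabled (debug.addr set, e.g. to 127.0.0.1:5999; the address here is only an example), similar output can usually be pulled like this:

# dump per-function timings and filter for the piece-delete path
curl -s http://127.0.0.1:5999/mon/funcs | grep -A 30 DeleteSkipV0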
Badger cache enabled:
[8026632812200767755] storj.io/storj/storagenode/pieces.(*Store).DeleteSkipV0
parents: 1985439348781840182
current: 0, highwater: 1, success: 45375, errors: 9342, panics: 0
error pieces error: 9342
success times:
0.00: 223.538µs
0.10: 6.847638ms
0.25: 12.785451ms
0.50: 22.782667ms
0.75: 34.343981ms
0.90: 49.302596ms
0.95: 53.812656ms
1.00: 223.803824ms
avg: 27.04915ms
ravg: 27.955116ms
recent: 10.36715ms
sum: 20m27.355204871s
failure times:
0.00: 1.422654ms
0.10: 6.568926ms
0.25: 11.154435ms
0.50: 17.452462ms
0.75: 24.839362ms
0.90: 27.734887ms
0.95: 34.560739ms
1.00: 59.615096ms
avg: 19.947786ms
ravg: 18.057496ms
recent: 20.685688ms
sum: 3m6.352222938s
[1510541944505505774] storj.io/storj/storagenode/retain.(*Service).trash
parents: 550368326432921046
current: 0, highwater: 1, success: 476649, errors: 0, panics: 0
success times:
0.00: 54.185µs
0.10: 142.279µs
0.25: 7.411701ms
0.50: 11.55822ms
0.75: 15.334614ms
0.90: 28.058159ms
0.95: 31.842056ms
1.00: 72.302272ms
avg: 13.234991ms
ravg: 12.960566ms
recent: 1.187403ms
sum: 1h45m8.445512217s
failure times:
0.00: 0s
0.10: 0s
0.25: 0s
0.50: 0s
0.75: 0s
0.90: 0s
0.95: 0s
1.00: 0s
avg: 0s
ravg: 0s
recent: 0s
sum: 0s
Badger cache disabled:
[4473966134940083155] storj.io/storj/storagenode/pieces.(*Store).DeleteSkipV0
parents: 8702463535031354877
current: 0, highwater: 1, success: 16244, errors: 32398, panics: 0
error pieces error: 32398
success times:
0.00: 93.314µs
0.10: 5.066005ms
0.25: 7.742888ms
0.50: 14.198097ms
0.75: 21.362198ms
0.90: 28.923302ms
0.95: 31.074866ms
1.00: 33.835768ms
avg: 23.519405ms
ravg: 15.325673ms
recent: 719.057µs
sum: 6m22.049219392s
failure times:
0.00: 491.223µs
0.10: 11.604046ms
0.25: 13.840483ms
0.50: 17.9587ms
0.75: 22.726396ms
0.90: 34.498565ms
0.95: 48.014309ms
1.00: 79.911616ms
avg: 20.512728ms
ravg: 21.45098ms
recent: 11.115074ms
sum: 11m4.571387332s
[7355981312936367727] storj.io/storj/storagenode/retain.(*Service).trash
parents: 3266766714001306064
current: 0, highwater: 1, success: 432196, errors: 0, panics: 0
success times:
0.00: 39.462µs
0.10: 64.206µs
0.25: 72.022µs
0.50: 87.249µs
0.75: 107.401µs
0.90: 178.037µs
0.95: 193.334µs
1.00: 750.4µs
avg: 249.035µs
ravg: 113.07µs
recent: 98.999µs
sum: 1m47.632314463s
failure times:
0.00: 0s
0.10: 0s
0.25: 0s
0.50: 0s
0.75: 0s
0.90: 0s
0.95: 0s
1.00: 0s
avg: 0s
ravg: 0s
recent: 0s
sum: 0s
You might notice that garbage collection is a lot faster with the cache disabled. I was forced to update my storage node between the two runs, so what you see there are general performance improvements from the update and not an effect of the badger cache.
I forgot one: there is an improvement in a different subfunction of garbage collection.
Badger cache enabled:
[8500868933913357560] storj.io/storj/storagenode/pieces.storedPieceAccess.ModTime
parents: 7713317249650817796
current: 0, highwater: 1, success: 2332913, errors: 1, panics: 0
error System Error: 1
success times:
0.00: 14.13µs
0.10: 23.702µs
0.25: 39.926µs
0.50: 70.009µs
0.75: 8.438554ms
0.90: 12.517875ms
0.95: 13.634868ms
1.00: 26.931204ms
avg: 3.628026ms
ravg: 3.946927ms
recent: 90.463µs
sum: 2h21m3.870431453s
failure times:
0.00: 56.314µs
0.10: 56.314µs
0.25: 56.314µs
0.50: 56.314µs
0.75: 56.314µs
0.90: 56.314µs
0.95: 56.314µs
1.00: 56.314µs
avg: 56.314µs
ravg: 56.314µs
recent: 56.314µs
sum: 56.314µs
Badger cache disabled:
[3509109613512306876] storj.io/storj/storagenode/pieces.storedPieceAccess.ModTime
parents: 689069401672188854
current: 0, highwater: 1, success: 2206023, errors: 0, panics: 0
success times:
0.00: 10.074µs
0.10: 12.127µs
0.25: 15.541µs
0.50: 33.018µs
0.75: 54.402µs
0.90: 11.865786ms
0.95: 13.124615ms
1.00: 25.862456ms
avg: 7.05084ms
ravg: 2.504681ms
recent: 10.296µs
sum: 4h19m14.317100538s
failure times:
0.00: 0s
0.10: 0s
0.25: 0s
0.50: 0s
0.75: 0s
0.90: 0s
0.95: 0s
1.00: 0s
avg: 0s
ravg: 0s
recent: 0s
sum: 0s
Please consider the possibility of moving the cache to an SSD, i.e. making its path configurable. It would speed up the cache even on the first run: the node would read pieces from the data drive and write the cache to the SSD, so the data drive needs fewer reads and writes. Reading the cache from an SSD is also much faster.
If you are on docker, you can remap the cache folder to an SSD, for example with an extra bind mount like the sketch below.
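A minimal sketch with hypothetical host paths (/mnt/hdd/storagenode1 for the data and identity, /mnt/ssd/filestatcache on the SSD); the extra filestatcache bind mount is the only change to the usual docker run command, and the “…” stands for your usual ports and environment options:

docker run -d --name storagenode1 \
    --mount type=bind,source=/mnt/hdd/storagenode1,destination=/app/config \
    --mount type=bind,source=/mnt/hdd/storagenode1/identity,destination=/app/identity \
    --mount type=bind,source=/mnt/ssd/filestatcache,destination=/app/config/storage/filestatcache \
    … storjlabs/storagenode:latest

The SSD directory has to exist before the container starts, and it should be dedicated to a single node, since only one process can use a badger database at a time.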