Tuning the filewalker

The storage node itself doesn’t know whether it was started after an update, a reboot, a system crash, or manually. Don’t read too much into that, though: my strength is more the code that is already written than the code change needed for this. It might still be a relatively short code change.

Same for the ionice level. It can be set in golang, but I haven’t seen it in our code, so I am unable to provide any code examples. It might also be a relatively short code change.


thank you.

yeah, but what if the default was to have the filewalker turned off, and a storagenode software update turned it back on temporarily, so that it ran once and then wouldn’t run again on the next node startup?

then the filewalker would end up running staggered, like the updates. i know it’s a bit of a patchwork method, but that’s to limit the complexity of the solution and make it easy to work with.

and like i stated before there are some issues. another one i just realized: if the filewalker was stopped for whatever reason, then it won’t have finished… which might be why it currently always runs on startup,

to ensure that it never ends up in a state where it’s unaware of the used capacity.
but that’s ofc just me guessing.

stuff gets complicated so quickly and often there are reasons things are done a certain way to begin with…

i’m sure if there was an easy solution it would have been done already… or most likely…
it’s not always easy to see the forest for the trees

The L2ARC seems to help a lot. GC runtime is down to a few minutes. I think the main difference is that the L2ARC survives a reboot.


yeah the L2ARC is pretty amazing for all kinds of things. mine serves on avg 5-10% of the ARC IO,
which would otherwise have had to come from HDD. that isn’t an insignificant amount… it’s comparable to 1/5 or 1/10 of my total HDD IO during sequential reads.

and what the L2ARC mostly deals with is random IO. there are also a lot of settings one can tune for the L2ARC, but i generally just keep it at the defaults.

i found that changing the logbias from latency (default) to throughput also has amazing results.
set it with
zfs set logbias=throughput poolname

display the current setting with
zfs get logbias poolname

and returning to the default
zfs set logbias=latency poolname

i have seen near 10x better performance in some cases. ofc it does increase the latency of the pool, but really, for 10x more throughput… running it on latency just makes it so much slower that everything takes 10x the time, and then more work just piles up…
so i’d rather take the latency penalty and run max throughput.

haven’t really looked at the changes in the latency from doing this tho…
seems to work amazing, but only been using it for maybe 3 months… so i might still run into issues in the future… sometimes it takes a while to find the downsides.

also keep in mind these are rough numbers and i haven’t investigated it well…
i think when i saw the 10x results it was on a windows vm, so i can’t say if it’s 10x for storj data… however i initially started using it on storj, which was why i ended up trying it on the windows vm disk.
It has become my de facto default setting for all zfs pools, so far with no noticeable ill effects.

oh yeah and remind me again… what is GC runtime?
think you might have told me before, but i can’t remember what it is.


It’s garbage collection


The pull request has been merged!

In the next release we will be able to control the initial filewalker process with:

storage2.piece-scan-on-startup: true|false

in the config!


Sorry for the extra work with getting the pull request merged. We are currently changing our build system and the unit tests are not running as stably as they used to.


No problem at all! Most of my issues were that I’m not really using Golang on a day-to-day basis (I’m an Elixir programmer :joy: ) but in general it was a nice experience :slight_smile:


Really nice, thanks a lot!
What would be the syntax to set this in the docker run command?


Add -- before the option and put the value after it. All options passed as arguments must follow the image name in the docker run command, e.g.

docker run -d ... storjlabs/storagenode:latest --storage2.piece-scan-on-startup=false

Is there a way of implementing this in docker compose such that it can be decided each time the command is run? Like
docker-compose up -d storjnode --storage2.piece-scan-on-startup=false
and
docker-compose up -d storjnode --storage2.piece-scan-on-startup=true
(I know those commands won’t work)
Or would I have to use docker run?

You need to modify your docker-compose.yaml

services:
  storagenode:
...
    command:
      - "--storage2.piece-scan-on-startup=true"
...

then run docker-compose up -d
For false it is the same: modify docker-compose.yaml and run docker-compose up -d

This way you can also always check which mode the node was started in: just look at your docker-compose.yaml file.


dang. I was hoping I wouldn’t have to change the compose file each time. Thanks!

scripts can be powerful tools, docker compose is basically just something similar, to my understanding anyways.

if it can’t do what you want it to, i’m sure you could make some sort of bash script or such to do the docker run command instead.

I have quite a few, but I was just hoping I could have my config in one place so if I modify it I don’t have to change multiple places. I’ll look into using docker run in conjunction with compose. My compose is rather intricate. My goal is to be able to manually start it without the filewalker (if my server needs to restart), then have a script or cron job restart it later so that the filewalker runs. And have the filewalker be the only difference.

You may use docker-compose run too. However, it will override the command from your docker-compose file, and port mappings will be ignored unless you specify the --service-ports option. See the docker compose run page in the Docker documentation.


I’m diving into this and I think it will work!!! If you have multiple nodes you want to start all together and then have the filewalkers run staggered, you can totally do it with this. Thank you!
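
The idea described in this post (start every node without the startup scan, then restart them one at a time with the scan enabled) could be scripted roughly as follows. This is only a sketch under assumptions: the service names node1 and node2, the one-hour gap, and the run, start_node and stagger_scans helpers are made up for illustration, and each service is assumed to be defined in docker-compose.yaml without its own piece-scan option. By default the script only prints the commands it would run (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch: staggered filewalker runs using docker-compose run.
# Service names, timings and helper names are hypothetical.

DRY_RUN="${DRY_RUN:-1}"                     # 1 = print commands only
STAGGER_SECONDS="${STAGGER_SECONDS:-3600}"  # pause between nodes

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start one compose service as a detached one-off container.
# $1 = service name in docker-compose.yaml, $2 = true|false
start_node() {
    run docker-compose run -d --service-ports --name "$1" "$1" \
        "--storage2.piece-scan-on-startup=$2"
}

# Restart each listed node with the scan enabled, one after another.
stagger_scans() {
    for node in "$@"; do
        run docker stop -t 300 "$node"
        run docker rm "$node"
        start_node "$node" true
        run sleep "$STAGGER_SECONDS"
    done
}
```

After a reboot you would call start_node node1 false for each node, and later have a script or cron job call stagger_scans node1 node2 with DRY_RUN=0 set. Check the printed commands first. Note that a container started with docker-compose run is a one-off container, so options such as the restart policy may not behave the same as with docker-compose up.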


Which version is it in?

INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "storage2.piece-scan-on-startup"}
INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "storage.allocated-disk-space"}

INFO	Running on version	{"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.62.3"}
INFO	Downloading versions.	{"Process": "storagenode-updater", "Server Address": "https://version.storj.io"}
INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode", "Version": "v1.62.3"}


It is in 1.62.3, which is currently being rolled out.

However, those messages are from the storagenode-updater. They will always be there because it uses different configuration keys, and it is totally normal to see them in the logs.
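
To tell the harmless updater messages apart from a real problem, the log could be filtered along these lines. This is a rough sketch: the unexpected_invalid_keys helper is a made-up name, and it assumes that lines from other processes carry a "Process" field formatted like the updater lines quoted above.

```shell
#!/bin/sh
# Read node logs on stdin and keep only "Invalid configuration file key"
# lines that do NOT come from the storagenode-updater process.
# (Helper name is hypothetical; log format assumed from the lines above.)
unexpected_invalid_keys() {
    grep 'Invalid configuration file key' \
        | grep -v '"Process": "storagenode-updater"'
}
```

Usage would be something like docker logs storagenode 2>&1 | unexpected_invalid_keys (replace storagenode with your container name). If nothing is printed, the only complaints came from the updater.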


Hi everyone,
I’m having trouble disabling the filewalker, my run command looks like this.

sudo docker run -d --restart unless-stopped --stop-timeout 300 \
-e WALLET="***"  -e EMAIL="***"  -e ADDRESS="***" \
-e STORAGE=1700Gb \
--mount type=bind,source="/mnt/hdd2/storj0/identity/storagenode",destination=/app/identity \
--mount type=bind,source="/mnt/hdd2/storj0",destination=/app/config \
--mount type=bind,source="/home/storj/databases/storj0",destination=/app/dbs \
--name storagenode0 storjlabs/storagenode:latest \
--storage2.piece-scan-on-startup=false \
--operator.wallet-features=zksync

Can anyone see any obvious problems? Or maybe the IOWAIT is caused by something other than the filewalker.
Here’s a look at my IOWAIT when the nodes restart…

Thanks a lot !
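
One quick thing to rule out is whether the option actually reached the container. A possible check, as a sketch only: the scan_flag_value helper is a made-up name, and it simply looks for the flag in the command arguments that Docker recorded for the container.

```shell
#!/bin/sh
# Read a container's command arguments on stdin and print the value given
# to --storage2.piece-scan-on-startup, or "unset" if the flag is absent.
# (Helper name is hypothetical.)
scan_flag_value() {
    value=$(sed -n 's/.*--storage2\.piece-scan-on-startup=\([a-z]*\).*/\1/p' | head -n 1)
    echo "${value:-unset}"
}
```

For the container above this could be run as sudo docker inspect --format '{{json .Config.Cmd}}' storagenode0 | scan_flag_value. If it prints false, the flag was passed, and the remaining questions are whether the running version is one that supports the option and whether something else is causing the IOWAIT.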