Update Process (Filewalker) + When to update

I think you are on your own there because you are running an unsupported configuration, but there may be some other ninjas who run it like you.

You should update to 1.16.1; it has rolled out to all the Windows nodes and now to Docker too.

Thanks for your answer!

I know that network drives are “unsupported”, but my question is more general.

What does the file walker do? Is it running after each and every update? Is it running in between updates?

What about this command? Why is it returning the actual version and not the minimum?

OP’s mistake was admitting they’re not following the “official” procedure. Now we won’t get any answer for the actual question. Instead, we’ll get ten answers telling you how bad it is to not use the official updater.

Sorry for your loss, HeroHann.

I still have hope ;D

I think it only checks the metadata or presence of files, because on my node it usually only accesses the cache.

I also update manually, but I do it whenever there is a new version available. I think it is OK to “stress the system”: better to find out quickly that a drive is failing than to wait until multiple drives have failed. I also run zpool scrub once a month (it is enabled by default and I have not disabled it). Again, better to find out that a drive is dying sooner rather than later.

Thanks for your reply. Have you ever monitored the reads and writes on the drives after an update? Does the file walker have a big impact?

I usually wait for a week or so after a new version is available due to new bugs in new versions.

Can you tell me anything about this curl command?

I’ve been after some more information on this too, as my drive seems ‘busy’ for a week after an update or storagenode service restart.

drive.io.

The fact it’s not actually reading or writing much suggests it is checking metadata or file presence like @Pentium100 suggests.

The curl command just reads in the json data from the https://version.storj.io response. Try it in a browser.
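For anyone who would rather script this than use curl or a browser, here is a minimal Python sketch of the same idea. The helper names `fetch_json` and `lookup` are made up for illustration, and the shape of the live response is an assumption: the only field path taken from this thread is `processes.storagenode.minimum.version`.

```python
import json
from urllib.request import urlopen

# URL quoted in the post above.
VERSION_URL = "https://version.storj.io"


def fetch_json(url=VERSION_URL, timeout=10):
    """Download the document at `url` and decode it as JSON (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def lookup(document, dotted_path):
    """Follow a dotted path such as 'a.b.c' through nested dictionaries.

    Raises KeyError if any part of the path is missing, which is a useful
    signal that the response does not have the shape we assumed.
    """
    node = document
    for key in dotted_path.split("."):
        node = node[key]
    return node
```

Something like `lookup(fetch_json(), "processes.storagenode.minimum.version")` would then read the minimum version, assuming the live response really is nested that way.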

Thanks to you!

Yeah it looks like the file walker is not “doing” much but the drive is at 100%. Metadata as you said. Will try with my network drives.

About the command: then I don’t get what it is supposed to tell me. I thought it would tell me the oldest supported version, so I know when to update. But instead it tells me the newest version.

Where do I find out which is the oldest supported version, i.e. the point at which I really HAVE TO update?

I don’t have the official answer but it wouldn’t surprise me if 1.16.1 is now the latest and only supported version due to the recent order errors and fixes from the previous 1.15.3 version.

@Alexey might have the official answer.

When processes.storagenode.minimum.version == latest version, you should update; else, you know the PS part of your post.

Maybe I don’t get it or I am just dumb.

When do I have to update?
I am running 1.15.3 now and 1.16.1 is out (some people are having problems).
When is my time to absolutely do the switch?

PS part?

Since you are not relying on automatic updates, you should update when the rollout is at 100%, i.e. when all nodes (GUI/docker/others) can update.

Post Script.

Yeah, I SHOULD update, but as you might have seen in the forum some people are running into problems. So with everything I own, I wait some days or weeks to update if it is not absolutely necessary. That’s why I am asking not when I should but rather when I MUST update.

yeah same here… delayed my update just to wait and see if more problems start flooding the forum.

within the next month or so… i think the rule is 3 versions behind… i forget if that’s called major or minor… so on 1.15.3 the latest you should update would be 1.17.x… maybe 1.18.x, i suppose that depends on how the 3 version difference is counted… i mean 1.15.3, 1.16.1, 1.17.x would be a range of 3 versions from when it was updated, making 1.18.x when one would get suspended… while if one says 1.15.3 + 0.3.0 then it would be 1.18.x that would be the last and 1.19.x that would get one suspended…

but i would personally try to stay close to the most recent rather than lagging behind…

haven’t really looked into all that because i don’t plan on getting that far behind.

if there isn’t any major issues for people that keep popping up on the forum i expect to update in the next few days.

and was it a suspension tho… or was it a DQ i forget… like i said, you don’t want to test that stuff really, especially if the punishment was DQ… i remember getting a bit insulted by it, so may have been DQ

but not really relevant because i don’t plan on finding out where that limit is… even if i may skip 1 version if it turns out to be problematic… ofc skipping one update makes one’s node slightly different from the norm and might in itself cause problems long term… in theory at least… but who knows…

the filewalker has so far as i’m aware been running on every reboot of a node… not sure if that got changed, i have complained a bit about it.
the filewalker is very iops dependent, thus the slower your storage solution iops the longer it will take.
my 14.5tb node takes a few hours usually, so it’s really annoying to troubleshoot…
but i got large scale caching and high iops so not too bad… but i have done testing running on 1x hdd iops, 2x hdd worth of iops and 3x hdd worth of iops and it basically scales linearly.

so in theory on a system of say 30TB with 1x 7200 rpm hdd iops and no caching the filewalker seems to take 8 hours or more, tho there are some utilization peaks in the graph… but it still takes a long time to finish…

there really isn’t any fix for it, aside from caching or having good read iops.

a poorly configured raid array of 30TB filled with data doing the filewalker might take a day or more to complete…
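To put rough numbers on that, here is a back-of-the-envelope Python sketch. It only extrapolates from the figures in this post (about 8 hours for 30TB on a single HDD's worth of IOPS, scaling linearly with IOPS). Scaling linearly with the amount of stored data is an extra assumption, and the function name and defaults are made up for illustration, so treat the output as a ballpark and not a measurement.

```python
def estimate_filewalker_hours(stored_tb, iops_multiplier=1.0,
                              reference_tb=30.0, reference_hours=8.0):
    """Rough filewalker duration estimate (hypothetical helper).

    reference_tb / reference_hours: the 30TB / 8 hours data point quoted above
    for one 7200 rpm HDD's worth of IOPS with no caching.
    iops_multiplier: how many "single HDD" units of read IOPS the storage has.
    Assumes time grows linearly with stored data and shrinks linearly with IOPS.
    """
    if stored_tb < 0 or iops_multiplier <= 0:
        raise ValueError("stored_tb must be >= 0 and iops_multiplier > 0")
    return reference_hours * (stored_tb / reference_tb) / iops_multiplier


# Reference point from the post: 30TB on 1x HDD IOPS.
print(estimate_filewalker_hours(30))
# Same data on storage with twice the read IOPS.
print(estimate_filewalker_hours(30, iops_multiplier=2))
```

Caching changes the picture a lot, so a node that mostly hits cache can finish far sooner than this suggests.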

Well, it says on dashboards that 1.13 is the minimum version:
image

I read somewhere that the oldest version would only have limited support, as in you wouldn’t receive any more data or something like that… I can’t find it again though.

@HeroHann I think you could delay your updates a bit; for instance, you could be one version behind. But I wouldn’t go beyond that, as Storj Labs clearly wants all nodes to be as up-to-date as possible.

yeah when you go below the minimum required version ingress will go to zero.

The filewalker is a horror on my SMR drives, especially if you are trying to do anything that might cause a large amount of writes (I changed some BTRFS mount settings to disable CoW, and needed to ‘apply’ them to the existing data by copying it to a new spot or running a defrag). Meanwhile the drives slowed down to < 10 MB/s total throughput… It finally finished… The cumulative load of everything was not good; long story short, definitely avoid SMR for Storj if you can, although you can run it… A RAID might help, I haven’t tested that.

That perfectly describes what it does to my SMR drives too ^^’
It will become worse and worse the more data the disks are storing, by the way. Maybe something should be considered to make this process lighter on drives!

Having several nodes (on several disks obviously) does help too, as it spreads loads on all disks.

actually raid setup with SMR would be about the same… since it’s the iops that’s the SMR HDD limitation and with a standard raid array the iops is the same as with a single HDD due to all raid drives writing in harmony with each other or whatever one wants to call it.

the best way to utilize SMR drives will be using multiple smr hdd on multiple nodes and thus the data sent is split across them… ofc that doesn’t make the filewalker any less of a problem, tho since each drive would essentially only have half the data then it should, at least in theory, take about half the time.

so if you have a number of SMR drives making multiple nodes is a pretty good fix, tho when they are full the problem will be the same in regard to running the filewalker…

raid only increases the write and read bandwidth, not the iops… sadly… would be wonderful if it did, but yeah bandwidth does increase, giving you HDD MB/s * (Number of drives - Redundancy Equal Drives)
so in a raid 6 with 6 drives you get 4 x base speed because 2 of the drives are the redundancy.

plainly put, as always it’s a bit more complex than that.
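The rule of thumb above can be written out as a tiny Python sketch. This is only the simplified formula from this post (bandwidth scales with the non-redundancy drives, IOPS does not). Real arrays behave in more complicated ways, and the 150 MB/s per-drive figure is a made-up example value.

```python
def raid_sequential_mb_s(drive_mb_s, total_drives, redundancy_drives):
    """Simplified array bandwidth: HDD MB/s * (drives - redundancy drives).

    Hypothetical helper; it ignores controller limits, stripe size and
    parity overhead, and says nothing about IOPS.
    """
    if not 0 <= redundancy_drives < total_drives:
        raise ValueError("need 0 <= redundancy_drives < total_drives")
    return drive_mb_s * (total_drives - redundancy_drives)


# RAID 6 with 6 drives: 2 drives' worth of redundancy, so 4x the base speed.
print(raid_sequential_mb_s(150, total_drives=6, redundancy_drives=2))
```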

The information you are looking for is in the thread that you linked, first post.

To summarize, the version numbering works as follows: <major>.<minor>.<bugfix>. When your minor version is one behind, you are okay. When it’s two minor versions behind, you are in a form of suspension but you can be redeemed. Three minor versions behind and your node will be DQ’d.

So never be more than one minor version behind and you have nothing to worry about. I don’t know if Storj’s versioning policy allows for skipping minor version numbers (i.e. v1.16.1 skips to v1.18.x), but you could probably take the current minor version number from version.storj.io, subtract 1, and that is your allowed version. Your best bet might be to set yourself up a script that polls this value periodically (once per day would be fine) and notifies you when it changes. Or subscribe to the Engineer Discussions category, which is where changelogs are posted. Or subscribe to updates from GitHub.
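As a starting point for such a script, here is a minimal Python sketch of the "how far behind am I" check. The thresholds are the ones summarized in this post, not something checked against an official policy document. The function names and status strings are made up, and it assumes plain `major.minor.bugfix` version strings with an optional leading `v`.

```python
def parse_version(text):
    """Split '1.16.1' or 'v1.16.1' into a (major, minor, bugfix) tuple of ints."""
    major, minor, bugfix = text.lstrip("v").split(".")
    return int(major), int(minor), int(bugfix)


def minor_lag(running, latest):
    """Number of minor versions `running` is behind `latest` (never negative)."""
    run, new = parse_version(running), parse_version(latest)
    if run[0] != new[0]:
        raise ValueError("major versions differ; this sketch does not cover that")
    return max(0, new[1] - run[1])


def update_status(running, latest):
    """Classify the lag using the rule of thumb from this thread (assumption)."""
    lag = minor_lag(running, latest)
    if lag <= 1:
        return "ok"
    if lag == 2:
        return "suspension risk"
    return "disqualification risk"


# The situation discussed in this thread: running 1.15.3 while 1.16.1 is out.
print(update_status("1.15.3", "1.16.1"))
```

A daily cron job could feed the latest version from version.storj.io into `update_status` and send a notification whenever the result is anything other than `"ok"`.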
