Changelog v1.50.4

yeah, it’s the most recent kernel version and the latest Proxmox, 7.1
i use LXC containers, and inside those i run docker :smiley:
it’s not particularly elegant, but everything is updated to the most recent versions and has been working without a problem thus far.

clearly my issue relates to the node shutting down because it couldn’t get a connection to the updater.

the containers are also debian; i liked to keep it so that the commands and everything were the same, because i was new to linux and had trouble enough as it was lol

so long storj short,

yes. :smiley:

One node is separate, but the 15 of them share two drives.

Before you say: «one node per drive» I say: «partial graceful exit».


I knew your answer beforehand, so no - my answer is not what you already know. :slight_smile:

@Toyoo then you are forced to use this suggestion:

For everyone who reads this:
Yes, the “one node per one disk” suggestion will not change. It exists so that you do not have problems with the filewalker (and many others, like slow uploads and downloads and thus a lower success rate), and so that you do not have problems with updates like @Toyoo has.
And using several nodes per one disk is not OK; please do not reproduce this setup. :slight_smile:
Our suggestions are based on experience, so it is better to follow them. They are not mandatory, but at least you know why they exist.


Well, I can also hack the source code. Same thing, because unless the suggestion is documented, I cannot really depend on it without some customization work. Or rather, I can, but you will expect me to fix it again when you change your mind.

From my point of view this thread started nicely: Storj asks people to test code in various conditions. Sure, why not, that’s a great idea. Apparently though, my conditions are outside of the parameters Storj Labs expected. So, kinda disappointed, because I’ve already managed to work around watchtower, and now this workaround becomes useless. I mean, why does Storj ask people to test code in various conditions if they then don’t accept the variety of conditions?

I recall Storj Labs asking what would make professional sysadmins set up nodes. Not doing stuff like this would be a nice first step. Storage is a low-margin business, and this ought to be additional billable work for them.

And, for everyone else who read this, I’ll just link to my previous post on large number of small nodes: Multiple Nodes per HDD to lower impact of node disqualification? - #8 by Toyoo

Yes, this solution has been implemented with the expectation that operators follow our recommendations.
Your setup cannot be called standard, I think you would agree.
And yes, it’s not designed for “professional usage”, where you need to hack the code to make it work in an edge case like yours.
We explicitly mention everywhere “one node per one disk”, so why do you expect us to take into consideration all the setups which break this rule?

But it’s nice that you revealed this problem too. However, I do not expect that it will be accommodated, sorry about that! But at least I know the workaround if someone follows your example.


Sure. But I’d like to remind you that Storj Labs was planning partial graceful exit for, IIRC, last year. I would understand Storj Labs expecting operators to follow guidelines if node operators could actually expect Storj Labs to follow their own plans.

I understand Storj Labs not considering my setup as following guidelines. I know it is not. But then, I just expect Storj Labs to understand my disappointment as well.

@Alexey we are still supposed to be able to do manual updates after this release, right?

i kinda like being able to choose an approximate time for when i run the updates, for various reasons:

  1. i like to verify that everything works after i’m done

  2. it’s nice that i can start the update process when i have free time.

  3. it’s also a good time to make changes such as adjusting capacity, adding run parameters, or other changes which require a node restart and thus force a rerun of the filewalker.

  4. it’s also a good stress test, but it’s difficult to log all the data, so it helps a lot being able to monitor the system live.

Updates will no longer happen by updating the docker image. They’ll happen automatically inside the container, and only when it is your node’s turn. This will be at different times for different nodes.

I don’t see an easy way to still update manually.
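If you just want to watch when the in-container update happens (rather than trigger it yourself), tailing the container logs is probably the closest you can get. A rough sketch, assuming the container is named `storagenode` and that the updater logs lines mentioning downloads and versions (as in the log excerpts later in this thread); adjust the name and grep pattern to your setup:

```shell
# Follow the container logs and surface updater/version activity.
# "storagenode" is an assumed container name - adjust to yours.
docker logs --follow storagenode 2>&1 | grep -i -E 'updater|download|version'
```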


will be interesting to see how that pans out…

sort of knew this was coming eventually, we have been told so in the past…
and i understand the clear reasoning why they would want to do it this way,
so can’t really argue much against it lol…

even tho i would like to… lol

so the only thing that changes is that i’ll be forced to endure the filewalker a bit more…
when i have to make changes, at least i don’t really have to change anything else, business as usual :smiley:

as i understand it, i don’t need anything else except the node running then?
or will i also be forced to install the watchtower thing?

when i updated to the new image, it didn’t want to run because it couldn’t find an updater

Watchtower updates containers when a new docker image is available. When this goes live, that probably won’t really happen anymore, as the docker image doesn’t even contain the storj binaries any longer. It’s just a base image that then retrieves the binaries: the updater first, after which it asks the updater which version your specific node should be on and downloads that. From that point on, the updater will monitor whether updates are available.

Now it may still be required to make changes to the base image and script that handles all that, but I expect that will be very infrequent. That said, I will keep watchtower running just in case.
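For reference, keeping watchtower around to catch base-image changes is just the setup the Storj documentation has recommended so far; a sketch along those lines (the `storagenode` and `watchtower` container names are the usual defaults from the docs, adjust to your own):

```shell
# Run watchtower so it recreates the storagenode container when the
# base image itself changes (now expected to be infrequent).
# Container names and the stop timeout follow the common Storj docs setup.
docker run -d --restart=always --name watchtower \
  -v /var/run/docker.sock:/var/run/docker.sock \
  storjlabs/watchtower storagenode watchtower --stop-timeout 300s
```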

The intention is for the latest tag to be replaced by this image though, so if it’s not working for you, that needs to be fixed. Could you maybe post some more context around the logs you posted earlier? I did see in my logs that the storagenode service got stopped and restarted when I tested it on the test net, similar to what you saw. Did the container actually stop there? It may take slightly longer for the container to start, since it actually needs to retrieve the binaries first.


I have a couple of ideas, but hope there is going to be an official way to manually update. If Storj does not want me to update too soon, I’m OK with being the last one to update, as long as I can watch the process and not have it happen when I am sleeping or away.

Honestly, autoupdates were introduced a long time ago. I would not expect any additional communications, unless something goes wrong.

The recommended and supported setup is to use a docker node with watchtower. For these users there would not be any noticeable difference - but now nodes will be updated with the same mechanism as Windows/Linux GUI/binary nodes, using storagenode-updater in the same manner and on the same cadence.

For operators who prefer to live on the edge, I think there is bad news - manual updates will be hard to handle from now on.
As @BrightSilence described, the base image would not be updated too often, unless there are changes in the download script. All storagenode updates are handled by the downloaded storagenode-updater binary inside the container. Watchtower could still be useful in case the base image changes.

However, you can create a feature request to allow specifying a preferred time interval for updates, or implement it in a pull request.

Having two SNs in docker on one computer, will they be updated at the same time?

Not necessarily. The phased rollout gives every node its own time. So if they are not in the same tranche, it will happen at different times.

@Alexey or @littleskunk do you know what the tranches usually look like or does this differ per roll out?

Updated a small prod node.
Ubuntu LTS 18
Docker 20.10.12

Log is clean, up- and downloads are running, and the filewalker is doing its job.
Shows v. 1.50.4 on dashboard.

My node stopped after the update. Maybe i forgot the --restart unless-stopped parameter …

2022-03-23T17:03:17.595Z INFO Download started. {"From": "https://github.com/storj/storj/releases/download/v1.50.4/storagenode-updater_linux_amd64.zip", "To": "/tmp/storagenode-updater_linux_amd64.28022342.zip"}
2022-03-23 17:03:17,983 INFO exited: storagenode (exit status 0; expected)
2022-03-23T17:03:18.396Z INFO Download finished. {"From": "https://github.com/storj/storj/releases/download/v1.50.4/storagenode-updater_linux_amd64.zip", "To": "/tmp/storagenode-updater_linux_amd64.28022342.zip"}
2022-03-23 17:03:18,400 INFO spawned: 'storagenode' with pid 1901
2022-03-23 17:03:18,400 WARN received SIGQUIT indicating exit request
2022-03-23 17:03:18,401 INFO waiting for processes, storagenode, storagenode-updater to die
2022-03-23T17:03:18.401Z INFO Got a signal from the OS: "terminated"
2022-03-23T17:03:18.410Z INFO Restarting service. {"Service": "storagenode-updater"}
2022-03-23 17:03:18,414 INFO stopped: storagenode-updater (exit status 1)
2022-03-23 17:03:18,416 INFO stopped: storagenode (terminated by SIGTERM)
2022-03-23 17:03:18,416 INFO stopped: processes (terminated by SIGTERM)

I just started the node again, and it’s working.
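If the missing restart policy was indeed the cause, it can be added to the already-created container without recreating it; a sketch, assuming the container is named `storagenode`:

```shell
# Add a restart policy to an existing container, so it comes back up
# automatically after updater-triggered restarts (unless stopped manually).
# "storagenode" is an assumed container name - adjust to yours.
docker update --restart unless-stopped storagenode
```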

Can i now go back to the :latest tag?

Your node has probably been updated to 1.50.4 now; changing to the latest tag would downgrade it to 1.49.5, and at this point I don’t recommend that. You should probably wait until the latest tag has been upgraded to this new image. Though you could run this tag for a long time without issue, as the updates now happen within the container, so there is no rush on switching back to the latest tag. I think for now it’s best to wait until instructed to switch back.

psst… and this is why I said


:rofl:
You are sooooo right :smiley:

I changed the image on 2 other nodes and they instantly got the update to 1.50.4. So everything is fine.
In 4 weeks i will “roll back” to :latest

you were right…

i must not have waited long enough…
i don’t have the nodes on restart unless stopped, so i ran the docker run command, and it terminated because it couldn’t talk to the updater, according to the storagenode logs.

so shortly after, maybe 30 sec or a minute, i ran the docker start storagenode command and it again ran for a brief moment and then stopped, so i went back to latest.

today when i tried to start the node using the “docker test image”
5f2777af9-v1.50.4-go1.17.5

it started up without a hitch… i simply must not have waited long enough.
good catch…

@Alexey
so all okay from my side… tho it might be smart if the storagenode didn’t stop just because the updater isn’t done downloading yet… but that seems like a minor oversight that doesn’t really change much and will eventually be fixed…

Addendum
tried to update another node in a different container, and it didn’t behave like the first one…
but it did pull the image again ofc, so maybe the image was changed… or my zfs did some caching magic which made the updater update instantly…