Storagenode Docker Update: Changes To Binaries Location For Persistence Between Restarts

He has a point though… a good email to SNOs and a month's notice would be helpful for all important changes. See this case:
https://forum.storj.io/t/spawnerr-no-permission-to-run-command/27962?u=snorkel
What if that guy was on a cruise or couldn't access his nodes for a month? They would all get DQed by Storj's mods.

1 Like

I don't think we would change the way of communication.

Maybe we could add active notifications to the storagenode dashboard, but that is not on the roadmap either. The current ones are almost static or provided by the node itself (like a DQ message or a suspension message).

no, it’s not. But you may change the binary location in multiple ways:

You may also mount this location from a different drive or even a docker volume. The latter would also solve any permissions issues, I believe.
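For example, a minimal sketch using a named volume (the /app/config/bin destination is the binaries path discussed in this thread; the rest of your usual run command stays as-is):

    # create a named volume to hold the binaries; it survives container recreation
    docker volume create storagenode-bin
    # then add this mount to your usual docker run command:
    #   --mount type=volume,source=storagenode-bin,destination=/app/config/bin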

Changes that potentially affect node functionality in a bad way should be optional, not forcibly imposed on good old working nodes. Many people don't keep an eye on their nodes or their performance, don't read the forum, and expect the nodes to keep working once set up and tuned to their particular environment, and not to be broken by Storj mods or updates.
This is the second time a change turned on by default has negatively impacted the nodes:

  1. The lazy filewalker - a bad and untested addition that many had to turn off.
  2. The location of the storagenode binary, which has already broken some nodes.

Both were announced by the same person, but I don't want to point fingers.
Why does this trend continue?
When you make a change, announce it, make it optional, let people test it, and after some time, like months, if all is well and dandy, then you can turn it on by default.

And this thread doesn't even stay in the Announcements category where it belongs?
The title doesn't draw attention to click on it and read the thread, either. It should begin with “ATTENTION!..”
It just happens that I click on almost every thread, so I clicked on this one too. I didn't know it was something important.

2 Likes

Thank you for your valuable feedback. I apologise for the inconvenience caused by the short notice.
Before rolling out this change, several factors were considered:

  • This change fixes a very critical issue.
  • This is basically a change for which the majority of SNOs have to do nothing.
    We stopped supporting watchtower a long time ago, after we rolled out the auto-updating docker images. So for most setups, unless the container is manually restarted, they will continue to run the old images if they're using the :latest tag. This post was actually made to let the community know about the change and what to expect.
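For reference, a container started from the :latest tag keeps running its original image until it is recreated; a manual refresh looks roughly like this (the container name storagenode is assumed):

    docker pull storjlabs/storagenode:latest   # fetch the new image
    docker stop -t 300 storagenode             # give the node time to shut down cleanly
    docker rm storagenode
    # re-run your usual docker run command; it will pick up the new image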
3 Likes

Now that explains to me why you needed more time for testing.
How do you mount storage there?

1 Like

Hi, I would like to let everyone know that I have updated the docker container on UNRAID and the upgrade went normally and perfectly. I use the standard configuration for the path and the container updates automatically.

3 Likes

Could this be the reason? All of a sudden I am getting the same error:

It could be possible, because the image also got updates for the underlying OS.
However, I tested this new image for ARM64 and it didn't have this issue.
This usually happens if the ca-certificates package is not updated.
The OS is updated, by the way: Dockerfile: upgrade distro buster to bookworm (#22) · storj/storagenode-docker@7784e2a · GitHub

And the actual problem in the linked thread is:

thus a solution:

It seems the docker engine may have changed?

I too believe it's completely and entirely your fault, and the purpose of these comments is not to jump on the bandwagon of victim blaming.

Any software update can bring unknown number of new bugs.

If you are blindly updating to whatever is pushed to the latest tag, you are choosing to live by someone else's schedule and accepting that things will sometimes break, regardless of whether there is any communication about it, including for reasons entirely outside of storj, like OS changes.

This was already addressed in the comment above Storagenode Docker Update: Changes To Binaries Location For Persistence Between Restarts - #18 by Ambifacient

By the way, this is precisely why storj deploys storagenode updates in waves. And yet you assume that container updates are inconsequential and update the container itself ASAP. That does not seem wise to me.

If you want more time, don't update the containers, or any other software for that matter, automatically. Postpone it until you have time to read the release notes, update, and address all the possible fallout, or revert.

And yet you still auto-update software. I'm baffled that we have to discuss this with someone who has been dealing with software “long enough”.

It was communicated in the release notes, which you had a practically infinite amount of time to read before installing the update. But you chose to blindly auto-update.

TLDR: there can be changes outside of what's documented and bugs that sneak past QA; therefore it's on you to manage updates responsibly and not ingest any new piece of software just because it's out.

2 Likes

I think we could change this to affect fewer non-default setups: we would still download to /app/config/bin if there are no binaries, but then copy them to a folder inside the container which is not exposed to the host and where we can set proper permissions, like /app/bin, and run the binaries from there.
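A rough entrypoint-style sketch of that idea (the download_binaries helper and the exact paths are illustrative, not the actual implementation):

    PERSIST_DIR=/app/config/bin   # exposed to the host, may be mounted noexec
    RUN_DIR=/app/bin              # internal to the container, permissions under our control

    # download only if the persistent copy is missing
    [ -f "$PERSIST_DIR/storagenode" ] || download_binaries "$PERSIST_DIR"

    # copy into the container and run from there
    mkdir -p "$RUN_DIR"
    cp "$PERSIST_DIR"/storagenode "$PERSIST_DIR"/storagenode-updater "$RUN_DIR"/
    chmod 755 "$RUN_DIR"/storagenode "$RUN_DIR"/storagenode-updater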

Right now there are three solutions (a combined sketch follows the list):

  1. Add exec to the mount options in /etc/fstab, if possible (defaults already includes it).
  2. Use a docker volume; however, this doesn't seem to work on QNAP for some reason: Node stopped working as docker container on qnap-container-station - #20 by beli
  3. Use the variable BINARY_DIR to redirect the binary folder inside the container.
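Roughly, options 1 and 3 look like this (option 2 was sketched earlier in the thread; the device, mount point, and BINARY_DIR value here are examples only):

    # 1. /etc/fstab: make sure the data mount allows executing binaries.
    #    "defaults" already implies "exec", so only an explicit "noexec" is a problem:
    /dev/sdb1  /mnt/storj  ext4  defaults  0  2

    # 3. redirect the binary folder to a path inside the container
    #    by adding an environment variable to your docker run command:
    #    -e BINARY_DIR=/app/bin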
1 Like

Just to let you know, I have an ARM QNAP, the TS-230, and the container upgrade went smoothly.

2 Likes

Please do consider that approach, because this change seems to have broken the very default setup on Synology that I use. I was able to add the permissions manually, but default Synology shares seem not to work. The whole point of containerizing software is to make it able to run on a wide range of different environments. Don't impose this kind of file-rights management on SNOs when you don't need to. Persisting the binaries is fine, but running them from an external file system is asking for issues. Copying them back into the container and managing rights there would fix all those issues.

5 Likes

What if the updater updates the binary after the container has been started, outside of the entry point?

On Unix systems a running executable points to the file's inode. Updating the file changes the inode, but the running process will remain pointed to the inode it started with.
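A quick way to observe this on Linux (the file names are arbitrary and /proc is Linux-specific):

    cp /bin/sleep ./mynode && ./mynode 600 &   # start the "old" binary
    ls -i mynode                               # note the inode number
    cp /bin/sleep mynode.new
    mv mynode.new mynode                       # atomic replace: the name now has a new inode
    ls -i mynode                               # different inode...
    ls -l /proc/$!/exe                         # ...but the running process still shows the
                                               # original executable, marked "(deleted)"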

Wow, barely 3 hours after my post. Thanks @Alexey, that's awesome!

2 Likes

That's not a concern, and it is irrelevant in the context of handling the shadow executables' location.

Two possibilities here:

  • The updater updates the file in the /app/bin location. Who copies it to /app/config/bin?
  • The updater updates the file in the /app/config/bin location. Who copies it to /app/bin?

I.e. the updater needs to be aware of these two different locations, or there needs to be some wrapper service that wraps the updater and handles copying the executable.

I don’t see it in the PR.

I see that the updater downloading a new binary is only handled in the entry point. But the updater runs every 15 minutes. What happens when one of those invocations results in a new binary being downloaded?

1 Like

Depends on which one. I introduced another one, BINARY_STORE; it's basically permanent storage on the data location by default. The binaries would be copied from there when the container is restarted. However, if they are older than the current version, they will be replaced and copied back.
I do not know what would happen if you put newer versions there, but it's easy to check :wink:

If you mount another folder as the BINARY_DIR location inside the container, then it depends on the OS. On Linux you may replace the binaries, but they will likely only be picked up on restart, and before that they would be replaced by the binaries from BINARY_STORE, which can themselves be updated to the current version if they were outdated (see above).
On Windows I think you cannot replace them because they would be locked; however, I do not know how the 9p protocol (WSL2) or SMB/CIFS (Hyper-V) would handle that.
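The restart-time behavior described above, as a pseudo-shell sketch (is_outdated and download_binaries are placeholders, not the real entrypoint code):

    # on container start:
    if [ ! -f "$BINARY_STORE/storagenode" ] || is_outdated "$BINARY_STORE/storagenode"; then
        download_binaries "$BINARY_STORE"   # refresh the permanent copies
    fi
    cp "$BINARY_STORE"/* "$BINARY_DIR"/     # binaries are executed from BINARY_DIR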

It’s running with these parameters:

So, storagenode-updater knows about the binary location $BINARY_DIR, and all the binaries would be there.
But you are correct, it wouldn't copy them back to $BINARY_STORE, so the issue would be spawned again on the next container restart after the update. If there were an exception, the container would be restarted and that would fix it; if there were no exception, it would simply self-update again.

What would you suggest?

This is an option:

Or, better yet, the change needs to be made in the updater itself: it would need to update the executable in both places, and then restart the node correctly.

Alternatively, it could write the most recent node version it ever downloaded to the data store, and then on container start update to at least that version (i.e. if the cursor and mask indicate a lower version, it will update to the last known higher version).

This would solve the original problem without storing executables in the data folder in the first place or copying them around.
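Sketched in pseudo-shell, with a hypothetical .last-version marker file and version_lt, installed_version, and download_binaries helpers:

    LAST_VERSION_FILE=/app/config/.last-version

    # updater side: after a successful download, remember the version
    echo "$NEW_VERSION" > "$LAST_VERSION_FILE"

    # entrypoint side: on container start, never run anything older than that
    if version_lt "$(installed_version)" "$(cat "$LAST_VERSION_FILE")"; then
        download_binaries "$(cat "$LAST_VERSION_FILE")"
    fi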