Storagenode Docker Update: Changes To Binaries Location For Persistence Between Restarts

This was my expectation for the solution, by the way. When you file a bug report, you should not suggest a fix: when a customer says there is an issue, they are probably right; when they suggest how to solve it, they usually are not.
But I don’t think it would be possible without changing the updater; I have no idea how to wrap it any other way.
The problem is that we use supervisord, which starts both services. The updater then checks the version and tries to update the node; however, if the node dies during the process, the whole container exits.

Without changing the updater, the start sequence becomes very complicated:

  1. You still need to download minimal versions of both binaries, storagenode and storagenode-updater (otherwise the updater wouldn’t have a storagenode binary to compare versions against).
  2. Start the updater first, without starting the node.
  3. The updater should check for a new version and run the node: either a previously downloaded version, or a newly downloaded one that it unpacks and runs.
  4. The updater should download a new version of itself if needed, then exit or respawn itself in the background to continue the startup procedure.
  5. Configure supervisord to now control both binaries.
  6. supervisord starts, controlling both binaries.
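For illustration, the sequence above might look roughly like this in an entrypoint script. This is purely a hypothetical sketch: downloads are stubbed out with a local function, the real updater is not invoked, and supervisord is not actually started; only the ordering of the steps is shown.

```shell
#!/bin/sh
# Hypothetical sketch of the start sequence above. Step numbers match the list.
set -eu

BIN_DIR="${BIN_DIR:-$(mktemp -d)}"

download_binary() {  # stand-in for fetching and unpacking a release archive
    printf '#!/bin/sh\necho "%s running"\n' "$1" > "$BIN_DIR/$1"
    chmod +x "$BIN_DIR/$1"
}

# 1. Download minimal versions of both binaries, so the updater has a
#    storagenode binary to compare versions against.
for bin in storagenode storagenode-updater; do
    [ -x "$BIN_DIR/$bin" ] || download_binary "$bin"
done

# 2-4. Start only the updater; it would fetch newer versions of the node and
#      of itself when the rollout allows, then exit or respawn in the background.
"$BIN_DIR/storagenode-updater"

# 5-6. Only then hand both processes over to supervisord.
echo "would exec: supervisord"
```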

So there would probably be two versions of the supervisord configuration: one with only the updater, and a second with both, plus teaching the updater to reconfigure supervisord.
Or only one, but we would first need to run the updater, which should either respawn itself in the background or exit, and then configure supervisord to handle the now-running processes. :thinking:

I think I fixed this issue: fixed exec issue for restricted systems by AlexeyALeonov · Pull Request #25 · storj/storagenode-docker · GitHub


Don’t you need to check whether should_update wants to downgrade the node, and prevent it from doing so?

The idea is that a downgrade should not be allowed; if the cursor resets, then the updater may decide to “update” to an old version.

If I misunderstand the shall_update behavior of the updater and it never suggests a downgrade, then it’s good.

Yes, it will not downgrade. However, there is one more issue: it will forcibly upgrade the node even if it’s not yet allowed to.

Fixed there:


Sure, but what if /tmp:

  1. does not allow execution (no +x)?
  2. is cleared, e.g., by a crontab job?

I think my PR should solve both issues. It mostly relies on how Docker handles containers, so /app/bin should be safe: it’s inside the container and should not be subject to any restrictions from the host.
Of course, some exotic setups might have trouble running binaries inside the container. But then, they are exotic, and the operator must know how to solve that for their isolated use case.


Why is the binary not in the Docker image?

To decouple container updates from storagenode updates. Storagenode updates often; the container, ideally, never.

That sounds strange. The idea of Docker is that the container has everything.

It does have everything to manage and run the storagenode software. What is strange?

I understand that the binary may not be up to date, or that it will download a new one. So the Docker container version is not strongly coupled to a storagenode version.

We want to update nodes in waves so we don’t shut down the whole network when a new version is released. The first solution, our own forked watchtower with a random update interval between 12 and 72 hours, was not ideal. We wanted to use storagenode-updater, which follows the rolling update plan on https://version.storj.io and is used on all other platforms.
Updating the container becomes a cumbersome task in this case. So we made a light base image, which contains only the downloader and the supervisor with its config. The supervisor runs both processes, storagenode and storagenode-updater; storagenode-updater updates the node and itself right in the container, whenever the NodeID is eligible to be updated. So the base image is rarely updated: only when it needs OS security patches, or when we want to fix an issue such as storagenode downgrading due to a container restart.
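For illustration only, the two-program supervisord layout described above could look something like the config below. The [program:x] section syntax is standard supervisord; the exact commands and paths here are my assumptions, not the real config shipped in the base image.

```shell
#!/bin/sh
# Write an illustrative supervisord config running both processes.
# Commands and paths are assumptions for the sketch.
set -eu

conf="$(mktemp)"
cat > "$conf" <<'EOF'
[supervisord]
nodaemon=true

[program:storagenode]
command=/app/bin/storagenode run --config-dir /app/config
autorestart=true

[program:storagenode-updater]
command=/app/bin/storagenode-updater run --config-dir /app/config
autorestart=true
EOF

echo "programs defined: $(grep -c '^\[program:' "$conf")"
```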


so is the storj version of watchtower truly useless now?

I’m still running it on some or all of my nodes, and would be happy to stop running it.

It’s still used to update the base image. You could update the base image manually instead, but you might miss important security updates, so I would still run it.

Thanks everyone for your suggestions.

To make @Alexey’s change possible while ensuring persistence between docker restarts, I created a new patch to add a new flag --binary-store-dir to the storagenode-updater.

The Docker env equivalent of that flag will be BINARY_STORE_DIR, which by default will be set to /app/config/bin. At the entrypoint we will copy the binary to /app/bin and execute it. This should resolve any permission issues.
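A minimal sketch of that entrypoint step, assuming the paths from the post (/app/config/bin as the persistent store on the mounted volume, /app/bin as the run location inside the container filesystem). The helper name and the demo are hypothetical; the real logic lives in the image’s entrypoint script.

```shell
#!/bin/sh
# Sketch of the copy-then-execute step described above (hypothetical helper).
set -eu

# Copy persisted binaries from the store dir into the run dir and make them
# executable, sidestepping noexec/permission restrictions on the host mount.
stage_binaries() {
    store="$1"; run="$2"
    mkdir -p "$run"
    for bin in storagenode storagenode-updater; do
        if [ -f "$store/$bin" ]; then
            cp "$store/$bin" "$run/$bin"
            chmod +x "$run/$bin"
        fi
    done
}

# Demo with temporary directories standing in for the real paths:
store="$(mktemp -d)"; run="$(mktemp -d)/bin"
printf 'fake-binary' > "$store/storagenode"
stage_binaries "$store" "$run"
[ -x "$run/storagenode" ] && echo "storagenode staged and executable"
```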


I would suggest merging the PR first

because it will heal already-broken restricted systems; otherwise we would need one more release to be able to use the new version of storagenode-updater with this new flag, and in the meantime we might lose these restricted systems to downtime.
I updated this PR to cover all the edge cases, I think; please review.

I prepared a separate PR to support this new option in the updater, but it can be merged only when this new feature is 100% rolled out:

However, this new parameter changes nothing in the logic; it remains the same as in !25 above:

  1. The script will download a minimal version if the existing binary in the binary store needs to be updated.
  2. It will check whether the node is eligible to be updated to the suggested version, and download it if so.
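The downgrade guard behind step 1 boils down to a version comparison. Here is a hedged sketch of that idea (not the actual entrypoint code, and the version strings are made-up examples), using sort -V to order version strings:

```shell
#!/bin/sh
# Sketch of the "never downgrade" check: replace the stored binary only when
# the suggested version is strictly newer than the current one.
set -eu

# version_lt A B: true when version A sorts strictly before version B
version_lt() {
    [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

current="v1.95.1"; suggested="v1.96.6"
if version_lt "$current" "$suggested"; then
    echo "update $current -> $suggested"
else
    echo "keep $current (same version, or refusing a downgrade)"
fi
```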

Removing this check would revert this change:


I just pushed new images containing this patch. Not marked as latest yet:
https://hub.docker.com/layers/storjlabs/storagenode/9109fd2/images/sha256-17c83b5a16a2938364e14e1a0cc62fabcd4a468bc3bab19fe31abe3c361e2a1b?context=repo


Asked to verify:

storjlabs/storagenode:9109fd2 is working fine on my Raspberry Pi 2 (arm v5) storagenode. The software runs from /app/bin/ inside the Docker container, rather than from /app/config/bin/ as on the existing image.
