i must not have waited long enough…
i don’t run my nodes with --restart unless-stopped, so after the container stopped i ran the docker run command and it terminated because, according to the storagenode logs, it couldn’t talk to the updater.
so shortly after, maybe 30 seconds or a minute later, i ran the docker start storagenode command and it again ran for a brief moment and then stopped, so i went back to latest.
today when i tried to start the node using the “docker test image”
it started up without a hitch… i simply must not have waited long enough.
so all okay from my side… though it might be smart if the storagenode didn’t stop just because the updater isn’t done downloading yet… but that seems like a minor oversight that doesn’t really change much and will eventually be fixed…
tried to update another node in a different container, and it didn’t behave like the first one…
but it did pull the image again of course, so maybe the image was changed… or my zfs did some caching magic which made the updater update instantly…
It looks to me like both the storagenode and storagenode-updater get killed at the same time at some point and then the container stops running because nothing is running inside it anymore. I’m not sure if that’s by design, but if so that means it really relies on the --restart unless-stopped option to successfully complete updates. And containers without it may just stop instead.
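For anyone setting this up, the restart policy is set at container creation time. A minimal sketch of a run command (hypothetical and heavily trimmed; a real storagenode run command needs identity, wallet, address, and mount flags from the official docs):

```shell
# Hypothetical minimal example -- the real command needs many more
# flags (identity paths, wallet, address, storage mounts, etc.).
docker run -d \
  --restart unless-stopped \
  --name storagenode \
  storjlabs/storagenode:latest
```

With `unless-stopped`, docker restarts the container whenever its last process exits, except when you stopped it yourself, which is what lets the update cycle complete unattended.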
i’ve had --restart unless-stopped off for a while…
i had some (now resolved) issues which nearly took down my system, and i’m not sure how easy recovery would have been without the option to manually start things.
of course i could just control that through my proxmox containers instead of docker… so meh.
mine started now with no sign of the issue…
i’ll leave it running; mine only shut down because the storagenode sent a termination signal…
but yeah, i dunno if that signal would be sent when the storagenode binary exits… i suppose it’s possible; i guess they already updated the version…
i wonder if they would bump the version if they changed the storagenode binaries inside the image… i mean, in theory they could already have fixed it without us knowing…
if the entire update cycle happens within the binary or whatever.
anyways… my setup seems to run the new version fine now…
not sure if something changed or not… or if it’s just zfs being smart
i guess i’d better add the --restart unless-stopped option back to my docker run commands
when i update, after 5f2777af9-v1.50.4-go1.17.5 becomes latest.
just to be on the safe side for now; i don’t really have any alerts running for if it goes down lol
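Side note: there is no need to recreate the container just to change this; docker can change the restart policy of an existing container in place (the container name here is assumed to be `storagenode`):

```shell
# Change the restart policy on an existing container
# without recreating it.
docker update --restart unless-stopped storagenode
```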
It would be nice if some of those INFO lines regarding the updater, which I am guessing are only visible with log level set to debug, would make it to the normal log level. I didn’t see anything in the normal logs explaining that the updater was running. Or perhaps some more appropriate log messages like “update found”, “starting update”, etc.
Nah, that’s not it. The log that is redirected to a file only contains the log of the storagenode process. The rest, which you are seeing here, is the docker container log. This used to be empty if you redirected logs to a file, but now contains the log from the entrypoint script in the container, supervisor and storagenode-updater processes.
I’m pulling it from the synology interface here, but you can also see this log by using docker logs --tail 30 storagenode or whatever your container name is.
Unless I’m missing something, the issue I found on the Raspberry Pi with the libseccomp2 package affects all the SNOs that run Raspberry Pi OS; I had to manually update the package to make the updater work.
How do you deal with that without asking all the affected SNOs to do the same?
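For reference, the commonly suggested workaround for outdated libseccomp2 on Buster-based Raspberry Pi OS is pulling a newer package from Debian backports. This is a sketch of that approach, not official Storj guidance; verify the key IDs and repo line against Debian’s own documentation before running it:

```shell
# Assumes Raspberry Pi OS based on Debian Buster; adjust for your release.
# Key IDs below are the commonly cited Debian archive keys -- verify them first.
sudo apt-key adv --keyserver keyserver.ubuntu.com \
  --recv-keys 04EE7237B7D453EC 648ACFD622F3D138
echo "deb http://deb.debian.org/debian buster-backports main" | \
  sudo tee /etc/apt/sources.list.d/buster-backports.list
sudo apt update
sudo apt install -t buster-backports libseccomp2
```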
I think these nodes can be skipped until there is an image replacement for arm32 nodes.
Thanks to your report, we will be able to support node operators with arm32 nodes on how to update them so they continue to work normally.
I do not have an exact plan at hand, but perhaps the next release could include a fix for the arm32 issue.
Is there no image available for my Raspberry Pi running in 64-bit mode?
pi@pi-hole:~/storj/nodes $ uname -a
Linux pi-hole.discworld.intern 5.10.103-v8+ #1529 SMP PREEMPT Tue Mar 8 12:26:46 GMT 2022 aarch64 GNU/Linux
pi@pi-hole:~/storj/nodes $ uname -m
pi@pi-hole:~/storj/nodes $ docker pull storjlabs/storagenode:latest
latest: Pulling from storjlabs/storagenode
no matching manifest for linux/arm/v7 in the manifest list entries
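Worth noting: the kernel reports aarch64, but docker resolved linux/arm/v7, which usually means the userland (and therefore docker’s detected platform) is 32-bit. If the OS userland really is 64-bit, one thing to try is forcing the platform explicitly (this flag exists in recent docker versions; it will not help on a 32-bit userland):

```shell
# Ask docker explicitly for the 64-bit arm image instead of
# letting it auto-detect the platform.
docker pull --platform linux/arm64 storjlabs/storagenode:latest
```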
Ahh, good to see the v6 image works. That gives you some breathing room until v7 is fixed for the latest image. This one will probably work for a long time now anyway, since the updates will just happen inside the container. So there isn’t a real rush to go back to latest.
The storage node rollout is finished. The docker images have been pushed as latest except arm32. The arm32 nodes will stay on the old version for a little bit longer. We are working on a fix for arm32: https://review.dev.storj.io/c/storj/storj/+/7131
Thank you, everyone. With your help we have been able to identify this issue without crashing too many production nodes. Last but not least, don’t forget to switch back to storjlabs/storagenode:latest