Watchtower spawning copies of itself

Sometime earlier today, watchtower went crazy on my Synology-based node and started spawning copies of itself. Has anyone else seen this, or have any idea why?

I see that you are using :latest, be sure to use :beta (or :alpha) right now. I’m not sure that’s the reason you’re spawning more though.

Ignore this ^, I thought it was storagenode, not watchtower.

Are you using the command given in the documentation?

docker run -d --restart=always --name watchtower -v /var/run/docker.sock:/var/run/docker.sock storjlabs/watchtower storagenode watchtower --stop-timeout 300s --interval 21600

Reference: https://documentation.storj.io/setup/software-updates#automatic-updates

The same situation on linux:

Up 10 days - old watchtower
Up 10 hours - first watchtower update from yesterday
Up 4 hours - second watchtower update from yesterday

Let’s look at the old watchtower’s log (I run with --debug):

2019-10-03T00:15:33.293898906Z time="2019-10-03T00:15:33Z" level=debug msg="No credentials for storjlabs in /config.json"
2019-10-03T00:15:33.293909881Z time="2019-10-03T00:15:33Z" level=debug msg="Got auth value: "
2019-10-03T00:15:33.293913064Z time="2019-10-03T00:15:33Z" level=debug msg="Got image name: storjlabs/storagenode:beta"
2019-10-03T00:15:33.293915402Z time="2019-10-03T00:15:33Z" level=debug msg="No authentication credentials found for storjlabs/storagenode:beta"
2019-10-03T00:15:35.650556735Z time="2019-10-03T00:15:35Z" level=debug msg="No new images found for /storagenode"
2019-10-03T00:15:35.650567248Z time="2019-10-03T00:15:35Z" level=debug msg="Pulling storjlabs/watchtower:latest for /ufSyowEapwZWPDhXDbbmJjDsMUpTKLLx"
2019-10-03T00:15:35.650613989Z time="2019-10-03T00:15:35Z" level=debug msg="No credentials for storjlabs in /config.json"
2019-10-03T00:15:35.650618425Z time="2019-10-03T00:15:35Z" level=debug msg="Got auth value: "
2019-10-03T00:15:35.650620712Z time="2019-10-03T00:15:35Z" level=debug msg="Got image name: storjlabs/watchtower:latest"
2019-10-03T00:15:35.650622879Z time="2019-10-03T00:15:35Z" level=debug msg="No authentication credentials found for storjlabs/watchtower:latest"
2019-10-03T00:15:37.868269009Z time="2019-10-03T00:15:37Z" level=info msg="Found new storjlabs/watchtower:latest image (sha256:7b387a44c7a1435c85f4a3c68cf9c25fc26f7145471c0801b27bb5bc939ca577)"
2019-10-03T00:15:37.868286056Z time="2019-10-03T00:15:37Z" level=debug msg="This is the watchtower container /ufSyowEapwZWPDhXDbbmJjDsMUpTKLLx"
2019-10-03T00:15:37.868289061Z time="2019-10-03T00:15:37Z" level=debug msg="Renaming container /ufSyowEapwZWPDhXDbbmJjDsMUpTKLLx (fea0fda1774ce07e2777204b26d254f4898361e57ecdc73e2a63620a17ed9260) to UEUxcJyvjkQxDNMwwdVKyOMrBkDAabWm"
2019-10-03T00:15:37.885885174Z time="2019-10-03T00:15:37Z" level=info msg="Creating /ufSyowEapwZWPDhXDbbmJjDsMUpTKLLx"
2019-10-03T00:15:37.977121546Z time="2019-10-03T00:15:37Z" level=debug msg="Starting container /ufSyowEapwZWPDhXDbbmJjDsMUpTKLLx (6f3d798880e6974f7830ec58d412d59ad77d671ed33a426a017bd147098bf165)"
2019-10-03T00:15:38.293756698Z time="2019-10-03T00:15:38Z" level=debug msg="Scheduled next run: 2019-10-03 06:15:31 +0000 UTC"

And now look at yesterday’s first update:

2019-10-03T00:15:33.293913064Z time="2019-10-03T00:15:33Z" level=debug msg="Got image name: storjlabs/storagenode:beta"
2019-10-03T00:15:33.293915402Z time="2019-10-03T00:15:33Z" level=debug msg="No authentication credentials found for storjlabs/storagenode:beta"
2019-10-03T00:15:35.650556735Z time="2019-10-03T00:15:35Z" level=debug msg="No new images found for /storagenode"
2019-10-03T00:15:35.650567248Z time="2019-10-03T00:15:35Z" level=debug msg="Pulling storjlabs/watchtower:latest for /ufSyowEapwZWPDhXDbbmJjDsMUpTKLLx"
2019-10-03T00:15:35.650613989Z time="2019-10-03T00:15:35Z" level=debug msg="No credentials for storjlabs in /config.json"
2019-10-03T00:15:35.650618425Z time="2019-10-03T00:15:35Z" level=debug msg="Got auth value: "
2019-10-03T00:15:35.650620712Z time="2019-10-03T00:15:35Z" level=debug msg="Got image name: storjlabs/watchtower:latest"
2019-10-03T00:15:35.650622879Z time="2019-10-03T00:15:35Z" level=debug msg="No authentication credentials found for storjlabs/watchtower:latest"
2019-10-03T00:15:37.868269009Z time="2019-10-03T00:15:37Z" level=info msg="Found new storjlabs/watchtower:latest image (sha256:7b387a44c7a1435c85f4a3c68cf9c25fc26f7145471c0801b27bb5bc939ca577)"
2019-10-03T00:15:37.868286056Z time="2019-10-03T00:15:37Z" level=debug msg="This is the watchtower container /ufSyowEapwZWPDhXDbbmJjDsMUpTKLLx"
2019-10-03T00:15:37.868289061Z time="2019-10-03T00:15:37Z" level=debug msg="Renaming container /ufSyowEapwZWPDhXDbbmJjDsMUpTKLLx (fea0fda1774ce07e2777204b26d254f4898361e57ecdc73e2a63620a17ed9260) to UEUxcJyvjkQxDNMwwdVKyOMrBkDAabWm"
2019-10-03T00:15:37.885885174Z time="2019-10-03T00:15:37Z" level=info msg="Creating /ufSyowEapwZWPDhXDbbmJjDsMUpTKLLx"
2019-10-03T00:15:37.977121546Z time="2019-10-03T00:15:37Z" level=debug msg="Starting container /ufSyowEapwZWPDhXDbbmJjDsMUpTKLLx (6f3d798880e6974f7830ec58d412d59ad77d671ed33a426a017bd147098bf165)"
2019-10-03T00:15:38.293756698Z time="2019-10-03T00:15:38Z" level=debug msg="Scheduled next run: 2019-10-03 06:15:31 +0000 UTC"

So, watchtower does not stop the old watchtower container after completing an update of itself (it handles other containers correctly).

I also looked at the watchtower documentation and see we are missing an important argument, --cleanup:

Cleanup

Removes old images after updating. When this flag is specified, watchtower will remove the old image after restarting a container with a new image. Use this option to prevent the accumulation of orphaned images on your system as containers are updated.

I propose adding this argument to our documentation:
docker run -d --restart=always --name watchtower -v /var/run/docker.sock:/var/run/docker.sock storjlabs/watchtower storagenode watchtower --stop-timeout 300s --interval 21600 --cleanup


That sounds helpful. I also noticed a 2nd watchtower instance but removed it before it could “replicate” itself.

I have the same here…

CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS              PORTS                                                NAMES
e4fc4e9562b8        storjlabs/watchtower:latest   "/watchtower storage…"   16 hours ago        Up 16 hours                                                              watchtower
e8846ec3da5a        storjlabs/storagenode:beta    "/entrypoint"            6 days ago          Up 6 days           0.0.0.0:14002->14002/tcp, 0.0.0.0:28967->28967/tcp   storagenode
6ca2e4ebbeb2        84803293c0e3                  "/watchtower storage…"   10 days ago         Up 6 days                                                                QxlouVbDbZMtJrIapHRiKVstEFsXHZaU

If you do have 100s of watchtower containers running, then this will help stop them:

docker ps -a | awk '{ print $1,$2 }' | grep watchtower:latest | awk '{print $1 }' | xargs -I {} docker stop {}

Change the stop to an rm and you can then remove them all.
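One caveat with grepping for `watchtower:latest`: as the `docker ps` listing above shows, a rogue copy can report a bare image ID (`84803293c0e3`) instead of the tag, so filtering on the command column is safer. A minimal sketch of that idea, run here against canned text mimicking the listing above so it works without a live daemon (the canned variable and the exact column layout are assumptions, not from the thread):

```shell
#!/bin/sh
# Sketch: print the IDs of rogue watchtower containers -- every container
# running the /watchtower binary whose name is not the one we assigned
# ("watchtower"). Canned input mimics the docker ps listing above; note
# the rogue copy shows a bare image ID, so an image-tag grep would miss it.
ps_output='e4fc4e9562b8 storjlabs/watchtower:latest /watchtower watchtower
e8846ec3da5a storjlabs/storagenode:beta /entrypoint storagenode
6ca2e4ebbeb2 84803293c0e3 /watchtower QxlouVbDbZMtJrIapHRiKVstEFsXHZaU'

# Column 3 is the command, column 4 the container name.
echo "$ps_output" | awk '$3 == "/watchtower" && $4 != "watchtower" {print $1}'

# Live equivalent (assumed formatting; --format takes a Go template):
#   docker ps --format '{{.ID}} {{.Image}} {{.Command}} {{.Names}}' \
#     | awk '$3 ~ /watchtower/ && $NF != "watchtower" {print $1}' \
#     | xargs -r docker stop
```

Swap the final `docker stop` for `docker rm --force` to remove them in one pass.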

Mine looks like tankmann’s. Been up 20 hours now. Something glitched out.

My home box has crawled to a stand still with this bug!

It’s taking an age to stop all these containers, even using the script listed above!

Must be an easier way to kill all these containers and stop them from restarting…

Things like this are why I’ve never been a huge fan of Docker and containers in general!
They are fine, up to the point they run amok; then they are a total disaster to try to get back under control.

I have now stopped the old one and the newly spawned one, and deleted both.
Then started a new one with this command, as per above:

sudo docker run -d --restart=always --name watchtower -v /var/run/docker.sock:/var/run/docker.sock storjlabs/watchtower storagenode watchtower --stop-timeout 300s --interval 21600 --cleanup

Now it’s looking good again.

What I also realized is that the docker process itself had captured the logs, but when I ran
sudo docker logs --tail 50 storagenode

it stopped showing anything new, and the last timestamp matches when the second watchtower process started… The docker process was also using 4 GB of RAM, so I stopped and restarted it… now it’s good.

Suggested edit to the “Watchtower” command:
change:
'--restart=always'
to
'--restart=unless-stopped'

Would that help with this issue?

With my slow system it was stopping one container just as it was restarting a previously stopped one (one crashed during restart and caused the “stop” script above to hang). It took running a command to change all the containers to “--restart=unless-stopped” a few times, with reboots, to kill all the rogue containers.
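The restart-policy change described above does not need a container recreate: `docker update --restart` can flip it on running containers. A small sketch that composes the command for review instead of executing it, so it runs without a daemon (the helper function name and the container IDs, taken from the listing earlier in the thread, are illustrative):

```shell
#!/bin/sh
# Sketch: build the docker update call that switches a set of containers
# to --restart=unless-stopped, printing it for review rather than running it.
build_restart_update() {
    policy="$1"
    shift
    printf 'docker update --restart=%s %s\n' "$policy" "$*"
}

build_restart_update unless-stopped e4fc4e9562b8 6ca2e4ebbeb2

# Live one-shot against every container (assumes a running daemon):
#   docker ps -aq | xargs -r docker update --restart=unless-stopped
```

With the policy set to unless-stopped, a plain `docker stop` sticks, so stopped rogue containers no longer come back after a daemon restart or reboot.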

We already have the correct answer to this issue:

sudo docker run -d --restart=always --name watchtower -v /var/run/docker.sock:/var/run/docker.sock storjlabs/watchtower storagenode watchtower --stop-timeout 300s --interval 21600 --cleanup

It’s the --cleanup

I have 382 copies of watchtower running ^-^ Load average is above 500. (edit- woo, over 1000 load avg now!)

Definitely will be adding --cleanup.

This flag has nothing to do with the self-copying. It cleans up previous versions of images, not the running containers.
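To see the distinction concretely: what --cleanup automates is image cleanup, and superseded images show up as `<none>` (dangling) in `docker images`. A sketch of the manual equivalent, filtered here over a made-up example listing (the canned variable and image IDs are illustrative, not from the thread):

```shell
#!/bin/sh
# Sketch: --cleanup removes superseded *images*; running containers are
# untouched, which is why it cannot stop duplicate watchtower containers.
# Dangling images appear as "<none>" in docker images; print their IDs.
images='REPOSITORY TAG IMAGE_ID
storjlabs/watchtower latest 7b387a44c7a1
<none> <none> deadbeef0123'

echo "$images" | awk '$1 == "<none>" {print $3}'

# Live equivalent: docker image prune
# (or: docker images -f dangling=true -q | xargs -r docker rmi)
```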

I got the same behavior and used a similar script today:

docker ps | grep watchtower | awk '{print "docker rm --force", $1}' | bash

However, I think it’s a glitch in watchtower itself, which was fixed recently.


Exactly, it’s not related to the “watchtower multi-copy” issue (I split it into two messages), but I think it’s still important, because there is no reason to keep old images on the system.

Where do you see the fix? The upstream watchtower was last updated 18 days ago, though there was a patch to the Storj watchtower; maybe renaming it caused the explosions?

This issue occurs only when watchtower updates itself (when watchtower updates storagenode there are no issues).

What is going on?


Check this post