Watchtower spawning copies of itself

Maybe. I changed nothing in the watchtower command and it has been running for 22 hours already without issues.

Maaaaaan. I can’t get my docker to clean up. docker kill and docker rm -f don’t work because once the service is started it starves the CPU for threads.

@Alexey So do we have an official answer about what we should do about duplicate watchtowers?

I have a duplicate on each node, but not dozens like others.
Do I stop and remove the oldest one, which has an odd name, and keep the one with storagenode in its name?

Just kill them all and run the new one with this guide: https://documentation.storj.io/setup/software-updates#automatic-updates

I also had two nodes totally taken out by this. The nodes were so overrun by watchtower instances that they stopped responding. Docker also crashed, so running docker commands failed with an error opening the Docker unix socket.

I was able to recover them manually but it took a long time and my uptime rep is pretty much destroyed for the time being. I also had to yank power a few times so I’m hoping I don’t have corrupted data at this point…

To resolve, I had to:

  • Reboot the node
  • cd to my /var/lib/docker/containers directory
  • grep for “watchtower” in each container subdir in order to find all the container IDs
  • create a “docker stop” command with every container ID as an argument
  • Reboot the node again to get Docker running (attempts at restarting the service hung the box)
  • As soon as containers start deploying, run the “docker stop” command with all the container IDs from above
  • Repeat the process until I got lucky enough to have containers begin stopping before my system ran out of resources and crashed Docker again
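
For anyone repeating the steps above, the grep step can be sketched as a small shell function. This is only a sketch under assumptions: find_watchtower_ids is a hypothetical helper name, and it assumes each container subdirectory holds a config.v2.json file and that the subdirectory name is the container ID.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Docker or Storj): print the IDs of
# containers whose stored config mentions "watchtower".
# Assumption: each subdirectory of the containers directory is named after
# the container ID and contains a config.v2.json file.
find_watchtower_ids() {
  local root="${1:-/var/lib/docker/containers}"
  local cfg
  for cfg in "$root"/*/config.v2.json; do
    # Skip the unexpanded glob when the directory is empty or unreadable.
    [ -f "$cfg" ] || continue
    if grep -q "watchtower" "$cfg"; then
      # The parent directory name is the container ID.
      basename "$(dirname "$cfg")"
    fi
  done
}
```

Run as root (the containers directory is usually not readable otherwise), the printed IDs can be pasted into a single docker stop command. Review the list first, since any container whose config mentions watchtower will match.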

I’m not sure I trust the use of storj watchtower any longer…

How do I kill/stop this?

@Alexey my issue has nothing to do with Synology.

@Alexey if you can’t say anything that helps, don’t say anything.
This post of yours is useless. Everyone knows where to find the original documentation, and everyone knows that there are no instructions there for killing these processes. Everyone also understands that all these processes should be killed.

So why do you spam???

There is nothing useful in your post. No information on how to kill thousands of processes, but thank you for the link, which does not help either.

Why don’t you read a few more messages back…

Because I’ve tried this already.
And my issue is NOT on Synology.

“docker ps | grep watchtower | awk ‘{print “docker rm --force”, $1}’ | bash
awk: cmd. line:1: ‘{print
awk: cmd. line:1: ^ invalid char ‘?’ in expression”
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

This is happening on all systems, not only Synology, so we are trying to find a good solution for this issue.
Your system is affected by this issue too.
Please replace all curly quotes with straight ones, or copy the command from here:

sudo service docker stop
sudo service docker start
docker ps | grep watchtower | awk '{hs = hs" "$1} END{print "docker rm --force", hs}' | bash

If you use sudo to run the container, then add sudo before docker.
If the first command throws an error, then use these instead:

sudo service snap.docker.dockerd stop
sudo service snap.docker.dockerd start
docker ps -a | grep watchtower | awk '{hs = hs" "$1} END{print "docker rm --force", hs}' | bash

You may need to try it a few times until it succeeds.
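
To preview what the one-liner will run before piping it into bash, the grep and awk stages can be exercised on their own. A small sketch, where build_rm_command is a hypothetical wrapper name and the sample lines are made up for illustration rather than real docker ps output:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the grep/awk stages of the one-liner above.
# Reads `docker ps`-style lines on stdin and prints one
# `docker rm --force <id> <id> ...` command instead of executing it.
build_rm_command() {
  grep watchtower | awk '{hs = hs" "$1} END{print "docker rm --force", hs}'
}

# Made-up sample lines standing in for real `docker ps` output.
printf '%s\n' \
  'abc123 storjlabs/watchtower "/watchtower storagenode"' \
  'def456 storjlabs/storagenode:beta "/entrypoint"' \
  'fff789 storjlabs/watchtower "/watchtower storagenode"' \
  | build_rm_command
```

On a live system, docker ps -a | build_rm_command prints the command without running anything; append | bash once the list of IDs looks right. Note that if no line matches, the awk stage still prints a bare docker rm --force with no IDs, which docker will reject.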

I can confirm that this works on Ubuntu.

Then post your problem like you just did and don’t get unfriendly with people here. They are definitely not trying to spam but to help.

It might be a bug in watchtower 0.3.9. We have pushed 0.3.8. I hope that will stop it for now.

I’m sorry, but….

I have seen lots of people, including myself, losing nodes because of docker.
Most of the problems I had were with docker. And now, docker again.
I can’t tell anymore whether this project is for fun or somebody is trying to make a real thing. If it’s a joke, then it has wasted too much time. If this is not a joke, why the hell is Storj still using “docker”?

I owned 13 nodes; 5 of them died because of docker issues.
My nodes are dedicated Linux servers that run on fast fibre optic internet (the slowest is on 300 Mbps), currently in 8 different geographical locations around the country, with RAIDs, UPSes, 24/7 monitoring, etc. And I was about to start two more, one in the USA and the other in Scandinavia.

Today I’m thinking maybe it would be best to switch them off altogether, as this is not funny anymore. I can run 12 servers for fun, I can spend my money on electricity for fun, and I can spend some of my time for fun. I can fix something when it breaks, can fix it three, four, nine times for fun. I can even understand that Storj can’t currently pay even the amount that covers electricity. All this is acceptable to me.

But I can’t understand why you use this crashing, glitching docker with so many problems and so much instability… People are losing nodes, losing reputation, losing time and money, and now I’m beginning to lose hope for this project as well…

As I have a lot of proper hardware up and running, a couple of days ago I started to read about SiaCoin, FileCoin, MaidSAFE, etc. Not because I would like to (I don’t even have time for it), but because it is painful to see myself putting effort into this, putting in my soul, my time, money, and effort trying to make the world better and decentralised, and regularly coming BACK to never-ending problems with docker….

I believe that those SNOs who just have one workstation, one HDD, one NODE probably don’t care if something goes wrong. But when you have ~10 of them, set up as properly as possible, maintained as well as possible, and regularly lose them because of…. docker…. come on…

My node encountered problems yesterday. At this time it has been offline for nearly 24 hours, and I don’t care whether it lives or dies… I can’t quit my main job and spend all my time dealing with docker issues again and again and again….

Well it is still in “BETA”, what do you expect?

The whole point of an Alpha/Beta is to find and work out as many of the bugs/kinks in the system as possible before it goes into “production”. You knew that when you signed up.

That’s all OK and all understandable.

BUT if this is Alpha/Beta - DO NOT kill nodes. This is not OUR fault…

You can send an email to support and see if they can do something.

Actually, I received an e-mail from support on their own initiative. They were asking what happened to the node and how they can help.

So they switched off my node because of glitching docker and then sent me an e-mail asking “how can we help you?”! :expressionless:

The switching off of nodes is probably an automatic thing.
And it looks like turning them back on is a manual one at present.

Sorry, I don’t get the problem. Why don’t you just fix the problem and start your storage node again? Disqualification should not happen in the current situation. Let me know if it does and I will fix it for you. If you managed to corrupt your database, we should be able to fix that as well. So there is no reason to rage quit.

Yes it is Alpha/Beta and that usually means that sometimes manual steps are needed to fix a problem. That is the only impact the current situation should have.
