Nodes Offline!?

Like in everything. You have to understand the gist of what they say, not what they say… :smiley:

storj doesn’t use ipv6 so you can remove those… maybe they interfere in some way, i dunno… if nothing else they’re not required, so they just take up space…

i don’t get why all the nodes aren’t working tho when the first one is
are you sure it’s not a character formatting issue? sometimes pasted text can differ from what the cli reads due to formatting, so be sure you copy the commands from a … whats it called…

let’s just call it plain text… think it has like a 3 letter abbreviation with the number 20 in it.
but i cannot find that right now.

When I reset the firewall they all work. When I enable the firewall only 1 and 4 work. 2 and 3 do not work. Then I enable their ports again and they work. Then after some time all do not work. Then this, then that… Oh! See my point. Even I got confused.

No need. They do not interfere. STORJ may eventually start supporting IPv6 again, so I do a double setup.

Hmm… I think I managed to make it work after running all the iptables rules just now. Maybe I will make them not run immediately at boot, but with a sleep in a separate script that runs at boot via cron. Will try that and let you know. This could be because, if the rules execute before routing is done, some nodes’ connections are not considered established by the rules. :grinning: Another way around it: I will possibly also modify the Default-Start trigger of the firewall script, which runs from init at boot, to see if that helps.
# Default-Start: 2 3 4 5 . Maybe runlevel 2 here is in the way, because it is Multi-User Mode and does not configure network interfaces or start daemons.
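A minimal sketch of that delayed-start idea, assuming the rules were previously saved with `iptables-save` to a file; the script path, the rules file location, and the 60-second delay are made-up values, not taken from the actual setup:

```shell
#!/bin/sh
# /usr/local/sbin/firewall-delayed.sh (hypothetical path)
# Give the network stack time to finish bringing up interfaces
# and routing before the ruleset is loaded, so in-flight node
# traffic is matched as established correctly.
sleep 60
iptables-restore < /etc/iptables/rules.v4

# Installed in root's crontab (crontab -e as root) with:
# @reboot /usr/local/sbin/firewall-delayed.sh
```

Since this is a boot-time fragment that must run as root, it is shown here only as a sketch.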

i simply use this, it takes care of making it persistent
sudo apt-get install iptables-persistent

and then i use this to update them.
iptables-save > /etc/iptables/rules.v4
ip6tables-save > /etc/iptables/rules.v6
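For what it’s worth, once `iptables-persistent` is installed, its plugin can also write both files in one go (Debian/Ubuntu); a small sketch:

```shell
# Writes /etc/iptables/rules.v4 and rules.v6 from the live rules;
# the netfilter-persistent service then restores them at boot,
# so no cron trick is needed.
sudo netfilter-persistent save
```

Shown as a config-management fragment only, since it needs root and a Debian-family system.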

but yeah, running it at boot can cause issues, i forget why… but like you say, it might be because it runs as a system user rather than root / sudoer.

I do not want to make it persistent, unless I am sure it works.

Maybe this was it. I think I made it work, but I did so many things to do it and now I do not know why it works. :rofl:

and this is why it’s usually a good approach to change one thing at a time.
ofc that makes stuff take longer, and if one doesn’t keep a log, one might not remember what the solution was when it’s needed a year or two later :smiley:

i actually started to keep a sort of changelog, or want to … for my server, so that i can revert and remember what changes i made if i run into problems

recently had a month with poor uptime because my docker storage driver was changed, and the issue didn’t show up until 2-3 weeks later, by which point i had happily forgotten that i ever changed it.
took way too many hours over a month to figure out what was going on.


I know, I know, but with so many things to do… LOL. Now one of the nodes got screwed. Come on! WTF!? This should be a set and forget thing…

you don’t have some other firewall somewhere which might affect something…
some cloud hosts have dynamic firewalls with ddos protection which will turn on and off by themselves…

if the iptables rules are set, they shouldn’t change unless you have other services affecting that kind of stuff. what distro are you using?

It is not cloud. It is bare metal. I do not have other firewalls, I think. LOL

It is for me :slight_smile: Has been this way for years now.

I think that simplicity is key in my case: no docker, no RAID, no HBA, no fancy networking, no static external IP. Just standard fare enduser grade stuff, except for enterprise class HDDs.


I agree, my HC2 has been running for almost 2 years now and I barely even check the dashboard nowadays. However, I did spin up a few new nodes in proxmox and they did take a little more work to set up.
It makes sense that when you add complexity to a system you need to figure out more stuff; at the end of the day it’s some people’s real job.


try changing the dashboard port. you have the same port for all nodes, “14002:14002”. change to 14003:14002, …

and you have domain “” ??

dns does not resolve
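The port suggestion above would look something like this in the `docker run` lines; the node names are examples, and the trailing options of the real commands (mounts, environment variables, etc.) are left out:

```shell
# Each node publishes a unique host-side port onto the same
# container-side dashboard port 14002, so the dashboards do
# not collide with each other.
docker run -d --name storagenode1 -p 14002:14002 ... storjlabs/storagenode
docker run -d --name storagenode2 -p 14003:14002 ... storjlabs/storagenode
```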


This does not help. I tried. I use DNS because there is also an AAAA record for IPv6 - just in case - a double setup. It works on other nodes. I think I should start the entire setup from scratch, even the OS and all…

What happened this afternoon when I checked on the nodes - they were all offline and returning the same error - ERROR contact:service ping satellite failed.
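That error can at least be isolated quickly from the logs with grep. The self-contained sketch below uses a fake log line via `printf` where you would normally pipe in `docker logs <container>`; the timestamp is made up:

```shell
# Stand-in for:
#   docker logs storagenode1 2>&1 | grep -c 'ping satellite failed'
# grep -c counts how many log lines match the error.
printf '2024-05-01T12:00:00Z ERROR contact:service ping satellite failed\n' \
  | grep -c 'ping satellite failed'
# prints 1
```

A non-zero count on the real logs confirms the satellites cannot reach the node back, which usually points at the firewall or port forwarding rather than the node itself.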

Great! Simply great! All 4 nodes on the machine got this

Support seems to be very helpful. Feel the irony? Now I have a reason to start from scratch.

Suspension isn’t permanent and can be recovered from, although if these are new nodes it would probably be easiest to start over. You say support isn’t being helpful. Did you file a support ticket?

This project is supposed to be, and can be, set and forget. I have been running one node for 2 years and another for 3 and I barely need to look at it. Unfortunately like @twl and @TheMightyGreek said, adding complexity to the system gives the possibility for problems. And the setup you are trying to achieve isn’t exactly a typical user setup.

Although on the bright side, I think if all of your nodes got the suspension warning, it means all of your nodes should have been able to check in with the satellites. Are some of them still showing offline?

I would like to point out that what it seems you need help with is your own network setup and firewall. While you will probably still get some help for this, it is entirely outside the scope of the Storj software itself. And of course the specific help you need is highly setup-dependent. I would love to help you out as well, but I have little experience with iptables at all and thus have been quiet so far. Please keep in mind that Storj Labs is not responsible for complications with a specific setup you chose.

For me it really was pretty much set it and forget it. Simple case of port forward in my router and firewall exception on my NAS. I hope you figure it out though.


I would love to help you out as well, but I have no idea what you are trying to achieve.
My first suggestion would be to get ONE node running and after that add the second, the third, the fourth to reduce complexity and make error tracing easier.


Yes. New nodes. Each had like 3 GB of data. Starting over…

No complexity at all:
1 Ubuntu server, 1 network card with 4 ports, 4 different /24 subnets with 1 usable IP, routing done, firewall set, ports open, docker installed, node identities generated, nodes started… FAIL. :smiley:
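Not sure if it applies to this exact setup, but with four public IPs on one box, a common pitfall in the "routing done" step is reply packets leaving via the default route instead of the interface they came in on; source-based policy routing per interface is the usual cure. A sketch with made-up addresses, table name, and interface:

```shell
# Hypothetical example for one of the four interfaces (eth1);
# repeat per interface with its own table and addresses.
echo '101 storj1' >> /etc/iproute2/rt_tables   # register a routing table name
ip route add default via 203.0.113.1 dev eth1 table storj1   # table's gateway
ip rule add from 203.0.113.10/32 table storj1  # replies from this IP use it
```

Shown only as a root-level config fragment; if replies are already going out the right interfaces, this does not apply.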

Showing online, but the satellites could not connect. The reason for the disqualification is most probably the nodes going online/offline like crazy all the time while I was trying to fix them all at once.

Thanks! Appreciated!

Exactly what I am doing… Will re-try… The only thing I did at the same time was generating the identities at once in 4 separate ssh sessions. Then, when they were all done, I started bringing each node up, one at a time. Hope this time it works. I won’t reinstall the system; it is working perfectly fine, I guess. If this fails again, I will try to reinstall the OS, too.