Node goes offline randomly, and greetings

Good evening,
I’m Giak and I’m from Italy. If there is a dedicated topic for greetings, I’d be glad to write there.

My problem is that my node randomly goes offline, so I need to restart it to get it working again.
It happens about every 100 hours of node uptime (I run it on a Raspberry Pi 4).
I’m using a VPN because my ISP uses NAT, so I wouldn’t be reachable without it.
I’m also using DDNS because I don’t have a static IP.
Port forwarding works correctly. Is there anything else I could check?

Thanks

Welcome to the forum @Giak!

How is your HDD connected?

Hello!
It’s connected via USB 3.0. It’s a 2 TB Toshiba mechanical hard drive (I have 1.5 TB dedicated to Storj).

Can you tell me its model name/number?

This is in regards to :arrow_down:

Sorry, it’s a WD Elements, model: WDBU6Y0020BBK-WESN

Western Digital has said that their WD Red to WD Black HDDs use SMR, so I think the WD Elements is not one of them.

What does your log show when it goes offline?

https://documentation.storj.io/resources/faq/check-logs

Next time it goes offline (I hope not :D) I’ll let you know.

This is why it’s advisable to redirect your logs.
https://documentation.storj.io/resources/faq/redirect-logs

Is your USB HDD mounted statically via fstab?

@buchette yes, I confirm.
@nerdatwork I tried to edit the row like this:

log.output: "/home/pi/node.log"

but when I save it, the dashboard GUI stops working and the Docker container keeps restarting.

The row for the log is probably wrong, but I can’t find the error (I commented it out with # again and now everything works).

That’s wrong. Do exactly as shown in the link, and the row should look like this:

log.output: "/app/config/node.log"

Then restart the node. It will create node.log.

Oh, I thought I could change the path as I wished.
Anyway, I’ve now enabled the log and everything works correctly.
As I get more information I’ll let you know, thanks!

Good morning, I just updated my node to version 1.12.13. It worked well for a few minutes and now I see these errors:

2020-09-26T07:38:04.218Z ERROR orders listing orders {"error": "order: unexpected EOF", "errorVerbose": "order: unexpected EOF\n\tstorj.io/storj/storagenode/orders.readOrder:555\n\tstorj.io/storj/storagenode/orders.(*FileStore).ListUnsentBySatellite.func1:245\n\tpath/filepath.walk:360\n\tpath/filepath.walk:384\n\tpath/filepath.Walk:406\n\tstorj.io/storj/storagenode/orders.(*FileStore).ListUnsentBySatellite:196\n\tstorj.io/storj/storagenode/orders.(*Service).sendOrdersFromFileStore:398\n\tstorj.io/storj/storagenode/orders.(*Service).SendOrders:192\n\tstorj.io/storj/storagenode/orders.(*Service).Run.func1:139\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2020-09-26T07:38:13.426Z ERROR contact:service ping satellite failed {"Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "attempts": 8, "error": "ping satellite error: rpccompat: dial tcp 78.94.240.189:7777: connect: connection refused", "errorVerbose": "ping satellite error: rpccompat: dial tcp 78.94.240.189:7777: connect: connection refused\n\tstorj.io/common/rpc.Dialer.dialTransport:211\n\tstorj.io/common/rpc.Dialer.dial:188\n\tstorj.io/common/rpc.Dialer.DialNodeURL:148\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:124\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

I haven’t had any contact for 30 minutes, but the node is still online.

Ignore the error for 118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW, as this satellite has been shut down.
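If scanning the log by eye gets tedious, something like the following Python sketch could hide the errors for the shut-down satellite and show only the rest. It assumes the log format seen in the lines above (timestamp, then level, then message); the function name `relevant_errors` and the file name in the usage comment are made up for illustration, not part of any Storj tooling.

```python
# Satellite ID taken from the post above; errors mentioning it can be ignored.
SHUT_DOWN_SATELLITE = "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW"


def relevant_errors(lines, ignored_ids):
    """Return ERROR-level lines that mention none of the ignored satellite IDs."""
    kept = []
    for line in lines:
        # Assumed layout: "<timestamp> <LEVEL> <rest of message>"
        parts = line.split(None, 2)
        if len(parts) < 2 or parts[1] != "ERROR":
            continue
        if any(sat_id in line for sat_id in ignored_ids):
            continue
        kept.append(line.rstrip("\n"))
    return kept


# Hypothetical usage, assuming the log was redirected to a file named node.log:
#
#     with open("node.log") as log_file:
#         for entry in relevant_errors(log_file, [SHUT_DOWN_SATELLITE]):
#             print(entry)
```

Matching on the level field rather than on the word "ERROR" anywhere in the line avoids picking up INFO lines whose message happens to contain that word.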

Go to your orders/unsent folder and remove the oldest 5 files, which should have nearly the same timestamp. These 5 files represent orders for 5 satellites. You can move them to another folder instead if you want. Then restart your node. Watch the log to make sure you see orders being sent, which will confirm the issue is fixed.
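The step above (setting aside the oldest files instead of deleting them) can be sketched in Python roughly like this. It is only an illustration: the function name `move_oldest_files` and both paths in the usage comment are placeholders, and "oldest" is assumed to mean oldest by modification time.

```python
import shutil
from pathlib import Path


def move_oldest_files(src_dir, backup_dir, count=5):
    """Move the `count` oldest regular files (by mtime) from src_dir to backup_dir."""
    src = Path(src_dir)
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)

    # Sort by modification time; the file name breaks ties deterministically.
    files = sorted(
        (p for p in src.iterdir() if p.is_file()),
        key=lambda p: (p.stat().st_mtime, p.name),
    )

    moved = []
    for path in files[:count]:
        target = backup / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


# Hypothetical usage; replace both placeholder paths with your own:
#
#     move_oldest_files("<storage-dir>/orders/unsent", "<backup-dir>")
```

Moving rather than deleting keeps the files around in case they are needed later; the node still has to be restarted afterwards, as described above.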

Here is a reference thread:

Great, everything is fixed, thanks!

Hello,
My node went offline (I have now reconnected to the VPN and restarted the node, and it’s online again).
The last message I see in the log is:

2020-09-27T12:20:49.800Z INFO contact:service retries timed out for this cycle

and some pings to satellites that fail.

This is probably related to the shut-down stefan-benten satellite, which you can confirm by checking for the satellite ID: 118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW. You can ignore errors related to this satellite, so your problem must be something else. I would double-check that your DDNS updater is working properly. If you use No-IP, the account must be verified manually once per month, I believe.
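One way to double-check the DDNS side is to compare what the hostname resolves to against the IP it is supposed to point at. Below is a minimal Python sketch of that comparison. The hostname and IP in the usage comment are placeholders, `ddns_points_to` is a made-up helper name, and the expected IP (for example the VPN server's current public address) has to be found out separately.

```python
import socket


def ddns_points_to(hostname, expected_ip, resolve=socket.gethostbyname):
    """Return True if `hostname` currently resolves to `expected_ip` (IPv4 only).

    `resolve` can be swapped out for testing; by default it does a real
    DNS lookup through the system resolver.
    """
    try:
        resolved = resolve(hostname)
    except OSError:
        # Covers socket.gaierror, i.e. the name could not be resolved at all.
        return False
    return resolved == expected_ip


# Hypothetical usage with placeholder values:
#
#     if not ddns_points_to("mynode.example.com", "203.0.113.7"):
#         print("DDNS record is stale or not resolving")
```

Running a check like this on a schedule would only show that the record went stale; it does not fix the updater itself.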

In addition to ^ you can check this thread too

@baker yes, it’s that satellite; in fact I’m ignoring those errors.
@nerdatwork I took a look. My problem is that my IP is behind NAT, so without DDNS I can’t use my node.
Next time I’ll try running a new node directly with the VPN server’s IP (even though it is dynamic); as long as my connection stays up it doesn’t change, so I’ll give it a try.

I found the problem: DUC wasn’t working properly, so when my VPN server changed its IP address, the connection stopped working.
I ran some tests and the DDNS hostname always points to the right IP now.
You can close this thread.
