Node is erroring and restarting

Hello all, I just noticed my node wasn’t displaying any stats on the dashboard, so I checked the logs. Everything looked fine: several uploads and downloads in progress, no errors, and an uptime of 600-something hours since the last update.

So I restarted the node and now it keeps restarting every minute or so, after these error messages appear.

2020-06-28T18:03:31.129Z	ERROR	piecestore:cache	error getting current space used calculation: 		{"error": "lstat config/storage/blobs/ukfu6tapb5d2r2tsmjbhbboxilvt7jrwlqk7y2tapb5d2r2tsmjtapb5d2r2tsmj2sjxvqaaaaaa/3a/jeq5drdxuc7yzsvaccvvc3qsldz7ccyp2alheykjt7chv3iraq.sj1: input/output error"}
2020-06-28T18:03:31.130Z	ERROR	nodestats:cache		Get disk space usage query failed			        {"error": "context canceled"},
2020-06-28T18:03:31.130Z	ERROR	nodestats:cache		Get held amount query failed				        {"error": "context canceled"},
Error: lstat config/storage/blobs/ukfu6tapb5d2r2tsmjbhbboxilvt7jrwlqk7y2tapb5d2r2tsmjtapb5d2r2tsmj2sjxvqaaaaaa/3a/jeq5drdxuc7yzsvaccvvc3qsldz7ccyp2alheykjt7chv3iraq.sj1: input/output error

Does anyone know what these errors mean and how to resolve them?

thanks :slight_smile:

Hello @Jeeaaasus,
Welcome to the forum!

The reason is

input/output error

Please stop and remove your storagenode container and check your disk for errors.
You can start by making sure that it’s still connected:

df --si
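As a rough sketch of that first check, the helper below confirms that a storage directory exists, that `df` can report on its filesystem, and that a small probe file can be written and removed. The function name `check_storage` and the probe file name are made up for illustration, and it uses the POSIX `df -P` form rather than `df --si` only so it runs on more systems. Passing this check does not prove the disk is healthy; it only rules out a missing or read-only mount.

```shell
#!/bin/sh
# Hypothetical helper (not part of storagenode): basic sanity check of a storage path.
check_storage() {
    path="$1"
    # The directory must exist; a missing mount often shows up as a missing path.
    [ -d "$path" ] || { echo "missing: $path"; return 1; }
    # df must be able to report the filesystem that holds the path.
    df -P "$path" || { echo "df failed: $path"; return 1; }
    # Try to create and remove a small probe file to confirm the path is writable.
    probe="$path/.write_probe.$$"
    if { echo probe > "$probe"; } 2>/dev/null && rm -f "$probe"; then
        echo "writable: $path"
    else
        echo "not writable: $path"
        return 1
    fi
}
```

It would be called with the host directory that backs the container's storage, for example `check_storage /path/to/storage`, where the path is a placeholder for your own mount.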

Hello @Alexey :slight_smile:

Yeah, the disk is mounted and seems fine; I can write files to it. It’s only 55% full. Also, in the minute before the node restarts, it does get some uploads and downloads, which all finish with no errors.

Could this maybe be a corrupted file? What is the file referenced in the error message?

config/storage/blobs/ukfu6tapb5d2r2tsmjbhbboxilvt7jrwlqk7y2tapb5d2r2tsmjtapb5d2r2tsmj2sjxvqaaaaaa/3a/jeq5drdxuc7yzsvaccvvc3qsldz7ccyp2alheykjt7chv3iraq.sj1

Just an ordinary piece?

Yes, it’s one of the pieces.
I would recommend checking your disk for errors. This corruption could be the first sign that the disk is failing.
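One way to see how many pieces are affected is to read every file under the blobs directory and list the ones that fail. The sketch below does that with `find` and `cat`; the function name `find_unreadable` is an assumption for illustration. It does not repair anything and is not a substitute for a proper filesystem check. It assumes file names contain no newlines, which should hold for piece files like the one in the log.

```shell
#!/bin/sh
# Hypothetical helper: print every regular file under a directory that cannot be read in full.
find_unreadable() {
    dir="$1"
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 2; }
    # Reading each file end to end forces the disk to return every block,
    # so a file stored on bad sectors fails here (for example with an I/O error).
    find "$dir" -type f | while IFS= read -r f; do
        cat -- "$f" > /dev/null 2>&1 || printf '%s\n' "$f"
    done
}
```

Run it against the host path that maps to `config/storage/blobs`, for example `find_unreadable /path/to/storage/blobs`, with the node stopped. On a large node this reads all stored data, so it can take a long time.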

Any specific tool you’d recommend?

6 files got fixed and the node no longer errors or restarts.

It took 30 minutes before the stats showed up on the dashboard, but maybe that’s expected.

Thanks @Alexey :slight_smile:
