Fatal Error on my Node

Also, are the node's DBs on the data disk or on an SSD? You may consider moving them to an SSD; this helps a lot.

Maybe I don't have these problems because all my DBs are on SSD, so only data goes in and out of the disk. I also usually have around 1 GB of RAM per node, or even more. Windows started lagging at some point with only 8 GB, and when I added 8 GB more it worked much faster.
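If you want to try moving the DBs, the node has a config option for that. A minimal sketch of the relevant line in config.yaml, assuming a hypothetical SSD folder D:\storagenode-dbs (create the folder, move the existing *.db files into it while the service is stopped, then restart the service):

storage2.database-dir: D:\storagenode-dbs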


The 6TB drive is:

BRAND: WD
TYPE: SATA/600
MODEL: WDC WD6003FRYZ-01F0DB0 6TB
SPEED: 7200RPM

So it is a CMR disk, which is good. What about the databases, are they on the HDD?
If possible, try to add more RAM.

The databases are on the SSD to speed up performance.

Right now I don't have any more RAM sticks to add to it.

What is the motherboard model? And is the drive connected to the motherboard or to an additional card?

There were ideas like that already; they can help a lot on Windows, with careful setup. Less so on Linux, since Windows has much bigger overhead for file system operations (edit: better link here).

Though, it should be possible to do even better than that. At some point, as part of my experiments, I was testing an approach based on append-only blob stores, where deletes were handled by punching holes (potentially with FALLOC_FL_COLLAPSE_RANGE, though that would be just cosmetics). It was unbelievably fast, like ~5-10 times faster than regular ext4 operations. It would take a lot of effort to code it and make sure it doesn't lose data, it would work only on modern Linux, and only on some specific file systems. So, plenty of caveats for something that is not necessary now.
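To illustrate the hole-punching idea (this is not the experiment's actual code): a minimal Go sketch, assuming Linux and the golang.org/x/sys/unix package, that "deletes" a piece from an append-only blob file by deallocating its byte range:

package main

import (
	"log"
	"os"

	"golang.org/x/sys/unix"
)

// punchHole deallocates [offset, offset+length) inside the blob file,
// so the file system frees those blocks without rewriting the file.
func punchHole(f *os.File, offset, length int64) error {
	return unix.Fallocate(int(f.Fd()),
		unix.FALLOC_FL_PUNCH_HOLE|unix.FALLOC_FL_KEEP_SIZE,
		offset, length)
}

func main() {
	// blobs.dat is a hypothetical append-only blob store file.
	f, err := os.OpenFile("blobs.dat", os.O_RDWR, 0)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Pretend a deleted piece lived at bytes [4096, 4096+65536).
	if err := punchHole(f, 4096, 65536); err != nil {
		log.Fatal(err)
	}
}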


MB: CROSSHAIR V FORMULA-Z
CPU: AMD FX 9590
RAM: currently 8 GB, max 32 GB (formerly I had 32 GB, but I reused it for the other nodes)
RAM speed: Kingston 1833 MHz
SATA ports: 8
The node's HDD is connected via USB 3.0 and eSATA in a bay (link attached):

https://www.amazon.es/Fantec-QB-35US3R-pulgadas-supervelocidad-ventilador/dp/B003Q72F52

Previously it was in another, simpler single-disk 3.5-inch bay.

In the future I want to put the 32 GB of RAM back on the motherboard and then add 7 extra internal disks, with each disk being a node.

If you are asking whether I have tried other sticks: yes, I swapped 2 of them and nothing… the problem persists.

In earlier posts I mentioned another node of mine that is working correctly, without problems, on worse (older) hardware:

MB: ASUS EXTREME STRIKER II
CPU: Intel Core Quad 9650
RAM: 4 sticks of 2GB of memory (8GB) DDR3 1333 MHz
SATA ports: 6 SATA II
IDE ports: 2
OS HDD: WD2002FAEX-00MJRA0 2TB
Node HDD: WDC WD6003FRYZ-01F0DB0 6TB

Why does the node with worse hardware give me no problems, while the one with better hardware does? These are things one cannot understand…

Why not use one of the 8 SATA ports? USB is a problem with lots of small files.


Right now it is connected via eSATA, not USB. But tomorrow I can look into connecting it internally and testing it.

You can try using this in the config:

filestore.write-buffer-size: 2 MiB

It will use more RAM, but it will help make fewer small writes to the HDD; the default buffer is 128 KiB.
All my nodes have this.


OK, I have already made the changes you told me in the configuration file. Let's see what happens.

Currently, while idle, it is using 30% (2.4 GB) of the 8 GB of memory.

When the filewalker ends, it will have more resources for downloads. Or did you disable the filewalker?

I only disabled this, since the problem appeared:

storage2.piece-scan-on-startup: false

Disk usage is not as intense when the service is restarted; it quickly drops to 5-20% usage, but the node still crashes. So far the service has not hung since I made the modifications you indicated. Let's see how it evolves.

I think I have the same problem.

I have Windows 11, node version v1.75.2.

I have two nodes on the same PC. The larger node, which still has space available, is the one giving problems.

I can't open the storagenode.log file; it is 19 GB.
Is there a way to open it to look at it?
I have renamed the file and created a new storagenode.log file in case the node fails again.

The node stopped on 04/03/2023 at 23:04 and on 04/01/2023 at 5:48.

The event viewer shows “The Storj V3 Storage Node service was terminated unexpectedly. This has happened 3 times.”

There are no errors on the disk.

Some satellites show suspension. See the screenshot.

I have left the problematic node without free space, to see if I get more stopping problems.

Not surprising given the ingress that is currently happening. However, nodes should not die so easily.

Big HDDs will pretty soon become dominant in the network. With all this ingress, the smaller ones, if they are not full already, soon will be. Payouts have been cut and energy still costs a lot in most places, so it makes sense to buy bigger drives. Even if you have free energy, you will run out of ports sooner if you only buy small HDDs. The prices per TB are very attractive for bigger HDDs now. So for me it's clear that in at most 2 years, the only nodes with unoccupied space will be 16TB+.
So… the Storj team must focus on improving the daily life of nodes, with big HDDs in mind. If nodes start dying because they can't handle the traffic, the traffic that they and we all wanted, the network will die in a few years for sure. This must be top priority!
All enterprise HDDs can handle a 550TB/year workload (egress plus ingress). That is about 46TB/month, or 1.5TB/day, which is more than 10x what we have these days. Consumer HDDs can handle half of that, so still 5x. So egress and ingress are not the problem now, and maybe never will be, for HDDs. The problem is the other stuff the storagenode software is doing with the drives, all those checks and verifications and redundant operations. I can agree with the reasons behind them, but it's time to reconsider some and improve all of them. Or, if there are no solutions, adjust the recommended/required specs for new node setups, to make people aware of what they should use from now on.


In this way you simply turn a blind eye to the existing problem with your equipment and do not fix it.

Yes, use PowerShell (How do I check my logs? - Storj Docs) and filters like select -last 20, or pipe to sls, like:

Get-Content "C:\Program Files\Storj\Storage Node\storagenode.log" | sls "fatal error" | select -last 10
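For a 19 GB file it can also be much faster to read only the end instead of loading the whole file; Get-Content supports this via the -Tail parameter:

Get-Content "C:\Program Files\Storj\Storage Node\storagenode.log" -Tail 20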

You may also stop the storagenode service, either from the Services applet or from an elevated PowerShell:

Stop-Service storagenode

move "C:\Program Files\Storj\Storage Node\storagenode.log" to the archive and start the service back

Start-Service storagenode

Or set up a logrotate for Windows, or use PowerShell scripts like Native logs rotation in Windows with a simple PowerShell script, so you do not have to do it manually. A minimal sketch of such a rotation is below.
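A sketch of the manual rotation above as one script, assuming the default install path and a hypothetical archive folder D:\logs (both paths are examples; adjust them for your setup):

Stop-Service storagenode
$stamp = Get-Date -Format "yyyyMMdd-HHmmss"
Move-Item "C:\Program Files\Storj\Storage Node\storagenode.log" "D:\logs\storagenode-$stamp.log"
Start-Service storagenode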

At some point I started to use log.level: WARN, which logs only warnings and errors.
The logs just don't grow so fast any more, only when there are problems, so I decided I don't care about normal-usage logs. Reading the logs also became simpler, with less time spent searching for errors.
