I have fast internet; my hardware is the bottleneck. Any ideas how I could try to improve disk write speed in terms of Linux / Storj settings? Uploads lose the race: only a small percentage are successful, the rest are “upload failed”.
What hardware is it? Defragmentation? How full is the disk? What does the system resource monitor on Linux show? SMR drive? USB drive? What’s the ingress per day?
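Something like this should answer most of those questions on Linux (mount point and device names are just examples):

```bash
df -h /mnt/storj                       # how full is the disk (example mount point)
lsblk -d -o NAME,SIZE,ROTA,TRAN,MODEL  # ROTA=1 = spinning disk, TRAN shows usb vs sata;
                                       # check MODEL against the manufacturer's SMR lists
```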
First of all, format it with the right file system: ext4, ZFS or NTFS (and especially not FAT, exFAT or BTRFS).
Second: keep the databases not on your data drive but on your system drive, so the many IOPS of the databases don’t compete with your data (rough example commands for both points are sketched at the end of this post).
The rest is of only minor value.
Beyond that, it’s a matter of acceptance. As long as your node is growing by a few GB a day on average, I wouldn’t bother. Even my worst node on SMR was growing 5 GB a day on average. The biggest concern is, and was, that there is a big difference between disk usage and usage according to the Storj network, probably reflecting the slow filewalker process.
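In case it’s useful, here is a rough sketch of both points on Linux. All device names and paths are examples, and I’m assuming `storage2.database-dir` is the relevant option in `config.yaml` and a docker setup named `storagenode`:

```bash
# 1) format the data drive with ext4 (WARNING: wipes the partition; replace /dev/sdX1)
mkfs.ext4 -m 0 -L storj /dev/sdX1

# 2) move the databases off the data drive
docker stop -t 300 storagenode                  # stop the node first
mkdir -p /mnt/ssd/storj-dbs                     # example location on the system/SSD drive
cp /mnt/storj/storage/*.db /mnt/ssd/storj-dbs/  # example source path
# then in config.yaml:
#   storage2.database-dir: /mnt/ssd/storj-dbs
docker start storagenode
```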
Made a node yesterday on 1.80.1; the default is 4 MiB now.
This is unlikely to be related to slow hardware, because the storagenode can run even on a router:
The usage depends on the customers, not your hardware, unless you use network filesystems (NFS, SMB/CIFS, SSHFS, etc.) or SMR drives.
So what would be your suggestion? Why did my node start failing on most of the uploads?
P.S. At moments when the CPU iowait time drops, I see more successful uploads.
Keep your node online, of course
Then probably your disk is not able to keep up. There is not much you can do about it, except using a native filesystem for your OS - ext4 for Linux, NTFS for Windows.
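To see what filesystem the storage folder actually sits on (the path is an example):

```bash
findmnt -T /mnt/storj -o TARGET,SOURCE,FSTYPE,OPTIONS
```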
If you have more RAM, it usually starts to work better, because the OS will use most of the free RAM for cache.
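You can see how much RAM is currently going to the cache with:

```bash
free -h   # the "buff/cache" column is what the OS uses for file caching
```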
If you have a UPS and you use Windows, you may enable the write cache. This could help to increase the number of successful uploads. But if you do not have a UPS, enabling the write cache could lead to data loss in case of a power interruption.
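On Linux the rough equivalent would be the drive’s own write cache via hdparm; the same UPS warning applies, and /dev/sdX is a placeholder:

```bash
hdparm -W /dev/sdX    # query the current write-cache setting
hdparm -W1 /dev/sdX   # enable it (risk of data loss on power failure without a UPS)
```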
How about that suggestion of moving the DBs to another place? I understand it is an additional point of failure…
You should enable TCP Fast Open if you haven’t already. I’ve been losing so many races since TCP Fast Open was enabled on other nodes, because my setup doesn’t support it (Synology NAS).
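For reference, on a plain Linux host that’s one sysctl; for a docker node I believe it also has to be passed on the run command, since the container has its own network namespace (treat that part as an assumption about your setup):

```bash
sysctl -w net.ipv4.tcp_fastopen=3                                    # enable for client and server side
echo "net.ipv4.tcp_fastopen=3" > /etc/sysctl.d/99-tcp-fastopen.conf  # persist across reboots
# docker: add  --sysctl net.ipv4.tcp_fastopen=3  to the docker run command
```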
Lost races are a combination of slow hardware and high network latency between clients and your node.
This is not needed, unless you see a lot of errors like “database is locked”
You need to diagnose your specific bottleneck and remove it. Can you paste the output of iostat
as described here?
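i.e. something like this while the node is busy:

```bash
iostat -x 5   # from the sysstat package; extended per-device stats every 5 seconds
# a data disk sitting near 100 %util with long w_await would confirm the write bottleneck
```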
Not sure if it really worked, but it seems a little bit better after changing the scheduler from deadline to noop.
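In case anyone wants to try the same, that’s roughly (sdX is a placeholder; on newer multi-queue kernels the choices are `none`/`mq-deadline` instead):

```bash
cat /sys/block/sdX/queue/scheduler           # the active scheduler is shown in [brackets]
echo noop > /sys/block/sdX/queue/scheduler   # as root; not persistent across reboots
```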
Out of curiosity, is your setup capable of doing NCQ (native command queueing, a SATA subprotocol—would need support from both the controller and the drives)? It’s somewhat known that NCQ and the deadline scheduler don’t mesh well.
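A quick way to check whether NCQ is actually active (sdX is a placeholder):

```bash
dmesg | grep -i ncq                     # the kernel logs NCQ support when the SATA link comes up
cat /sys/block/sdX/device/queue_depth   # >1 (typically 31 or 32) means NCQ is in use
```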
Since the OP does not provide the OS kernel, hardware setup, or disk type, how should these “ancient texts” help?
I’m not sure why you ask me.
A great improvement in usage can only come from the customers’ side. Everything else is fine-tuning and removing bottlenecks, if you have them.
Somehow I misclicked.
I wanted to say that moving the DBs is indeed not necessary, but it helps a little bit with reducing the load.
Sorry, what do you mean?
This NCQ stuff is about 13 years old,
and @binary does not tell us his settings or hardware…
…so I can practically only say: just make a new node in the same /24 IP to get better performance.
Yeah, and it is still relevant.
LVM NVMe caching.
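A rough sketch of that, assuming the data LV is `vg0/storj` and the NVMe has already been added to `vg0` as a PV (names and sizes are examples):

```bash
# carve cache-data and cache-metadata LVs out of the NVMe
lvcreate -L 100G -n cache      vg0 /dev/nvme0n1p1
lvcreate -L 1G   -n cache_meta vg0 /dev/nvme0n1p1

# combine them into a cache pool and attach it to the data LV
lvconvert --type cache-pool --poolmetadata vg0/cache_meta vg0/cache
lvconvert --type cache --cachepool vg0/cache --cachemode writethrough vg0/storj
```

writethrough keeps the HDD copy authoritative, so losing the NVMe doesn’t lose node data; writeback would speed up writes more but adds risk.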
There is also a lot of RTT you can trim from your networking stack.
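For example (generic Linux tuning, my assumption of the kind of thing meant here, nothing Storj-specific): the fq qdisc plus BBR congestion control tends to reduce queuing delay on busy uplinks:

```bash
sysctl -w net.core.default_qdisc=fq
sysctl -w net.ipv4.tcp_congestion_control=bbr   # needs kernel 4.9+ with the tcp_bbr module
```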
One of the easiest ways is 1 node, 1 disk: if a node starts to fail, attempt to move it before it fails completely, and just run a bunch of smaller nodes.