However, considering what’s in it, I wouldn’t expect my disk to stall so badly that it cannot sustain Storj ingress, which in my experience never goes above a few MB/s.
Besides, afaik, since Storj doesn’t store 32 kB pieces (they are way bigger than that), it shouldn’t generate too many IOPS on the disk.
My SMR disk is a 2.5" 2TB model (Toshiba MQ04UBD200) though, so it may well behave very differently from the WD tested in the article.
I agree with this. While SMR drives are slow, they’re still faster than the upstream speed of my Internet connection. As long as the IOPS are optimised, an SMR drive should be able to handle Storj. There were certainly some major caching improvements in 1.3.3, but then my SMR drive failed and I’m still waiting for the replacement to arrive so I can continue testing.
My drive still can’t keep up with the latest version (1.5.2).
I had to reconfigure it again this morning so it no longer accepts any data, because sustained ingress was causing it to stall after a few tens of minutes.
You cannot make an SMR drive keep up with random reads/writes… Have you tried moving the databases to the SSD your OS is running from?
That should significantly improve the workload… otherwise you need a big SSD cache to turn the random writes into sequential writes…
Or you can use some kind of tiered storage solution, like Microsoft Storage Spaces.
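One way to do the database move on a Docker node (a sketch only; I’m not sure which storagenode release first exposed the storage2.database-dir option, so check storagenode setup --help or the comments in config.yaml first, and the paths below are placeholders): add an extra mount for an SSD-backed directory, e.g.

--mount type=bind,source=/mnt/ssd/storj-dbs,destination=/app/dbs

to the docker run command, and point the node at it with

storage2.database-dir: /app/dbs

in config.yaml. The .db files then live on the SSD while the blobs stay on the HDD; the existing databases need to be copied over while the node is stopped.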
The thing is running on an RPi, so moving the databases to the SD card doesn’t seem like a great idea, as it would probably kill the SD card fairly quickly. Or I’d have to add an additional storage device dedicated to that.
I think some SD cards are better suited for that than others… but I’m not 100% sure.
I believe it was after they became hugely popular as RPi OS drives that manufacturers started creating special higher-IOPS cards for that use case…
Of course, a regular old SSD might be better suited and cheaper…
Hopefully 1.6.3 will show an improvement on SMR HDDs. That release will move used_serials to RAM instead of a DB file. Since used serials and orders are, as far as I know, the only “per transfer” DB writes, this should effectively cut DB writes in half. It could be enough to fix the issue.
Hmmm, I didn’t think about that… it would be a nice, easy fix…
But I don’t have much faith in SMR drives; they are just so terribly slow for random writes that nothing really fixes it… I mean, 700 kB/s.
I can barely keep from laughing every time I think of that number…
It’s absurd…
Have you tried tuning the new write cache/aggregation option introduced in Storj v1.4?
There was already write caching before, and since v1.4 it is available for user tuning via the config file or a command-line option. It was discussed about a month ago in the corresponding GitHub issue: https://github.com/storj/storj/issues/3854#issuecomment-624522307
But it looks like none of the SNOs with SMR HDDs has tested it yet. You could be the first one.
The default value for the write buffer is 128 KiB (per data piece/upload). It would be interesting to see if increasing it to, say, 1-2 MiB helps mitigate SMR performance issues:

filestore.write-buffer-size: 1MiB
Not like this. You can pass any option to the storagenode inside Docker if you put it as the last option in your docker run command, after the storagenode image name.
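For example, something like this (a sketch based on the standard setup command; the wallet, email, address, storage size, mount paths and image tag are placeholders to adapt to your own node, and only the last line is new):

docker run -d --restart unless-stopped -p 28967:28967 \
  -e WALLET="0x..." -e EMAIL="you@example.com" -e ADDRESS="yourhost.example.com:28967" -e STORAGE="1.8TB" \
  --mount type=bind,source=/mnt/storj/identity,destination=/app/identity \
  --mount type=bind,source=/mnt/storj/data,destination=/app/config \
  --name storagenode storjlabs/storagenode:latest \
  --filestore.write-buffer-size="1MiB"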
It can consume significant amounts of RAM, because this buffer is allocated per data piece/upload, so RAM usage gets multiplied by the number of unfinished uploads. It is NOT the total size of a write cache.
And it can be a vector for DoS attacks on nodes (initiate a LOT of slow concurrent uploads to a node and it will crash, running out of RAM while it tries to allocate buffers for all the incoming uploads).
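To put a rough number on it (my own back-of-the-envelope estimate, not a measured figure): with a 4 MiB buffer and 1,000 slow uploads in flight at once, that is about 4 GiB of RAM for write buffers alone, more than an RPi has. If I remember correctly, the storage2.max-concurrent-requests option can cap the number of simultaneous transfers, which would bound the worst case.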
And it does not make sense to set it above 3 MiB (or 4 MiB, if it has to be a power of 2; I am not sure whether arbitrary values are allowed), since the current maximum size of a data piece is about 2.2 MiB. So 3 MiB should already be enough to buffer ANY data piece on the current network and write whole pieces to disk at once (in a single disk write request).
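For reference, and if I recall the network parameters correctly, that ~2.2 MiB figure comes from the maximum segment size of 64 MiB divided across the 29 erasure-coded pieces needed to reconstruct it: 64 MiB / 29 ≈ 2.2 MiB per piece.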
Do we know whether the storagenode handles the “MB” and “MiB” suffixes correctly? I realize I set it to “2MB”, whereas maybe I should have put “2MiB” in this case, as it’s quite a technical value?
I think it recognizes both variants correctly. At least, I set the allocated space in “GB” and “TB” units in my config files and it works just fine.
You can check the logs of a node startup though. If the node cannot interpret one of the provided options, it usually throws corresponding errors into the log.
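For instance, on a Docker node (assuming the container is named storagenode; adjust the name and the grep pattern to taste), something like

docker logs storagenode 2>&1 | grep -iE "error|invalid"

right after a restart should surface any option the node failed to parse.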
Hm. Doesn’t look great. The node is still responding, but the load average is getting worse and worse. When this happens, it usually ends up freezing the node within the following hour or so.
Retrying now with --filestore.write-buffer-size="4MiB" before my disk gets full (there is only 190GB left on it; I won’t be able to run any more tests once it’s full).
Is the disk in a healthy condition? If I read it right, the disk is doing just 1.5 requests per second (1 write + 0.5 read) and 340 KB/s of writes on average, and it is still choking?
That looks too bad even for an SMR drive with a full CMR cache zone, and more like a disk with hardware issues (like some unstable sectors about to go bad).
P.S.
Although it could be that the monitoring interval was too short. Those “round” numbers (0.50 and 1.00) look suspicious, as if there were just 2 read and 4 write requests during a 4-second interval.
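If that was a short snapshot, it may be worth averaging over a longer window before drawing conclusions, e.g. (assuming the node disk is /dev/sda and the sysstat package is installed; adjust the device name):

iostat -dxm 60 /dev/sda

which prints extended per-device statistics averaged over 60-second intervals (the first report covers the time since boot and can be ignored).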