PSA: Beware of HDD manufacturers submarining SMR technology in HDDs without any public mention

In the end, I played with the --storage2.max-concurrent-requests setting, and I had to go down to 4 for the disk to stay responsive with no sign of stalling. It’s been receiving ingress for almost 20 hours now (at a pace of approximately 1.85 GB/hour), and it still has decent response times, and the system’s load average is stable (between 1 and 2).

So… for me, after several weeks of tests, tinkering and article reading about SMR disks, that’s the only option I found to keep my SMR node from crashing: limiting its number of concurrent requests to 4. Which is a sadly low number, especially as the disk can perform way better than that when it is not reorganizing its own data. But once it has no more CMR space available and starts writing SMR sectors, it can only handle so many (or should I say so few…) Storj requests per second.
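For anyone wanting to try the same workaround: the setting can go in the node’s config.yaml (paths and defaults may differ depending on how you deployed your node, so treat this as a sketch, not official guidance):

```yaml
# In the storagenode's config.yaml
# (equivalently, pass --storage2.max-concurrent-requests=4 on the command line,
#  e.g. appended to the docker run command if you run the node in Docker)
storage2.max-concurrent-requests: 4
```

Restart the node after changing it. A value of 0 means unlimited, which is the default.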

And although I really get that it’s not ideal, especially from a customer perspective, StorjLabs will have to decide whether it is acceptable for SMR nodes to reject massive amounts of ingress so they stay alive.

As already mentioned somewhere in the forum (maybe even in this thread, I don’t remember), I really think an improvement would be for the node to adjust this setting automatically based on the current responsiveness of the disk: the node should start rejecting requests when the storage device can’t keep up, rather than relying on a fixed setting like --storage2.max-concurrent-requests.
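To make the idea concrete, here’s a toy sketch of what such an adaptive limiter could look like: an AIMD scheme (like TCP congestion control) that halves the concurrency limit when observed disk latency spikes and slowly grows it back while latency stays healthy. Everything here is hypothetical illustration on my part (class name, thresholds, the 500 ms cutoff), not Storj code:

```python
class AdaptiveLimiter:
    """Toy AIMD concurrency limiter: shrink the limit when disk latency
    spikes, grow it slowly while latency stays healthy.
    All names and thresholds are hypothetical, not actual Storj code."""

    def __init__(self, initial=4, floor=1, ceiling=40, slow_ms=500.0):
        self.limit = initial      # current max concurrent requests
        self.floor = floor        # never go below this
        self.ceiling = ceiling    # never go above this
        self.slow_ms = slow_ms    # latency considered "disk is struggling"
        self.in_flight = 0

    def try_acquire(self):
        """Reject the request (as the node does today when the fixed
        limit is hit) if the current adaptive limit is reached."""
        if self.in_flight >= self.limit:
            return False
        self.in_flight += 1
        return True

    def release(self, latency_ms):
        """Feed back the observed disk latency for a finished request."""
        self.in_flight -= 1
        if latency_ms > self.slow_ms:
            # Multiplicative decrease: the disk is falling behind.
            self.limit = max(self.floor, self.limit // 2)
        else:
            # Additive increase: cautiously allow more concurrency.
            self.limit = min(self.ceiling, self.limit + 1)
```

With something like this, an SMR node could run at high concurrency while it’s writing to its CMR cache, and throttle itself down only when the drive starts its internal rewriting and latencies blow up.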

Easy said, hard to do probably :slight_smile:

Anyways, I can finally sleep without the fear of my node crashing at any moment now! Pheww :blush:
