
A backup is useless: as soon as you restore your node from a backup, it will be disqualified for the data lost since that backup was taken.
So the best use for these disks is to run nodes in the same /24 subnet of public IPs - they will all share the traffic of a single node, so the load on each is much lower and together they mimic the normal workload of one CMR drive.

So basically 1 CMR node and 2 SMR nodes splitting traffic on the network? Wouldn’t SMR drives also be less performant when it comes to egress, when people need to download their files? Like if the drive couldn’t find the specific chunks of a file needed for the request before the other nodes on the network holding the other chunks.

Every node behind the same /24 subnet of public IPs will split the traffic (as long as they are not full).
Egress will not be affected.
Pieces are distributed as 1 piece out of 80 per segment per /24 subnet.
When a customer downloads a segment, the client requests pieces from 39 random nodes out of the 80 that hold pieces of that segment and downloads them in parallel; as soon as the first 29 finish, all the others are canceled, because the client only needs any 29 of the 80 pieces to reconstruct the segment.
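As an illustration only (not the real uplink code), here is a minimal Go sketch of that “long tail” behavior: start 39 piece downloads in parallel, keep the first 29 that finish, and cancel the rest. The function names and timings are made up.

```go
// Hypothetical sketch of the "long tail" download: ask 39 of the 80 nodes
// holding pieces of a segment, keep the first 29 results, cancel the rest.
package main

import (
	"context"
	"fmt"
	"math/rand"
	"time"
)

const (
	requested = 39 // nodes asked for a piece
	needed    = 29 // pieces required to reconstruct the segment
)

// fetchPiece pretends to download one piece from one node; the random
// delay stands in for real network and disk latency.
func fetchPiece(ctx context.Context, node int) ([]byte, error) {
	delay := time.Duration(rand.Intn(200)) * time.Millisecond
	select {
	case <-time.After(delay):
		return []byte(fmt.Sprintf("piece-from-node-%d", node)), nil
	case <-ctx.Done():
		return nil, ctx.Err() // this download lost the race and was canceled
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	results := make(chan []byte, requested)
	for node := 0; node < requested; node++ {
		go func(n int) {
			if piece, err := fetchPiece(ctx, n); err == nil {
				results <- piece
			}
		}(node)
	}

	// Take the first 29 pieces that arrive, then cancel the stragglers.
	pieces := make([][]byte, 0, needed)
	for len(pieces) < needed {
		pieces = append(pieces, <-results)
	}
	cancel()

	fmt.Printf("reconstructing segment from %d of %d requested pieces\n",
		len(pieces), requested)
}
```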

Right, so would SMR drives be slower at locating that 1 segment and responding to the request than a CMR drive would be?

They will never deal with a whole segment, only with 1 piece out of the 80 for that segment (this is independent of the drive technology). However, since they are behind the same /24 subnet, each node receives only a fraction of the uploads, roughly divided by the number of such nodes; this effectively splits the incoming traffic between all of them, giving the SMR drive time to breathe.
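For a rough sense of the effect (the numbers are made up), this tiny Go snippet shows how the per-node write rate drops as more non-full nodes share one /24 subnet’s ingress:

```go
// Rough illustration (hypothetical numbers): nodes behind one /24 subnet
// share the subnet's ingress, so each node sees roughly ingress / nodeCount.
package main

import "fmt"

func main() {
	subnetIngressMBps := 4.0 // assumed ingress for the whole /24 subnet

	for _, nodes := range []int{1, 2, 3} {
		perNode := subnetIngressMBps / float64(nodes)
		fmt.Printf("%d node(s) behind the /24: ~%.2f MB/s of writes each\n",
			nodes, perNode)
	}
}
```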

So the “solution” for the SMR drive is, unfortunately, to slow down uploads to it; there is no other workaround.

So the limitation for SMR drives is the speed of taking in data, because of how the technology in the drive works. Egress wouldn’t be limited, just ingress.

The limitation is an inevitable part of the SMR technology - it’s incredibly slow on writes.

If you stay inside the SMR disk’s cache when writing (i.e. you are writing 10 MB and the disk has a 256 MB cache), then it will be fast. It will flush the cache to the disk when idle.

The issue with SMR disk performance under Storj is that the drive is almost never “idle”. There will always be at least one read that stops the disk from flushing its cache. It then needs to wait for the read to complete before lifting the overlapping shingles, reading them back into cache, calculating what needs to change, and then writing them back to the disk. While it is doing this, another read comes in, and it starts piling up fast. FYI: filewalker reads on an SMR disk are in the KB/s range (no, that is NOT a typo).

The problem is compounded further if your OS is also on that disk. There will always be some read/write operations going on (i.e. logs being written).

There is a trick to get back some of the disk’s “clean” performance: treat it like an almost-full SSD, which needs trimming. Linux has a timed fstrim, for example, which tells the drive that (say) sectors 2345-2369 are empty, so there is no need to read them when writing sector 2356.

Bottom line: SMR disks are the modern-day equivalent of a tape drive. If you write a continuous stream to the drive, it will be fine, and it will only be read back when you are restoring a file. They shouldn’t be used for random read/write workloads, hence NOT Storj.

Yes and no. It also rewrites shingles in its “free” (seemingly random) time.

exactly