Change from SMR to SSD drive

well you do get some big performance benefits… the jury is still out on whether running one raid5 node across 3 drives will be better or worse than running 3 separate nodes…

if we look at the numbers with 3 nodes you get 50% more capacity, which is always very nice…
but each node lives on its own disk, so if data is being downloaded from one node, you are restricted to the performance of 1 disk… giving you essentially 33% of the combined performance in some cases, which granted might not be applicable for test data… but it's still something i would consider when "designing" / selecting my setup.

with 3 separate nodes you would also lose 33% of the data if a disk died, while on a 3 drive raid5 you wouldn't lose anything… also your write performance, which is very important if one wants good or decent upload success rates, would be much better, nearly twice as good in a raid5, granted some things won't be better.
but still, i don't think i've ever seen success rates below 70% with my raidz1 arrays, granted those have more drives…

also, with higher success rates you would limit the bandwidth wasted on cancelled pieces… i dunno how big a problem this is, though… say on average 40% of a failed upload is transferred before it gets cancelled… maybe even less, but let's say 10% to make it unreasonably low and see what the math says then.

so if you've got that SMR 20% success rate on uploads, meaning 80% are cancelled, and 10% of each of those 80% is uploaded before it's cancelled, that means 8% extra bandwidth added… on top of the 20% we actually completed… so that's nearly 1/3 of the total bandwidth (8 out of 28 parts) that might be spent just dealing with uploads that are cancelled because the hard drive is too slow…
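putting that math in one place, a quick python sketch using the assumed numbers from above (20% success rate, 10% of each cancelled piece transferred before the cancel):

```python
# Wasted-bandwidth estimate for cancelled uploads. All inputs are the
# assumptions from the post above, not measured values.

success_rate = 0.20       # fraction of uploads that complete (the SMR example)
partial_fraction = 0.10   # assumed fraction of a piece sent before cancellation

# Per 100 upload attempts of equal piece size:
useful = success_rate * 100                           # 20 pieces' worth of data
wasted = (1 - success_rate) * partial_fraction * 100  # 80 * 0.10 = 8 pieces' worth

total = useful + wasted
print(f"wasted share of total bandwidth: {wasted / total:.1%}")  # ~28.6%
```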

so i know people argue against raid, and it might be justifiable, but there is also a counter argument to be made… sure, it's easier to manage and expand with a node on every drive… and if one has the internet bandwidth to spare, i'm unsure if there is much advantage in not doing it that way… but it could lead to bandwidth congestion… anyway, it's never a one-size-fits-all solution…

the only thing that wouldn't make sense for a storagenode atm would be a mirror…
half the write performance of two separate node disks, double the read of a single disk, and 50% capacity lost… xD that's just the worst possible fit imo, but mirrors are nice to manage… seen from a sysadmin perspective… so if one had enough drives then it might be a choice…
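writing those tradeoffs out with made-up round numbers (1 disk = 1 unit of throughput and capacity, purely illustrative):

```python
# Mirror vs. two independent node disks, per the tradeoff described above.
# 1 disk = 1 unit of aggregate throughput/capacity; numbers are illustrative only.

two_node_disks = {"write": 2, "read": 2, "capacity": 2, "on disk loss": "one node's data gone"}
mirror         = {"write": 1, "read": 2, "capacity": 1, "on disk loss": "no data lost"}

for name, stats in [("2 node disks", two_node_disks), ("2-disk mirror", mirror)]:
    print(f"{name:14s} {stats}")
```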

Keep in mind that disks have much higher read performance than write performance. For really hot (frequently-requested) pieces, the system’s I/O cache will likely hold those in memory anyway.

This is incorrect when you consider the capacity of each arrangement.

With a 3-disk RAID5 you have the capacity of 2 disks. If one disk dies, you still have the capacity of 2 disks (no data loss).

With 3 disks each running their own node, you have the capacity of 3 disks. If one disk dies, you drop to the capacity of 2 disks – which is your capacity with RAID5 anyway.

Yes, you lost 33% of your capacity but that’s capacity you didn’t even have with RAID5 so you can’t compare them like that. All that happened was you earned a little bit extra until that disk died!
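To put numbers on that argument, a small sketch assuming 3 equal disks of a hypothetical 6 TB each:

```python
# Usable capacity of each arrangement, before and after one disk failure.
# The 6 TB disk size is hypothetical, just to put numbers on the argument above.
disk_tb = 6

raid5 = {"healthy": 2 * disk_tb, "after 1 disk dies": 2 * disk_tb}  # parity absorbs the loss
nodes = {"healthy": 3 * disk_tb, "after 1 disk dies": 2 * disk_tb}  # one node's data is gone

print("3-disk RAID5:", raid5)      # {'healthy': 12, 'after 1 disk dies': 12}
print("3 separate nodes:", nodes)  # {'healthy': 18, 'after 1 disk dies': 12}
```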

Most of the information I’ve seen suggests that RAID5 is actually slower for writes than a single drive because individual blocks cannot be written without a parity calculation. In particular, I/O on the databases is going to be significantly impaired – writes to a single database page are going to require one disk read and two disk writes instead of a single disk write: 3 IOPs vs 1, though the writes can be parallel so it’s the timing of 2 IOPs vs 1.
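For anyone wondering where the extra I/O comes from: overwriting one data block also forces a parity update, and the new parity depends on the old contents. A minimal XOR-parity sketch in Python (made-up two-byte "blocks", not real RAID code):

```python
# RAID5-style XOR parity: parity = XOR of all data blocks in the stripe.
# Overwriting one block requires knowing its old contents (hence the extra read).

def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """parity_new = parity_old XOR data_old XOR data_new, byte by byte."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

old_data   = b"\x01\x02"  # the block being overwritten (must be read first)
other_data = b"\x10\x20"  # the block on the other data disk (left untouched)
old_parity = bytes(a ^ b for a, b in zip(old_data, other_data))

new_data = b"\xaa\xbb"
new_parity = update_parity(old_parity, old_data, new_data)

# Sanity check: recomputing parity from scratch gives the same result.
assert new_parity == bytes(a ^ b for a, b in zip(new_data, other_data))
```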

Yes, I do agree with this (and you’ll see me mention it in most of my posts that touch on this topic). However, keep in mind that with one giant node you have significantly more risk whenever you make any changes.

With a dozen one-disk nodes, it is a hassle making a config change to all of them, but I can make the change on one node and see what happens. If I make a critical error and cause my node to be disqualified, I lose only one of my nodes. I can figure out what I did wrong and not make the same mistake again on my other nodes.

With one giant node, a single human error can disqualify the whole thing at once and I have to start again from scratch.

Giant RAID nodes are easier to manage.

Node-per-HDD is harder to manage, but is safer and gives more capacity.


Hi all,

in all this SMR discussion, the second (and, to me, important) question from stuu was:

Does the network notice a performance improvement and update the node's reputation, or is it smarter to set up a new node?

I am eager to know because I am switching to fiber (FTTH) shortly. The performance of my node will rise dramatically. Right now I only get a 45% success rate with my poor internet connection (35/5 Mbit).

By the way: I also used SMR HDDs, 2x WD Red 6TB in a Synology in a RAID1. The Storj performance was terrible. At the same time I had another node running on a different IP with the same bandwidth and loaded with the same amount of data (6TB). It performed way better. Speaking in numbers: the SMR node earned around 60-70% of what the CMR node did.
As soon as I got rid of the SMR disks (hot-swapped) and inserted 2x IronWolf 6TB (CMR), the node performed exactly like the comparison node.
The next bottleneck would be the internet connection, not the disk speed, so I also see an SSD as a waste of money.


Reputation is not based on performance as far as I know. It’s only based on whether your node passes audits. The network simply does not care how fast your node is as long as it can verify that the data you’re storing is intact.

I only get a 39% upload success rate with a 1000/50 Mbit connection in Germany. The performance of a node is not related to your internet connection (unless it is too slow, of course).

yes, the network will utilize the new bandwidth where it can, if your disks can keep up…
yeah, avoiding SMR drives is the first line of defence. you might be able to firmware-update the SMR drives if they aren't brand new… recently manufacturers have gone to great lengths to improve these drives by making them smarter… not sure how well this has worked, but there might be something to gain there if you aren't scared of such a task.

otherwise, establish a cache to buffer writes… dunno what you are using, but systems with tiered data storage can utilize SMR quite effectively… all incoming data hits the higher storage tier first, and as it goes unused it migrates to the slower SMR drives, which might be painfully slow to write but will read just fine, on par with or maybe even better than CMR hdd's.
SMR drives usually also have a huge onboard cache, which in theory can improve their performance…
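as a toy illustration of that tiering idea (not how ZFS or Storage Spaces actually implement it), a sketch where writes land on a fast tier and cold data later migrates down to the SMR tier:

```python
# Toy two-tier store: writes land on the fast tier (SSD/CMR), and data that
# hasn't been touched recently migrates to the slow SMR tier. Purely
# illustrative; real tiering engines are far smarter about placement.
import time

fast_tier: dict[str, bytes] = {}   # e.g. SSD or CMR disk: fast writes
slow_tier: dict[str, bytes] = {}   # e.g. SMR disk: slow writes, fine reads
last_access: dict[str, float] = {}

def write(key: str, data: bytes) -> None:
    # All incoming writes hit the fast tier, hiding SMR write latency.
    fast_tier[key] = data
    last_access[key] = time.monotonic()

def read(key: str) -> bytes:
    last_access[key] = time.monotonic()
    return fast_tier[key] if key in fast_tier else slow_tier[key]

def migrate_cold(max_age_seconds: float) -> None:
    # Run periodically: demote anything not accessed recently to the SMR tier.
    now = time.monotonic()
    for key in [k for k in fast_tier if now - last_access[k] > max_age_seconds]:
        slow_tier[key] = fast_tier.pop(key)
```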

anyway, if you've got SMR drives and dunno what to do with them because… let's be honest… they suck at writing… then look into tiered storage… even something as "basic" as windows storage spaces actually supports tiered storage… i considered using that for my storagenode for a long time before i went with zfs… and to be honest i may end up trying it out just to compare… there is a certain elegance in being able to add and remove drives of any size from a storage "array", blob, or whatever the experts want to call the pool behind microsoft's storage spaces…

i'm sure there are similar solutions for linux… i'm just not familiar with them…

The amount of traffic you get is determined by how nodes are selected for upload and download. Currently performance of the node is not a factor in this process. So there is nothing for the satellites to detect. Your node will perhaps see better success rates as a result of the upgrade, but there are many factors that can impact success rates. So only testing it will tell.
