I noticed that the WD/Hitachi UltraStar HC620s are enterprise-class SMR drives with 14-15 TB
https://www.anandtech.com/show/13523/western-digital-15tb-hdd-ultrastar-dc-hc620
About the SSDs, it's my understanding that they will wear out over time due to the limited number of write cycles they can sustain. So I think this is something more to take into consideration.
I'm contemplating the possibility of building a RAID 0 with two 4TB HDDs. It's the best cost effectiveness I've found. I also think it's better for them to be 5400rpm for lower noise and power consumption, while countering the lack of performance with the RAID performance. But since the companies aren't being open about SMR, can someone just recommend some HDDs which are for sure not SMR?
Why would you want RAID 0? One disk failure causes the whole RAID volume to be lost, there is no data redundancy. I agree about SSDs - we had to replace all of our firewall SSDs with HDDs due to early failure from excessive reads/writes.
As far as good HDDs, I am favoring the data-center-class HDDs that are rated for 7x24x365 service and 550TB/yr. For Western Digital they are branded "Gold" or "HC500".
I've just read a 50+ post topic with a bullfight about the issue. It's clear there is no consensus. I'm aware of the risk, but I think it is not that high. (I've only had one broken HDD in all my life, and it was like 20 years ago; I assume technology has improved.) I just think it's worth the risk.
2x4TB I guess is a good-sized node, and it's the best combination to get that many TB without bankruptcy. Plus, I've got the increase in performance from the RAID, which is something to take into consideration as we are talking about 5400rpm drives (I really love the most possible silence, and it's less power consuming). I see I'm repeating myself, sorry.
Thanks for that recommendation, but those have a really bad TB/$ ratio. They could be great! The only one I've found is 92€ ($100) and it's 1TB. Definitely not profitable for storage.
but thanks anyway
I have seen quite a few failed hard drives. Once both drives in RAID1 developed bad sectors - I managed to sync my data to a new array only because the drives did not have a bad sector in the same place.
It really hasn't. I have quite a few older hard drives that work OK, but newer ones seem to fail faster. New drives have much tighter tolerances than older ones, for example.
If you do not want to lose capacity by using RAID (other than 0), you can set up two nodes, each on a separate drive. That way, if one drive fails, it will only take out that one node.
Clearly you never worked with hard drives actually being used. Most regular people don't really experience hard drive failures because it does take massive amounts of usage to wear down a drive, aside from the one-off dead-on-arrival type deal…
Just like most people won't see a car wear out on them. That doesn't mean it cannot happen, nor that it doesn't happen… it just means they don't really use their car that much… and maybe keep it in a nice garage, and don't drive it to bits…
Hard drives wear out, and depending on the use case they can pretty much be dead after 4-5 years. However, most good disks with moderate or even heavy workloads can survive 10 years before they hit a wall; usually in that case, though, the wall is that technology has advanced.
Running a RAID 0 might get you through a year or two with a bit of luck… but don't expect it to live beyond that, and that's most likely being generous with the odds… Running a storagenode is a fairly heavy workload on the drives… unless you've got more than a server's worth of drives running it, I would say…
Maybe if it's 10k or 15k rpm drives it's better… haven't tried those for it.
But using 5400rpm drives… well, I'm sure they are nice and quiet… but they are also like 7-9ms seek time or so.
And on top of that, you won't know if data is bad before you read it back… and no, you cannot just check 5 minutes into a video file and assume it's good… I know for most people that is fine… video is surprisingly resilient to data loss, and on top of that, whatever data is lost… our brains disregard anyway…
Also, there are different grades of data… yeah, sure, you want data to be good… but like I say, in a video file… you might not even notice that 1MB is missing if it's spread randomly all over the file…
While if that was your bank information… then you would damn straight notice… of course nobody's bank data is the size of a movie… but the concept is sound… it could be a shorter video…
The discussions about RAID vs one node per HDD are about RAID with redundancy, usually either RAID5 or RAID6. There is absolute consensus on RAID0: no matter which side of the RAID argument people are on, nobody would advise RAID0.
I'm not sure that is representative. It fluctuates a little, but I've seen no indication of it getting significantly worse over time. Backblaze is showing a big improvement in their latest numbers.
That's a bit of an exaggeration. Assuming the use of modern disks within their first 5 years of lifetime, there is about a 2% annual failure rate, though this goes up over time. A RAID0 would double the risk as well as double the loss, so roughly a 4% failure rate on average. After 5 years the numbers get a little less solid, but I've seen failure rates mentioned close to 8-10% per year. It's still not smart to use RAID0, because why would you double both the failure rate and the loss when a failure occurs? But it doesn't seem to be as bad as you suggest.
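As a rough sketch of that arithmetic (the 2% and 8-10% figures are the assumptions from above, not exact Backblaze statistics):

```python
# Back-of-the-envelope failure probability, assuming independent drive failures.
def raid0_failure(p_drive: float, drives: int = 2) -> float:
    # RAID0 is lost if ANY member drive fails within the year.
    return 1 - (1 - p_drive) ** drives

print(raid0_failure(0.02))  # young drives, ~2% AFR each -> ~0.0396, roughly 4% per year
print(raid0_failure(0.09))  # older drives, ~9% AFR each -> ~0.172, roughly 17% per year
```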
Well, it's not because I think the drives will die, but with the load a storagenode puts on them, they might… not easy to say without testing long term. However, it doesn't take too many errors to take down a RAID 0 array, and that's what I think is most likely to happen… but hey, he will get great node performance for a couple of years, maybe more with some luck.
Also, I don't want to say a number that isn't realistic; it's better that he is pleasantly surprised rather than overly disappointed, and an array that survives doesn't really affect things as much as an array dying…
The Backblaze numbers are very interesting… though one has to take into account that they vet their drives in smaller batches before buying large numbers of the drives they consider a good investment, so their numbers might not fully represent the real failure rate of HDDs in general.
But I do like to check whether some of their numbers correlate with what I am buying, if possible… their numbers are of course a great representation of the drives they do give us numbers for…
On another note, it could be that the drop in HDD failure rate is because of better drives overall. HDDs are a market in decline, and thus manufacturers will need to find new ways to get customers to buy their devices. Reliability seems like a sensible marketing and engineering path for HDDs, as the ever-increasing pace of technology jumps toward making them entirely obsolete.
Running a node on each drive would result in the same performance, as the load would be split between the nodes. Also, if a drive failed, it would take out only one node.
Maybe. How can you know how much success rates will matter when the test data stops…
Unless Storj wants to rebalance the data around the network all the time to make sure it's all evenly used… which they would essentially have to pay for…
Nobody can know currently whether the higher success rates are worth more or less than the lower end… my bet is on the former, because the faster a connection people have, the more likely they 1. have more money, or 2. have better gear, or 3. are using it for a professional use case.
You could be right, you could be wrong… I doubt anyone can know presently, aside from those controlling the test data, and that cannot last forever…
This is definitely a good point. They have been unable to catch big mistakes in the past though, and had a few HDDs that performed considerably worse in large-scale use. I wish there were more good sources on HDD stats, but I haven't found any with anywhere close to as much data as Backblaze.
A lot of this also depends on which models they have in large numbers of use. I don't know how representative it is for the broader market. So I'm a little careful assigning any broader trend to their numbers.
I'm pretty sure in recent years they tend to market storage space per rack and density over reliability, just based on the marketing materials I've seen. Reliability is covered for a large part by warranty, especially for large business solutions, for which replacing HDDs basically doesn't cost anything as it always happens within warranty. So unless the failure rates get crazy high, it doesn't matter too much if they are slightly higher or slightly lower than normal. And consumer customers probably don't pay attention to this at all most of the time. So I'm not convinced reliability is actually that much of a marketable trait. It's more like a baseline expectation.
Well, I didn't catch that, thanks for the clarification. But if one assumes the risk of RAID1, how could it not be the same for RAID0… if I have a 2% risk of disk failure, I can't see a much greater risk of one of two failing. I can't remember my statistics lessons, but I imagine we are talking 1%, or 4%. Either case, a good number for me, taking into consideration the advantages.
If I survive for 2 years, I reckon renewing the disks could be a good thing. At that point the node should be generating enough egress for that to be worth it, or else this whole thing is not profitable at all. What's more, in 2-3 years there should be cheaper or bigger disks, comparatively speaking.
Because it's improbable enough, and I'm getting almost double the I/O speed?
Well, this I actually had no idea about. Is that so? In that case, how can anyone use this configuration? It's pretty commonly used, so… I'm a little confused. This could change all my calculations, of course. Is it really that unreliable?
I thought I got this right. My first intention was to build 2TB nodes. After careful reading and studying, I reached the conclusion that you need more stored data to have more probability of egress traffic. So 2 nodes of 4TB would be much worse than one 8TB node, wouldn't it?
Man! I'm discussing RAID in here. I feel like a full member now XD.
thanks for all your insights.
No, 2x4TB nodes under the same external IP (or the same /24 subnet) would essentially be aggregated into a supernode by the satellite. The data would be split between them, and so would your egress. The total amount of data and egress your two 4TB nodes would get would be exactly the same as if you had 1x8TB node (in rare cases you might even get more data).
However, if one drive fails, that node would be disqualified, but the other one would not. If you use RAID0 and a drive fails, your whole big node gets disqualified.
RAID0 - files are split into two and stored on separate drives. One drive fails, all you have is half of each file.
RAID1 - files are stored in two copies - one on each drive. One drive fails, you still have a full copy.
RAID0 was only ever used to get a performance boost when the reliability of data doesn't matter. But with SSDs relatively cheaply available now, I don't really see RAID0 being used anywhere anymore. There are simply almost no use cases where it makes sense. And yes, it is really that unreliable. A read error on either disk would lead to lost data, and any disk failing would lead to all data being lost. And since there is really no advantage to running one larger node vs two smaller ones, there is no reason to assume that additional risk at all.
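To make that last point concrete, here is a minimal sketch comparing one 8TB RAID0 node against two independent 4TB nodes, again assuming the roughly 2% per-drive annual failure rate mentioned earlier (both numbers are assumptions for illustration):

```python
# Compare one 8TB RAID0 node vs two independent 4TB nodes.
p = 0.02        # assumed annual failure probability per drive
total_tb = 8.0  # total capacity in both setups

# RAID0: either drive failing loses the entire 8TB node.
p_raid0_loss = 1 - (1 - p) ** 2
expected_tb_lost_raid0 = p_raid0_loss * total_tb

# Two nodes: each drive failure only loses that node's 4TB.
expected_tb_lost_two_nodes = 2 * p * (total_tb / 2)

print(f"RAID0: {p_raid0_loss:.2%} chance per year of losing all {total_tb} TB "
      f"(expected loss {expected_tb_lost_raid0:.2f} TB/yr)")
print(f"Two nodes: same chance that *something* fails, "
      f"but expected loss is only {expected_tb_lost_two_nodes:.2f} TB/yr")
```

The chance that at least one drive fails is identical in both setups; the difference is that with separate nodes a failure only costs half the data.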
I don't monitor success rates, so I have no data to compare to.
As said, the date when I stopped getting lots of ingress to this node was some weeks after I switched to the SMR drive, so there doesn't seem to be any connection to that at all.
TLDR: in your case and with your knowledge, I would say find a 3rd drive and do a RAID5… then you can always lose a drive and simply replace it. If that isn't an option, run two nodes, one on each drive.
[rants and reasons]
This is a gross simplification, and an erroneous one at that: RAID1 vs RAID0.
RAID0 will suffer horribly when write errors or problems of any kind are encountered, because data is striped across both drives; only one needs to fail to ruin a lot of the data… it might not all get corrupted, but with only a few read/write errors you will see (I don't know exactly) maybe 25-50% of the RAID0 becoming corrupted.
With RAID1/mirror you can lose either disk, or maybe 2-5%, maybe even 10% of it, before the odds are that the damage overlaps with the damage on the second disk and corruption starts to take effect. And since this isn't a striped volume, the corruption isn't in big stripes across both drives, but in individual parts of the disk containing individual files or folders. Thus losing data on a RAID1 is usually down to outside effects like lightning strikes or water damage, or other general negligence on the sysadmin's side… RAID1 is considered one of the safest ways to store your data.
So if we say a 2% chance of failure on a drive, then the odds of a RAID1 failing in the first year, on disks that were tested good before being put into production, will be like 2% of 2%, and even that is a high estimate, because even if both drives fail non-catastrophically you will likely be able to recover most of the data.
While RAID0, again with good drives, has a 4% chance of catastrophic failure from a drive alone, and even the slightest issues with cables, backplanes, general bad sectors, damage from vibration, or accidental bumps can often kill your RAID0 dead in a heartbeat…
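A toy simulation of that overlap argument (the sector counts and bad-sector numbers are made up purely for illustration, not measured from real drives):

```python
# Toy Monte Carlo: data lost when each of two drives develops a few random bad sectors.
import random

SECTORS = 1_000_000    # sectors per drive (made-up scale)
BAD_PER_DRIVE = 100    # random bad sectors that develop on each drive
TRIALS = 200

mirror_lost = stripe_lost = 0
for _ in range(TRIALS):
    bad_a = set(random.sample(range(SECTORS), BAD_PER_DRIVE))
    bad_b = set(random.sample(range(SECTORS), BAD_PER_DRIVE))
    mirror_lost += len(bad_a & bad_b)  # RAID1: lost only where BOTH copies are bad
    stripe_lost += len(bad_a | bad_b)  # RAID0: any bad sector on either drive hits the stripe

print(f"RAID1 (mirror): {mirror_lost / TRIALS:.3f} unreadable sectors on average")
print(f"RAID0 (stripe): {stripe_lost / TRIALS:.1f} unreadable sectors on average")
```

With these made-up numbers the mirror almost never loses anything, while the stripe is hit by every single bad sector on either drive.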
But if you are sure you know what you are doing, sure, RAID0 can have its use cases, and since this is only a store for basically replaceable data, you could get away with running a RAID0… for a while… but RAID0 will almost always fail critically or be subject to large amounts of bit rot…
Since you don't seem to know what you are dealing with, it might be a terrible idea if you want to keep a stable node… if you've got two drives… maybe you are better off getting a 3rd one and doing a basic RAID5 with 1 node… or two separate nodes. On top of that… if you expect to use an old SMR drive for this… then it's doubtful that it will truly help your cause… you might be better off looking at some sort of cache solution to make the reads and writes of the drive more sequential.
Yes, RAID0 is the worst and should never be used for stable data storage… it's like a performance-tuned racecar… the mileage isn't great, but it will surely get there fast… it just won't last long…
RAID0 is often used for various temporary data buffers, where the stability of the data is of no concern.
Yeah, I was looking at drives last night because I had one I thought was dead, and I found a Seagate that was cheap, and Backblaze was using a version with a 0 instead of a 3 at the end of the model name…
Then when searching around I found another model that was named with an extra 0… so 000 or 0000 and then a 3… the difference in price was about triple… and same size… so yeah, better be careful with those model numbers lol…
Well, you can supposedly get 60TB 2.5" SSD drives now… so my thinking was maybe high numbers of rewrites / reliability, because HDD tech is very established… of course SSDs are basically RAM in RAID configurations or such… so I guess that is also a pretty well known tech… but still, HDDs are much less changed than that technology and don't need to scale… so maybe that could be what they will try to market them on in the future: they lost the capacity front, now it's only price and a reliably high number of reads/writes.
Of course HAMR is also on its way for HDDs, which may change things once again… but one would kind of doubt a magnetic version of vinyl records could ever compete with laser-printed circuit boards…
But I'm not really qualified to guesstimate that… that's just my opinion.
Thanks for the advice.
What are those limitations? I was thinking about buying an SSD.
The chief concern is SSD lifetime, or endurance. Considering that storagenodes will put constant write demand on the storage, the duty cycle is critical.
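A rough way to sanity-check that concern (both numbers below are assumptions for illustration, not measured node figures; check the TBW rating on the spec sheet of the actual model):

```python
# Rough SSD endurance estimate: rated TBW (terabytes written) divided by the write load.
tbw_rating_tb = 150      # assumed endurance rating of a small consumer SSD
writes_tb_per_day = 0.1  # assumed sustained write load from the node plus OS/database churn

lifetime_years = tbw_rating_tb / writes_tb_per_day / 365
print(f"Estimated write endurance: ~{lifetime_years:.1f} years at this load")
```

The point is mainly that a cheap, low-TBW drive under constant writes can run out of rated endurance within a few years, which would match the early failures mentioned above.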
Or get a 3rd drive and run a 3rd node on it.
There's really no reason to use RAID5 unless people are giving you free drives left and right, they all happen to be exactly the same capacity, and you can't manage to fill up your node. Otherwise any redundancy is just wasted capacity that you could be using to run another node.
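As a quick illustration of that capacity trade-off (the 4TB drive size is just an example):

```python
# Usable capacity: RAID5 across n equal drives vs running one node per drive.
drive_tb, n = 4, 3

raid5_usable = (n - 1) * drive_tb  # one drive's worth goes to parity -> 8 TB
separate_usable = n * drive_tb     # every terabyte can hold node data -> 12 TB

print(f"RAID5: {raid5_usable} TB usable, survives one drive failure")
print(f"Separate nodes: {separate_usable} TB usable, one failure loses {drive_tb} TB (one node)")
```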