Anybody using WD Blue drives for a storagenode?

Hi there,

is anybody here using current models of the WD Blue series for a storagenode? These drives use the Intellipark “feature”, which parks the heads after just 8 seconds. With older models you could change or disable the timer with the wdidle3 tool from WD or with idle3 under Linux. This no longer works with current models like the WDC WD60EZRZ-00GZ5B1. The result is a very rapidly increasing SMART Load_Cycle_Count value.
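
For reference, on the older models the timer could be read and disabled roughly like this with idle3ctl from the idle3-tools package (a sketch from memory, and it no longer works on the current models anyway; /dev/sdX is a placeholder):

idle3ctl -g /dev/sdX   # show the current Intellipark timer
idle3ctl -d /dev/sdX   # disable it (takes effect after a power cycle)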

The question is now: will using such a drive as a storagenode prevent the heads from parking so aggressively, since the storagenode constantly reads from and writes to the drive?

Can anybody who uses such a drive share the SMART values “9 Power_On_Hours” and “193 Load_Cycle_Count”?
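
For anyone who wants to check, something like this should print both values (smartmontools; /dev/sdX is a placeholder for your drive):

smartctl -A /dev/sdX | grep -E 'Power_On_Hours|Load_Cycle_Count'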

WD RED drives (not on my node, but on another server) do the same thing, just instead of 8 seconds they park the heads after a few minutes, so I wrote a script that accesses them (ioping) once a minute to prevent that.

You should probably just run ioping in the background and keep it running.
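
Something like this, for example (just a sketch; /dev/sdX is a placeholder and the 7-second interval is chosen to stay under the 8-second Intellipark timeout):

ioping -i 7s -q /dev/sdX &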

1 Like

At the moment I’m using this drive for backups only. I tried something similar a while back with a script that wrote a small file every few seconds, and still the Load_Cycle_Count increased quite rapidly. Maybe I’ll give it another try with ioping. I thought somebody here might be using such a drive and could share their experience.

A simple touch of a pre-created file on the partition, run at intervals via cron, will do the trick (sketch below).

Parking a head is one thing. Spin-down of the platters is much worse. On a USB external drive this technique solved a problem for me on a different project.

If you are minimalistic and don’t want to waste IO on such a command while traffic is high, you can combine this method with a monitoring tool and prevent spin-down only when traffic hits lows for longer periods of time, for example by rewriting the cron file (unless you use that computer for other stuff, in which case switching cron files could be annoying and you’d need a slightly different approach).
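
A minimal sketch of that cron entry (the path is just an example; once a minute is the shortest plain cron allows, which is enough against spin-down, though not against an 8-second park timer):

* * * * * touch /mnt/backup/keep_awake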

1 Like

That’s what I did every 5 seconds, but it didn’t help. Maybe it only wrote to the cache.

Try this:
dd if=/dev/zero of=/mnt/testfile bs=4K count=1 oflag=direct
oflag=direct should bypass the write cache.

2 Likes

Well, I’m using 3 WD Blues, 2 WD Reds and 1 WD Purple in Unraid, and all of them have been running without a problem for nearly 1 year now.
I should add that my storagenode is spread across the drives and that these are 4TB drives.
Disk 1 Blue (used as an external drive before I put it in my NAS):
Power on hours 7351 (10m, 1d, 7h)
Load cycle count 210216

Disk 2 Blue:
Power on hours 6580 (8m, 30d, 4h)
Load cycle count 164491

Disk 3 Blue:
Power on hours 33 (1d, 9h)
Load cycle count 88

Disk 4 Purple:
Power on hours 6587 (8m, 30d, 11h)
Load cycle count 15

Disk 5 Red:
Power on hours 3064 (4m, 5d, 16h)
Load cycle count 186

Disk 6 Red (parity):
Power on hours 6577 (8m, 30d, 1h)
Load cycle count 1078

1 Like

@Pentium100’s is more elegant. Another approach is something like
echo "log" > /mnt/testfile

1 Like

Thanks for all the answers. I tried all the suggestions, but only “ioping” and “dd” (with and without oflag=direct) worked. With “touch” and “echo” the SMART value kept increasing.

How I tested:
In terminal 1 I requested the SMART value every 20 seconds:
while true; do smartctl -a /dev/sdf | grep Load; sleep 20; done
In terminal 2 I ran the write command every 7 seconds:
while true; do dd if=/dev/zero of=/mnt/backup/keep_awake bs=4k count=1 oflag=direct; sleep 7; done

@Itsmenh: Thanks for your stats. As you can see, disks 1 and 2 already have very high values for Load_Cycle_Count. I read somewhere that they are rated for “only” 300,000.

I bought 2 of these drives as external “MyBook” drives for my backups, on offer for ~£100 each. I sold the external cases on eBay for £20 each, so £80 for 6TB is not bad :slight_smile:

2 Likes

In case someone is interested in my little script: just fill in all the affected hard disk models you have in your system into the “models” variable, separated by spaces. ioping needs to be installed too.

#!/bin/bash
# Space-separated list of affected drive models (as they appear in /dev/disk/by-id)
models="WDC_WD60EZRZ"
# Stop any ioping instances left over from a previous run
killall ioping 2>/dev/null
for model in $models; do
  # Whole-disk device links only, skip the partition entries
  disks=$(ls /dev/disk/by-id/*$model*|grep -v part)
  for disk in $disks; do
    # Read from the disk every 7 seconds (below the 8-second park timer), quietly, in the background
    ioping -i 7s -q $disk &
  done
done
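
If you want it to survive reboots, one option (just a suggestion; the path is hypothetical) is a @reboot cron entry pointing at the script:

@reboot /usr/local/bin/keep_awake.sh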

Disclaimer: as always, do it at your own risk :wink:

1 Like

How much latency does this parking of the head add?

I don’t know and that was not my main concern. I’m more concerned that the drive might reach the max Load cycle count within a year or two.

On a WD RED it’s a few seconds. Noticeable when trying to open a share.
But more than the latency, the drive could wear out prematurely from loading/unloading the heads.

Maybe, though if the designers made the feature this common, one would think that in most cases they took the wear into account.

Some of my drives have been in production for like 9-10 years… I would advise buying proper disks for production use cases, but aside from that, I wouldn’t worry too much about drives dying unless it’s a known fault of the series you have.

Latency, on the other hand, is more of a concern for success rates, and disk utilization and cooling are what I worry most about. I don’t even think my drives would get time to idle if I allowed them to… Storj seems to keep the array I run for it very active…

Also, it’s not like park mode has no advantages, such as much lower power usage… could be like 50% power saving or so…

Look at the stats from @Itsmenh. Disk 1 was used as an external disk before, but disk 2 was used for 9 months (as a storagenode, I assume) and has a load cycle count of 164491 :scream:

Saving how much? 50% of 5 or 6W is about 3W, which over a year is roughly 25kWh or less than £4. I’d rather keep this disk for longer than 1 or 2 years than save £4 a year. I don’t use a WD Blue as a storagenode yet, but if I could get another 6TB for effectively £80, I would spin up my next node now that I know how to prevent these disks from parking their heads so aggressively.

The head park “feature” is there for two reasons - to reduce the power consumption (especially useful in a laptop, though sometimes annoying) and to make it more difficult to use the drive in a server (together with the lack of TLER).

However, the additional wear on the drive makes it fail faster, so unless the drive is idle most of the time (which it may be in a normal PC), head parking should be disabled. Especially if the drive is accessed often (but not often enough to keep the heads loaded) and has to load/unload the heads every few seconds.

1 Like

I tried to get access to my SMART data through my LSI MegaRAID, and though it should be possible, I kinda gave up. I’ll check my SMART data when I get my second HBA so I can get rid of the RAID card.

I dug a bit into the question of load/unload cycles. First off, there are apparently two different SMART attributes where they get counted, depending on drive brand and sometimes even model, as in the case of WD.
Online I found people talking about their drives having 2.3 to 2.8 million load cycles, so if you get 200k in 10 months, a drive could last for 10 years, which seems a perfectly valid lifetime for a drive… most drives rarely live beyond 10 years of power-on time anyway…

Sure, when I check my own old consumer drives, they have 5k load cycles in 3-4 years of power-on time.
Old drives could damage their heads by going in and out of the parking position, but if that problem is fixed today, then it’s kinda irrelevant how often a head has gone into the parking position, aside from the latency-related stuff. And it seems odd that it should take much longer for a head to come out of parking as long as the disk is spinning… at least compared to spinning the drive up.

The issue of load cycles is also an old one; it’s been discussed online for the better part of 10 years. If it were a real issue, I’m sure WD would have made changes to fix it by now…

So sure, maybe it’s a thing, but it doesn’t seem hugely relevant; more an artifact of how older disks had to be maintained for a long life…

Personally I never spin down the drives in my server, since I’ve been advised against it, because that is one of the likely failure points… which also kinda makes sense with my understanding of electrical engineering, since an electric motor is stressed and draws the most power during start-up… and can burn out its coils if it cannot spin up, depending on a lot of factors of course.

If you want a good evaluation of drive lifetimes before you buy, I can recommend the annual Backblaze drive stats reports; they are quite a good yardstick for what’s good and what’s bad.

I wouldn’t mind there being an easy answer for long disk life, but there rarely is an easy answer for anything…

And the more one digs into it, the more the whole question and answer dissolve into other things that might be even more relevant… such as how many heavy trucks or tractors drive close by, creating vibration while the drive is running… Personally, one of the things I’ve found that kills the most drives for me is that I often store drives with the circuit board facing up on tables and such… for some reason they tend to die from that at a steady rate…

I should really start turning them circuit-board side down… the idea was kinda to protect the board from being scratched…

You want to know one thing that kills drives? High usage… I’ve seen it often, and I believe it’s partially why so many people need or want to run RAID 6 or better, because when you ask for max performance from an old drive, it much more often goes terribly wrong… Of course it doesn’t help that old drives in new computers need to run at max output to keep up… I’ve been thinking about putting a limit on my drives’ max output, because I would rather have them live longer than output more MB/s… I mean, I’m running small arrays for now, but how often do I really need 500-750MB/s from my 5-drive array? It seems like it’s just putting needless stress on them.

On some servers this works: smartctl -a -d megaraid,0 /dev/sda
The “0” is the drive number as the controller sees it (slot number etc.).
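
If you don’t know the slot numbers, one way is simply to probe a range of them (a sketch, assuming at most 16 drives behind the controller):

for n in $(seq 0 15); do
  echo "=== megaraid,$n ==="
  smartctl -i -d megaraid,$n /dev/sda | grep -E 'Device Model|Serial Number'
done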

Load cycles still wear the ramp the heads are parked on. Also, WD itself recommends making Linux write to its logs less often so as not to accumulate too many load cycles.
https://support-en.wd.com/app/answers/detail/a_id/23841
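
One common way to reduce those periodic writes on Linux (a general technique, not necessarily the exact steps from the WD article) is to lengthen the filesystem commit interval:

# ext4 example: commit data and metadata every 60s instead of the default 5s
mount -o remount,commit=60 /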

2 Likes

Ignorance is bliss… All this talk about drives inspired me to take a look at my storage array, and I found out that I may have been running for 6 weeks without my “redundant” drive active in my ZFS pool… O.o Whoops.
Still trying to get the hang of this; I’d better get some alarms set up.

Pretty sure I know why though: I had some trouble with the array after I removed a drive shortly after setting it up, and I also moved the drives around in the bays, just to ensure my raidz1 was working correctly.
Always nice to have some fault tolerance…

I tried the smartctl -a -d megaraid thing… didn’t have much luck with it.
Not too comfy in Linux just yet, and I have little clue which ports my drives are on. I know which type of HDDs are on which “virtual disk” on the RAID controller… but beyond that no clue; I just tried 0-16 as N on one of the /dev/sd_ devices and it still didn’t work…
Getting HBAs in it soon, so no point in wasting my time trying…

Yeah, it’s a good idea to follow the manufacturer’s recommendation, but they do also say in that post that the drive is rated to 1,000,000 cycles, and people online record numbers of 2.3-2.8 million.
So yeah, I doubt it’s worth much headache trying to avoid it just to increase the life of the drive. It can’t hurt though… and you do bypass the latency it creates… however much that is.

I would go the Advanced Power Management (APM) route; a script creates extra IO and interrupts in the system. Of course for some that might not mean much… but I prefer sleek solutions if at all possible.
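
For drives that actually honor APM, that route is basically a single hdparm call (just a sketch; not every drive supports or respects these settings):

hdparm -B /dev/sdX        # show the current APM level, if supported
hdparm -B 255 /dev/sdX    # 255 disables APM entirely; 254 is the least aggressive level that keeps it enabled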

I have a 1GB hard drive made in 1992 that still works. Doesn’t mean all drives of that model would work now :).

The drive can probably take more load cycles, but if you reach the “rated” count it may be easier for the manufacturer to deny you warranty if the drive does fail.

It does not look like those drives support it, and the script works for any drive. While the script does add some IO and interrupts, unless your server and drives are near 100% loaded, it should not matter.