Isn’t it always the case?
Looks like you were right. Ars Technica tested with RAID 6, both with an empty disk and with one that had previous data on it. It didn’t fall flat on its face. It was definitely slower than the alternatives, but nothing like the ZFS results.
If the drive behaved differently, it would be useless. Consider the initial use for SMR drives - archive storage and maybe backups. Imagine that I buy such a drive, connect it over USB and try to, well, back up my data. If the drive became slow after copying 100GB (or whatever), I would really be annoyed.
“Your drive is 10TB, but you can only copy 100GB of files at a time” wouldn’t be well received
The 12TB IronWolf is 7200 RPM, and if you only test part of it, it will be faster than a 5400 RPM drive of smaller capacity.
But yeah, it looks like WD underestimated the number of people using ZFS for their NAS.
Do we know if the technical team @StorjLabs is investigating what could be done with regards to the problem with SMR disks, or how to handle bottlenecks issues in general?
From what I understand, SMR drives are becoming mainstream these days, so I bet a lot of home users will have this kind of disk for storing Storj data, with all the problems that come with them (even though apparently not everyone is having issues with SMR drives… but I surely do).
If it’s becoming a common type of disk, and given how difficult it is to find out whether a drive uses SMR technology, I’m not sure it would help much to simply tell users not to use these drives for Storj. Ideally, the Storj software should auto-tune itself to the abilities of the storage device.
Not sure how difficult that would be though, especially for SMR drives, as they can perform very well for 30 minutes, for instance, and then become very slow all of a sudden… The only solution I see would be for the node to tell satellites to hold on when its storage device starts stalling, and then get back to them when it’s ready to work again. Something along those lines, maybe.
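For what it’s worth, on Linux you can at least ask the kernel how each disk is “zoned”. This is only a best-effort sketch, with a big caveat: consumer drive-managed SMR disks (the kind sold in NAS/USB products) hide their zones and report “none” here, so only host-aware/host-managed SMR drives are positively identified; a “none” answer proves nothing.

```shell
#!/bin/sh
# Best-effort SMR check: report the kernel's zoned-block model
# for every disk. Possible values: none, host-aware, host-managed.
# Caveat: drive-managed SMR disks report "none" here.
for f in /sys/block/*/queue/zoned; do
  [ -e "$f" ] || continue
  disk=$(basename "$(dirname "$(dirname "$f")")")
  printf '%s: %s\n' "$disk" "$(cat "$f")"
done
```

On a machine with one CMR SATA disk this would typically print something like `sda: none`.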
The node would just need to reject new pieces. Shouldn’t that solve the problem?
No, rejecting uploads could lead to customer uploads failing altogether. The only safe way to exclude them would be for satellites to not select them in the first place.
There have already been some updates that should help. You can now move the DBs to another HDD, and with the latest update used serials are kept in RAM, which saves some IO as well. I’m sure they’re looking into more improvements too.
For what it’s worth, the worst thing that could happen is likely your node getting suspended, which will give the SMR drive time for housekeeping and the issue would fix itself. Not exactly ideal, but at least you don’t get disqualified.
Being able to move databases to a different path is one of actions already taken, this alone reduces the number of in-place writes a lot.
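For anyone who wants to try this, a rough sketch of the approach (the paths here are placeholders, and you should verify the option name against the comments in your own config.yaml): bind-mount a second, faster disk into the container, e.g. with --mount type=bind,source=/mnt/ssd/storj-dbs,destination=/app/dbs in your docker run command, then point the node at it:

```
# added to config.yaml (option name as of recent storagenode
# releases; double-check against your own config.yaml):
storage2.database-dir: /app/dbs
```

While the node is stopped, move the existing *.db files to the new location before restarting. The piece data stays on the original drive, so the SMR disk is spared the small random database writes.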
@Toyoo Right, but I don’t expect regular home users to tamper with the node’s configuration, beyond the options passed to docker for the Storj address, e-mail address, storage space and so on.
@BrightSilence Yes, moving used serials to RAM is a great improvement I think. Still not enough for my poor SMR drive apparently ^^
Something else looks weird to me when this happens: the storagenode process, which usually takes less than 50-80 MB of RAM, keeps growing and growing while the SMR drive is stalling.
Here for instance, it is now taking more than 700MB:
top - 11:08:41 up 21 days, 13:17, 2 users, load average: 489.60, 654.72, 948.55
Tasks: 139 total, 1 running, 138 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.7 us, 0.7 sy, 0.0 ni, 32.9 id, 65.7 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 3906.0 total, 184.9 free, 972.9 used, 2748.3 buff/cache
MiB Swap: 128.0 total, 40.6 free, 87.4 used. 3410.5 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25224 root 20 0 1238328 718900 17816 S 4.6 18.0 86:00.56 storagenode
98 root 20 0 0 0 0 S 0.7 0.0 42:32.01 usb-storage
27697 pi 20 0 10188 2788 2424 R 0.7 0.1 0:03.36 top
22706 root 0 -20 0 0 0 I 0.3 0.0 0:07.18 kworker/1:0H-kblockd
24489 root 20 0 799176 1176 520 S 0.3 0.0 1:19.51 containerd-shim
1 root 20 0 34848 5008 3624 S 0.0 0.1 0:40.70 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:02.16 kthreadd
Why is this? Is this normal?
My node is using only 60 MB of RAM right now, on a CMR disk with about 200GB of daily traffic.
By the way, you’d better run the storage node as a normal user, not root.
I’m also seeing this with my SMR drive. RAM utilisation sometimes goes up to 1.7 GB, but it also drops back to ~50 MB after a few minutes. As long as the node keeps running fine, though… Fortunately I have enough memory.
Are you running 1.6.3 already? If not, you’re still using the used_serials.db.
I guess you’re right, but I did not configure docker to be usable by the regular user.
Besides, for now Storj is the only thing running on this machine. But good point though, I should take the time to fix this, thanks for the reminder.
Aah. No I’m not; I didn’t realize this improvement would only be available in 1.6.3. Thanks for the heads-up, can’t wait!
That’s not the point. Even if you start the Docker container as a normal user, the container itself will still run as root unless you set the --user option to something other than root. You just have to set --user to a UID of your choice (it doesn’t need to be mapped to a user in /etc/passwd), but DON’T FORGET to recursively change ownership of the mounted volumes, otherwise the node will fail because it cannot write anything.
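To make that concrete, a hedged sketch (UID 1000 and all paths are just examples; substitute your own identity and storage locations):

```shell
# Pick a UID:GID for the node (it need not exist in /etc/passwd)
# and hand the mounted data over to it first:
sudo chown -R 1000:1000 /mnt/storj/identity /mnt/storj/storage

# Then run the container as that user:
docker run -d --user 1000:1000 \
    --mount type=bind,source=/mnt/storj/identity,destination=/app/identity \
    --mount type=bind,source=/mnt/storj/storage,destination=/app/config \
    storjlabs/storagenode:latest
```

If you skip the chown step, the node will start and then immediately fail on its first write.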
@BrightSilence sorry, noticed only now that you had given me the info about v1.6.3 here: Machine freezing for short periods - #67 by BrightSilence
Cheers! and sorry
@pietro Ah okay. Sounds a bit dangerous to tamper with all that now but I’ll think about it. Many thanks for these details.
Not sure though why they list these drives as incompatible with SMR (-:
That entire article looks like a badly translated mess.
But I went right to the source.
https://www.synology.com/en-global/compatibility?search_by=category&category=hdds_no_ssd_trim&filter_feature=SMR&p=1
Both WD SMR drives are listed as incompatible with all models.
Tough break for WD. But I would probably do the same thing if I were in Synology’s shoes.
Yes, but only new (empty) drives that have never been filled with data can do direct SMR writes. There is no equivalent of the SSD TRIM command to tell a DM-SMR drive which data has already been deleted and which is still in use. The drive therefore treats every sector that has ever been written as containing useful host data, and cannot simply overwrite it while writing to adjacent SMR tracks, even if the user has deleted everything and thinks the disk is empty. A once-filled SMR drive still needs the very slow read-modify-write process, resulting in huge write-performance degradation once the CMR cache is overfilled.
Hi. Until recently I had a Seagate Expansion 4TB (external, USB 3.0) for my first node. With increased traffic I had incredibly high IOWait; the dashboard took up to 30 seconds to load.
Since I switched to Western Digital Elements 8TB (external, USB 3.0), everything is fine. I run 2 nodes, one Elements drive each, and both run great.
Many DM-SMR drives do indeed support TRIM, for this exact purpose.
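If you want to see whether Linux can actually send discard/TRIM to a given drive, here is a quick best-effort sysfs check (note that many USB enclosures do not pass discard through to the disk even when the drive itself supports it):

```shell
#!/bin/sh
# A non-zero discard_max_bytes means the kernel is able to send
# discards to the device (for SATA drives this maps to TRIM).
for f in /sys/block/*/queue/discard_max_bytes; do
  [ -e "$f" ] || continue
  disk=$(basename "$(dirname "$(dirname "$f")")")
  max=$(cat "$f")
  if [ "$max" -gt 0 ] 2>/dev/null; then
    echo "$disk: discard/TRIM supported (max $max bytes per discard)"
  else
    echo "$disk: no discard/TRIM support reported"
  fi
done
```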