Dead Storjnode disk

Sad Sad day.

My oldest node (20 months on V3 and lived even on v2) is dead.
Yesterday I heard the clicking of a hard drive and couldn’t find where it was coming from.
Today I came home, went to check the dashboards, and got no response from this node.

I plugged in the HDMI cable and saw a read only filesystem.
Rebooted, nothing…

Unplugged the drive, took it to a computer and held it to my ear… click click off.
SH*T!!!

Shucked the drive (got it out of its enclosure, for those not reading the Backblaze blogs :wink: ) and repeated the test… again click click off.

Well… it was a good node; it made almost 650 dollars on V3.

Time to build a new one!

I am sorry for the network needing to replace this data.

6 Likes

Sorry to hear about that. That drop off at the end is such a sad sight.

Good luck with the new node! Come back stronger than ever!

3 Likes

That’s unfortunate, but it looks like it more than paid for its own replacement at the least!

No worries, we’ll take good care of it on our nodes. :slight_smile:

4 Likes

I am positively grateful for any traffic I get, in any case! :wink:

1 Like

RIP
Its valor will be remembered lol

2 Likes

I know how it feels. I had a 7-month-old storage node. The HDD didn’t die, as it is in an enterprise server in a RAID. Instead I failed on the security front: I was attacked through an open VNC port and ransomware was placed on the system. Since then I have a much more secure network. Now I’m rebuilding.

2 Likes

Any opinions on ZFS vs MDRAID?

I know one should just use 1 disk per node… but these are 6+ year old disks that I am planning to use.
Better to have some redundancy, as they are more than likely to fail.

Normally I would choose 1 disk per node, just not with disks of this age.

The new system has more than enough memory and could still be expanded, has 2 cores @ 2.8GHz (x32-64), and runs on an HP 240 HBA.
6x 6TB WD Reds that came out of an old ZFS machine and have probably been hammered to death.

It is running Ubuntu 20.04.
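For six drives of that age, raidz2 would give two disks’ worth of redundancy. A minimal sketch of what creating such a pool could look like; the pool name `tank`, the device paths, and the `ashift=12` choice (4K-sector drives) are all assumptions to adapt to your own setup:

```shell
# Hypothetical sketch: a raidz2 pool from six disks.
# Pool name and device paths are placeholders - use your own
# /dev/disk/by-id/... names so the pool survives device reordering.
zpool create -o ashift=12 tank raidz2 \
  /dev/disk/by-id/ata-WDC_RED_1 \
  /dev/disk/by-id/ata-WDC_RED_2 \
  /dev/disk/by-id/ata-WDC_RED_3 \
  /dev/disk/by-id/ata-WDC_RED_4 \
  /dev/disk/by-id/ata-WDC_RED_5 \
  /dev/disk/by-id/ata-WDC_RED_6

# A dataset for the node, with atime off to save needless writes
zfs create -o atime=off tank/storagenode
```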

huh, almost the same happened to me when the HDD I had a node on started crashing. Somehow I managed to copy the node files to another disk; it took a few days for ~4TB of data. A week or so later, the disk died.

Same here bro, join the club :frowning:

i am really happy with zfs, it’s a very mature solution… so far zfs has only impressed me.

the only thing that really subtracts from zfs is the difficulty of adding and removing drives from raidz pools.

the benefit that zfs uses Copy on Write and has multiple levels of checksums on everything cannot be overstated… it fixes one of the main issues with running raid5-type solutions.

raid with 1 redundant drive is usually a no-go with the older raid solutions, because when there is an error, two drives hold conflicting versions of the data…

so if a failing disk spits out incorrect data, the system has to guess which disk is lying… and without checksums it cannot verify which one it is…
thus it has to fall back on another metric; stuff like smart data and previous issues is then often used to decide which disk to distrust.

however, this does not always give the correct answer, so a regular raid5 might reconstruct the array from the bad disk’s data and corrupt a perfectly good array, simply because it had no checksums to tell it which of the two disks to trust.
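The point about checksums picking out the lying disk can be shown with a toy shell example. The two files stand in for two copies of the same block; the names and contents are made up purely for illustration:

```shell
# Toy example: two "disks" hold the same block, one got silently corrupted.
# Plain raid5 can only see that they disagree; a stored checksum
# (which is what zfs keeps per block) says exactly which copy is bad.
printf 'hello storj' > diskA.img        # good copy
printf 'hello st0rj' > diskB.img        # silently corrupted copy
# checksum recorded at write time
good_sum=$(printf 'hello storj' | sha256sum | cut -d' ' -f1)
for f in diskA.img diskB.img; do
  if [ "$(sha256sum "$f" | cut -d' ' -f1)" = "$good_sum" ]; then
    echo "$f: checksum OK - trust this copy"
  else
    echo "$f: checksum MISMATCH - rebuild this copy from redundancy"
  fi
done
rm -f diskA.img diskB.img
```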

ofc zfs comes with some overhead, and you will want an ssd as a slog device to reduce the io load… tho not strictly required, i would say it’s highly recommended because of the performance benefits… it might even extend the life span of the drives by roughly halving the write load.

i’m unaware of any solution that comes close to zfs, tho stuff like ceph might be the future of storage… but that is more of a cluster solution.

zfs even makes a difference in non-raid setups: because of its checksums and constant use of them, you are informed the second there is an issue writing or reading data…

thus you become aware of problems with a hdd or a cable long before smart data or outright corruption makes them apparent, which gives you a real chance to get ahead of problems, even with just a single drive.

personally i will never use a non Copy on Write or non checksum based storage solution again.
but then again i’m not even sure i will ever store stuff without redundancy lol
so maybe i’m a bit biased.

if you are used to zfs, i would say it’s an obvious choice, and even for people new to zfs i would recommend it, so long as they are technically inclined… i spent a lot of time looking for a storage solution… and zfs is what i ended up deciding on…

it is not the greatest choice on those mid-range 2-4 disk setups tho… if people want flexibility and performance with limited hardware, zfs might be the wrong choice… but for stability and data integrity, from laptops all the way up to near hundreds of disks, zfs will never be a bad choice… ofc stuff like ceph should be considered once one goes beyond 50-100 disks across 3-4 servers;
it might be a superior choice at those scales.

haven’t really gotten around to learning ceph because my setup is still a bit too small for it…

1 Like

Is ceph even allowed for storagenode? I thought only iSCSI…

1 Like

that’s a good question, i’m not aware of any filesystems being outright disallowed… but stuff like NFS or SMB doesn’t really seem to work well with the storagenode… not sure if ceph would act the same way, but it’s possible…

i know very little about the nitty gritty of ceph

ceph works, but it’s usually an even more complicated setup than iSCSI

I’ve been considering moving from my current mess of drives (500GB 2.5", 1TB NVMe, 960GB SSD, 1TB SSD, 1TB 2.5", 2x 3TB 3.5", 3x 12TB 3.5") to a ceph of just the big drives (though technically that would be running 10 nodes on 3 disks). I could even fit 3 replicas, since my total is at about 10TB.

The bit I get stuck on is whether to use rbd (block device) or cephfs (files); I think it would be far better if the node were native to ceph and stored objects directly.

Then I just think “How is that better than a three disk mirror?”

if i had the money and the knowledge to run a ceph cluster, it would be here already…

disks enough, servers enough… but electricity is very expensive here and we do not get a lot of sun for solar panels.

and then there is my stupidity… ceph won’t be a thing soon :slight_smile:

1 Like

What do you think about using ZFS on a regular system without ECC RAM?

I’m running my nodes on an older G3258 with 8GB DDR3 and planning to upgrade to a newer i5 8600 and 16GB DDR4. Currently I’m using bcache (trying different cache modes) with a 120GB Samsung 850 EVO in front of a pair of shucked WD Elements (14TB & 12TB), non-raid.

I use zfs everywhere, even my 4GB RAM PI4 is running zfs. It provides better data integrity due to checksumming. Even without ECC it’s (imho) a superior filesystem, because RAM bit flips aren’t that common. (And once data is written to disk, it won’t get modified, unless you still have access time enabled.)
Also the cache is nice with zfs, but only really helpful for the storj databases. So if you move your databases to your ssd, it doesn’t make much sense to have a slog/l2arc for your storj HDDs.
You can read a bit about zfs cache and recordsizes here: Best Record size for zfs
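Moving the databases off the HDD and tuning recordsize might look like the sketch below. The dataset names, the 1M/16K values, and the mount path are assumptions for illustration, not recommendations from the thread; `storage2.database-dir` is the storagenode config option for relocating the databases:

```shell
# Hypothetical layout: blobs on the HDD pool, databases on an SSD pool.
# storagenode blobs are large, so a big recordsize cuts overhead there.
zfs create -o recordsize=1M -o atime=off tank/storagenode
# the databases do small random IO - a small recordsize on an SSD dataset
zfs create -o recordsize=16K ssdpool/storj-db
# then point the node at it, e.g. via an extra bind mount in docker run:
#   --mount type=bind,source=/ssdpool/storj-db,destination=/app/dbs
# together with storage2.database-dir: dbs/ in config.yaml
```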

ZFS works fine without ECC ram… in fact it will help avoid some of the issues not having ECC RAM can cause… but you will see an error from time to time. like kevink says, it’s not something that happens often… mostly it’s because our systems run 24/7 and stuff gets loaded into RAM for days without being read… and then all of a sudden there might be trouble reading a single bit… and most of the time it rarely matters, most data isn’t that sensitive to it… in an image or a movie, say, you would never even spot it in most cases.

some important data might have its own CRC, which is basically what ECC does; a hdd, for example, has CRC on all data, so a bit flip can be corrected.
ZFS’s checksums also help counter bit flips… because they are also a sort of CRC.

I’ve been running with L2ARC on my main storagenode… but it doesn’t seem very effective, to be fair… that’s not really why i want it, though… i use the L2ARC to extend my RAM so that i never run into memory issues with my vm’s.

a SLOG at least gives you something more by halving your IOPS, but if this is mostly dedicated to storj then i wouldn’t waste an ssd on it…
ofc the SLOG has other functions… if it has PLP and the pool is set to sync=always, it will basically make your setup immune to corruption from power outages,
since incoming data is written to the SSD nanoseconds after it hits the NIC, and if power is lost, the PLP on the SSD flushes the SSD’s RAM into its NAND.

on boot the stored data is then written to the pool, and the system continues as if nothing had happened… personally i don’t have power outage issues… aside from what i cause myself…
so rather than buy an expensive UPS, i got a SLOG SSD with PLP and set ZFS to sync=always
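The SLOG-plus-sync=always setup described here comes down to two commands. A sketch only; the pool name and the NVMe path are placeholders for your own hardware, and the SSD really does need power-loss protection for the guarantee to hold:

```shell
# Add a PLP-protected SSD as a dedicated intent-log (SLOG) device.
# "tank" and the device path are placeholders.
zpool add tank log /dev/disk/by-id/nvme-SSD_WITH_PLP
# Force every write through the intent log (and thus onto the SLOG)
zfs set sync=always tank
```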

it doesn’t really matter a lot… but i kinda like the idea that i can pull the power plug 70 times in a row and the machine will boot up every time without any errors.

but it’s a luxury feature…

also zfs really likes RAM, so if you want to use the system for other stuff, it’s greatly beneficial to have more than usual… zfs’s default is that 50% of RAM is dynamically allocated to the ARC; by that i mean it’s not fixed… if the ram is required, the system will take it back for whatever it needs… but that can basically empty the entire arc, which kinda defeats the point of having it and can greatly decrease performance.
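If the ARC’s dynamic default does fight with other workloads, OpenZFS on Linux lets you cap it with the `zfs_arc_max` module parameter. The 8 GiB figure below is just an example value:

```shell
# Cap the ARC at 8 GiB (value in bytes) - example figure, tune to taste.
# Persistent across reboots:
echo "options zfs zfs_arc_max=8589934592" > /etc/modprobe.d/zfs.conf
# Apply immediately without a reboot:
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max
```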

ZFS is no lightweight filesystem / partition manager, but it is currently the only one i trust with my data… :smiley:

1 Like

i’m wondering how you guys are handling monitoring? Got some scripts to mail smart values or similar?

It’s just a lot of work to set up monitoring for around 8-9 nodes.

Makes it very easy to monitor 8 nodes
(doesn’t monitor smart values though. haven’t found a good and easy approach for that yet)
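For the SMART side, a crude cron script with smartmontools could cover the basics. A sketch only: the device list and mail address are placeholders, and it assumes `smartctl` and a working `mail` command are installed:

```shell
#!/bin/sh
# Crude SMART health check for a handful of node disks - a sketch,
# not a finished monitoring solution. Run it from cron, e.g. daily.
DISKS="/dev/sda /dev/sdb /dev/sdc"   # placeholder device list
MAILTO="you@example.com"             # placeholder address
for d in $DISKS; do
  # smartctl -H reports overall health; non-zero exit means trouble
  if ! smartctl -H "$d" >/dev/null 2>&1; then
    smartctl -a "$d" | mail -s "SMART warning on $d" "$MAILTO"
  fi
done
```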

1 Like