Any actions required after a power cut?

Hey there,

I live in London, so I didn’t think I’d need a UPS, until the building decided to accidentally cut my power for an hour…

When first getting set up, I remember seeing something about how power cuts could cause damage to the data or something similar. So, I just wanted to check -

Are there any actions I should complete after a power cut?

e.g. running a scan or some sort of program that can look for and then fix any issues caused by a power cut?

Thanks in advance for any help/reassurance!
Ben

might be a good idea to run chkdsk.
the problem with that is it could take a long time due to the massive number of files a storj node can keep.

the problems with power loss can range from many to almost none, depending on the hardware and the setup.

the main issue with power loss is critical data in cache / RAM that never gets written to disk. in the worst crashes this tanks the filesystem, boot partition or databases; otherwise it just loses a few files.

usually if stuff starts up fine, then it's most likely fine, tho hidden data corruption can exist.

some hardware, like later-gen HDDs and SSDs, can have PLP, as enterprise gear has had for over a decade. this nearly removes the chance of corruption at power loss.

there is also a physical issue with power loss: it can cause voltage spikes. especially with stuff like brownouts, it's good to pull the plug completely on electronics to be sure they aren't damaged by the unstable power.

TLDR
if it works then it's most likely fine.

I’m not as optimistic. File system corruption is very common in these scenarios, and it can make good chunks of your data unreadable. Corruption like that doesn’t necessarily stop your OS or node from starting.

Run fsck / chkdsk to be safe. They can usually repair the damage or limit it to one or a few files, whereas unrepaired filesystem issues could make entire paths inaccessible.

I’m thinking of installing a backup battery after reading my fellow node operators’ replies.

Thanks a lot for your replies!

Sorry for the amateur follow up, but how do I run fsck / chkdsk? :slight_smile:

Are there any docs you could kindly point me to which might be able to help me do this?

Thanks again!

It depends on your OS.
Linux (fsck):

Windows (chkdsk):

macOS

In all cases you need to stop the storagenode service/container before checking and fixing disk errors. See: How do I shutdown my node for system maintenance? | Storj Docs
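
As a rough sketch of what the per-OS check looks like, the Python helper below only builds the command line for each platform; it does not run anything. The targets (`/dev/sdb1`, `D:`, `/Volumes/storj`) are placeholders for your own disk, the `-f` flag assumes an ext-family filesystem on Linux, and the helper name `disk_check_command` is made up for illustration.

```python
import platform


def disk_check_command(target, system=None):
    """Return the filesystem check command for the given OS as an argument list.

    `target` is a placeholder for your own device, drive letter or volume.
    Nothing is executed here; run the result yourself once the node is stopped.
    """
    system = system or platform.system()
    if system == "Linux":
        # fsck hands -f to the ext2/3/4 checker: check even if marked clean.
        # The filesystem should be unmounted first.
        return ["fsck", "-f", target]
    if system == "Windows":
        # /f tells chkdsk to fix the errors it finds.
        return ["chkdsk", target, "/f"]
    if system == "Darwin":
        # verifyVolume only reports; `diskutil repairVolume` attempts repairs.
        return ["diskutil", "verifyVolume", target]
    raise ValueError(f"no check command known for {system!r}")


if __name__ == "__main__":
    print(" ".join(disk_check_command("/dev/sdb1", system="Linux")))
    print(" ".join(disk_check_command("D:", system="Windows")))
    print(" ".join(disk_check_command("/Volumes/storj", system="Darwin")))
```

On Linux and macOS the real commands generally need root (sudo), and on Windows an elevated prompt.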

really depends on how stable your power is… for me a UPS is very low on the list, even tho they can be fairly affordable for smaller setups.

because i got very stable power… ofc if i lived in a region with unstable power, it would be higher on the list.

the cheapest solution is one that simply ensures the system shuts down correctly,
so it won’t have to be able to power it for hours on end. this ofc depends on the power draw you have… if it's low enough, like an RPi, then a UPS might easily power it for days.

however if you've got a properly sized server drawing a good amount of power, you want a UPS that is connected to the server and sends a shutdown signal if the power grid goes down.
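
for a rough feel of the runtime side, here is a back-of-the-envelope sketch in Python. every number in it (100 Wh of usable battery, 80% efficiency, 10 W and 200 W loads) is an assumption picked for illustration, not a measurement, and the function name is made up. real runtimes, especially at very low loads, also depend on the UPS's own overhead.

```python
def ups_runtime_hours(battery_wh, load_watts, efficiency=0.8):
    """Estimate how long a UPS can carry a load, in hours.

    battery_wh  -- usable battery energy in watt-hours (assumed figure)
    load_watts  -- average draw of the attached equipment
    efficiency  -- assumed inverter/battery efficiency, between 0 and 1
    """
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    return battery_wh * efficiency / load_watts


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    for name, watts in [("small board plus one HDD", 10), ("server", 200)]:
        hours = ups_runtime_hours(battery_wh=100, load_watts=watts)
        print(f"{name}: about {hours:.1f} h")
```

the point is just that the same battery lasts far longer on a low-power box, while a bigger server mostly needs enough time to shut down cleanly.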

personally i went for a PLP in my ZFS Slog SSD, which also helps if the system stalls and one has to pull the plug to get it back online :smiley:
which was more my concern.

or random reboots due to bad driver configurations after i had been tinkering with gpu passthrough… think it rebooted 73 times before i got it fixed lol…

but each storagenode will have different circumstances under which they operate and thus there will not be one approach that is best for all.
aside from getting one of everything ofc… lol
like a datacenter.

Thanks for the additional thoughts - very interesting!

Again, apologies for my noob question here, but would you mind sharing what “PLP in my ZFS Slog SSD” means? We all have to start somewhere in our learning journey 😅

Feel free to just share a link if it’s easier haha - I did try googling it, but didn’t get very far!

ZFS is a combined filesystem / volume manager gaining in popularity on Linux, made for strong data integrity and for handling zettabyte-scale filesystems, which is where it got its name i believe.

Slog is a separate log device for the ZIL (ZFS intent log), which in short saves sync writes in case of a power outage or similar issue. (it’s a ZFS thing, but there are similar things for other filesystems)

PLP is a hardware feature in many modern SSDs and even some of the most recent HDDs.
it's short for Power Loss Protection… basically the drive has a capacitor (energy storage) so it can save its write cache to NAND flash (on an HDD, a bit of SSD-style memory built into the drive) even when power is cut.
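
to make the "sync writes" that a Slog protects a bit more concrete: a sync write is one where the program asks the OS to confirm the data is on stable storage before carrying on. a minimal Python sketch, with a made-up helper and file name; on ZFS, requests like this are the ones that go through the ZIL, and so through a Slog device if one is present.

```python
import os
import tempfile


def write_sync(path, data):
    """Write bytes and return only after the OS reports them on stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's own buffer to the OS
        os.fsync(f.fileno())  # ask the OS to commit the data to the device


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "piece.bin")  # hypothetical file name
        write_sync(target, b"example payload")
        print(os.path.getsize(target))
```

writes without the `fsync` step can sit in RAM for a while, which is exactly the data that goes missing in a power cut.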

Ah interesting, so it’s sort of like a way of building in redundancy which doesn’t require a power supply? That’s pretty cool!

Kind of… but not really. The HDDs can still sustain damage in a sudden power loss, and while ZFS has a lot of protections in place, all HDDs are probably writing data for the same pieces at the same time, which could still go wrong and cause damage. It helps for sure, but it shouldn’t be used to replace a UPS if you don’t have stable power.

Understood, thanks very much for the details, really appreciate it!
