"Storj disk" falling down

evanmac · January 26, 2022, 1:59pm

Hi all,

My 6TB “storj disk” is failing, so I just bought two 6TB HDDs (RAID1) to replace it.

Now I’m trying to copy everything from the old disk to the new one, but the old disk keeps failing and I don’t know if I will eventually manage to copy everything to the new RAID.

What happens if I bring the node online with incomplete data (or no data at all)?

Many thanks in advance, Nicola

Stob · January 26, 2022, 2:29pm

Hi @evanmac,
You should keep trying for as long as possible to transfer the data from your old disk, even if it is failing. The more data you can move across, the more likely the node will continue to function and not be disqualified.

If the disk is failing, you should already have stopped the node and should focus only on transferring the node data and identity. You can keep the node offline for up to 14 days before the offline time becomes an issue.

There is also an argument that it may be in your interest to stay offline longer (still within the 14 days) so that Storj repair tasks are initiated. Some of the data missing from your node would then never be requested from it again, so those segments would not count against your node towards disqualification.

SGC · January 26, 2022, 3:28pm

you could try to use some data recovery software, when a disk is operating normally it will give up on reading damaged sectors after a while…
recovery software will change parameters so that it might take much longer to retrieve the data…

also i find that often limiting the speed of a transfer greatly improves the odds of data being retrieved

EaseUS has pretty good data recovery software… but that is proprietary software.
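
One way to try a throttled copy like the one suggested above is rsync with a bandwidth cap. This is only a sketch: the function name `slow_copy`, the default rate, and the example paths are assumptions for illustration and do not come from this thread.

```shell
# slow_copy SRC DST [KBPS]
# Copy a directory tree with rsync while capping the transfer rate.
slow_copy() {
    src=$1
    dst=$2
    kbps=${3:-20000}    # assumed default cap, tune it for the drive
    # -a keeps permissions and timestamps, --partial keeps partially
    # transferred files, --bwlimit caps the rate (roughly KB per second).
    rsync -a --partial --bwlimit="$kbps" "$src"/ "$dst"/
}

# Example (hypothetical paths):
# slow_copy /Volumes/OLD/storagenode /Volumes/NEW/storagenode 10000
```

Running the same command again after an interruption skips files that already arrived with matching size and modification time, so the copy can be resumed in several sessions.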

evanmac · January 26, 2022, 8:15pm

I think it’s a mechanical problem. I have an EaseUS license, but the disk seems problematic.

evanmac · January 26, 2022, 8:16pm

As far as I can see, there’s trouble copying data from the “blob” folder; the other data are copied without problems.

What happens if I bring the node online with this “blob” data missing?

jammerdan · January 26, 2022, 8:24pm

The blobs folder contains all the data that the node is storing.

Alexey · January 26, 2022, 8:28pm

It will be disqualified within an hour. You can lose anything else, including the databases, but you need the blobs folder and the identity folder to start your node.

evanmac · January 26, 2022, 8:34pm

Nice, I’ll give it a try with EaseUS.

SGC · January 26, 2022, 8:51pm

blobs contain 99.9% of the data so it is also the most likely to show problems…
the other files can also have problems, even a retrieved file from a bad hdd or even a good hdd isn’t a sure fire thing.

but like Alexey said, identity and blobs are the critical information… max loss that is survivable is like 12% of the blobs, or was… not sure if the methodology has changed yet… so you can at least lose some of the blobs, but you will need most of them, and with a smaller node one might be able to outrace DQ with growth.

but ofc first you have to get the storagenode data out…
yeah try EaseUS, if anything can fix it then their software can, it’s no magic bullet tho… and there is even more advanced stuff… but one quickly runs into budget and research time issues.
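
To get a rough idea of how much of the blobs folder made it across, the file counts of the old and new trees can be compared. The sketch below is an assumption-laden illustration: the function names and paths are hypothetical, a file count is only a proxy for what audits actually check, and the 12% figure above is a recollection rather than a confirmed threshold.

```shell
# Count regular files under a directory.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# missing_pct OLD_DIR NEW_DIR
# Print the whole-number percentage of files (by count) that are
# present under OLD_DIR but did not arrive under NEW_DIR.
missing_pct() {
    old=$(count_files "$1")
    new=$(count_files "$2")
    if [ "$old" -eq 0 ]; then
        echo 0
        return
    fi
    echo $(( (old - new) * 100 / old ))
}

# Example (hypothetical paths):
# missing_pct /Volumes/OLD/storagenode/storage/blobs /Volumes/NEW/storagenode/storage/blobs
```

Listing a failing disk can itself hit read errors, so the count for the old tree may be an underestimate.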

evanmac · January 27, 2022, 10:56am

Update: EaseUS is working on the disk. It says the remaining time is a bit more than 63 hours, and it found 3.2 TB on the disk.

Let’s see what it can recover, especially in the blobs folder.

evanmac · January 29, 2022, 8:26pm

Update:

In the end I used the Finder to copy the “core” data (identity and certificates above all). Now the Finder is copying around 2.5 TB of data (blobs) and tells me the copy will finish in two days.

Hopefully most of the blobs will be saved and my node will be up again soon.

deathlessdd · January 29, 2022, 9:17pm

If a drive is dying and there’s data on the sectors that are dying, you won’t ever magically get the data off it. It will be gone forever, no matter what software you use to transfer the data.

Toyoo · January 29, 2022, 9:23pm

Depends on the mode of failure. In some cases you can recover data from failing sectors if you repeat your reads many times. ddrescue is helpful in these scenarios; I have already had several cases in which it was useful.
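
A minimal sketch of how such a repeated-read rescue is commonly run with GNU ddrescue, following the two-pass pattern from its manual. The device names and map file below are placeholders (assumptions), and the script only prints the commands so they can be checked before anything touches a disk.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with the real source device,
# target device, and a map file stored on a healthy disk.
SRC="${SRC:-/dev/SOURCE_DISK}"
DST="${DST:-/dev/TARGET_DISK}"
MAP="${MAP:-rescue.map}"

# Print the two-pass ddrescue sequence for review before running it.
rescue_plan() {
    # Pass 1: copy everything that reads easily, skip the slow
    # scraping phase (-n); -f is needed to overwrite a target device.
    echo "ddrescue -f -n $SRC $DST $MAP"
    # Pass 2: direct disc access (-d), retry bad areas 3 times (-r3).
    echo "ddrescue -f -d -r3 $SRC $DST $MAP"
}

rescue_plan
```

The map file records which areas were already read, so the second pass only revisits the regions that failed and an interrupted run can be resumed. ddrescue copies a whole device or partition, so the node data would afterwards be copied from the rescued filesystem.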

deathlessdd · January 29, 2022, 9:26pm

I’ve never had any luck recovering data from a dying drive. The drive always just timed out while trying to copy, no matter what I tried. And I’ve tried every single piece of software under the moon; the drive was just too far gone. There’s no guarantee of getting all the data back, even with rescue software…

evanmac · January 30, 2022, 11:18am

So, the node is up and running; obviously suspended.

I have another problem that I didn’t have before “the crash”: the web interface says that QUIC is misconfigured. I set up the router to forward both TCP and UDP to port 28967, and put these lines into my start script:

-p 28967:28967/tcp \
-p 28967:28967/udp \

But it doesn’t seem to work. What am I doing wrong?

SGC · January 30, 2022, 11:34am

think you need to close the webdashboard and reload it…
maybe it was the node…
you can find the conclusion of how to fix it at the end of this thread.

evanmac · January 30, 2022, 12:27pm

Thanks for your reply. The solution was to stop the node, delete the container, and start it again.

It worked!

Now I only have to wait for my node to be unsuspended.
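
Docker fixes a container's port mappings when the container is created, so a `-p 28967:28967/udp` line added to a start script only takes effect once the old container is removed and a new one is created. A sketch of that sequence, assuming the container is named `storagenode` and the start script is called `./start-node.sh` (both names are assumptions, not taken from this thread):

```shell
docker stop -t 300 storagenode   # allow up to 300 s for a clean shutdown
docker rm storagenode            # removes the container only, not bind-mounted data
./start-node.sh                  # hypothetical script containing the docker run command
docker port storagenode          # should now list both 28967/tcp and 28967/udp
```

If `docker port` shows only the TCP mapping, the container was probably created from an older version of the run command.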

evanmac · January 30, 2022, 12:33pm

Just a quick note about the data recovery: it seems that macOS (Monterey, the latest release) has some routines that act as data recovery.

The Finder tried to copy the data, but it seems that it first scanned the data and put it in some cache. When it found unreadable blocks, it asked me whether to stop the copy or finalize it. I chose to finalize, and the copy to the new disk started; this way I recovered almost all the blobs!

SGC · January 30, 2022, 2:09pm

it just takes a long time, but if one is careful and doesn’t write new data or change existing data on the disk,
then often i find one can get data back… it’s just that there is often damage in some or many of the files themselves, so one never really gets a flawless data set back…

Good luck, it will be interesting to see how it goes long term.

deathlessdd · January 30, 2022, 2:10pm

I hope it got enough data. I’m guessing you will find out sooner rather than later, at least if you’re missing too much data.