What happens if a big disk fails?

19topun93 · April 4, 2024, 8:16pm

Hello there,

I’m really new to Storj. I run a 24/7 server as a hobby and storage server. On it I have one LXC container with a 1 TB disk and another LXC container with a 16 TB disk available.

I have searched the forum, but I don’t think I found an answer to my question.

What if my 16 TB disk fails? Would I replace it with another 16 TB disk, mount it at the same spot, start the Docker container, and all is good? Or is there a time penalty for the rebuild? With a download speed of 250 Mbit/s and a full disk, rebuilding the system would take quite a long time. If that many TB can be lost, it might be worth running a RAID after all. Or am I guessing wrong?

Or what is the strategy if a disk dies?

I have already read that my IP counts as one node. So more nodes at home do not count as more nodes for Storj, right?

pangolin · April 4, 2024, 8:24pm

There is no such thing as a rebuild for nodes. You can only start a new node from zero.

19topun93 · April 4, 2024, 8:36pm

Thanks for the fast answer.

But if I have 16 TB of storage, it would take 2 years to recover from this, because the data has to land on my storage again. That is, if the Storj earnings estimator (Google Sheets) is anything to go by.

aardvarkl · April 4, 2024, 8:39pm

Then add another disk as a mirror.
Of course that’s just dead disk space (and cost) until you get a failure.

daki82 · April 4, 2024, 8:48pm

Redundancy is in the network. After 4 h offline, the most urgent pieces are rebuilt, usually 1-4% of the data (you lose those pieces once you are back online).
(The held amount covers the repair costs; the rebuilt pieces go to other nodes.)
There is enough time (12 days) to fix minor things (fans, PSU, etc.) before suspension or disqualification (offline too long, failed audits, etc.).

You are financially better off running a separate node instead of hot spares or parity/mirror RAIDs.
(bathtub curve)

Databases and logs (they are non-vital, except orders.db) may go on another SSD to save IOPS.
(Even USB sticks are possible, if they are durable and correctly formatted.)

You generate a new identity (it is bound to the node’s data, and each is useless without the other) and use another drive.
It should run on what you already have (buying nothing is recommended).
There is no financially effective way to back up a node.
Filling 16 TB takes a long time (it is really not predictable; it depends on usage and customer behavior).

Usually you start with one drive, wait for it to fill, then start the next node on another drive, or migrate it to a bigger drive, or just keep it as it is.

(I started with 12 TB and am now filling 20 TB.) Patience is the key.
(Usually 1 TB is not worth the energy cost.)

pangolin · April 4, 2024, 8:59pm

As far as I know, disqualification comes after 30 days offline. One of my nodes was offline for 28 days and survived.

19topun93 · April 4, 2024, 9:00pm

Thanks for your answer. The 1 TB was just to try it out, and the 16 TB disk comes from another project; I don’t need it at the moment. So I didn’t buy it for Storj.

How would I add space? Copy to a new disk and mount it? What happens if I replace the disk and keep the same identity for my node? Does that do nothing? That is not clear to me. If the other nodes know what data I should have, why not rebuild it at full rate until it is restored?

daki82 · April 4, 2024, 9:00pm

Right, also normal CMR drives do not bear the load of more than one node, at least not for long. (Also, it’s the law: 1 node per drive/RAID.)

19topun93 · April 4, 2024, 9:01pm

Yeah. Because of this, I have one LXC container with one core for each drive. (I run Proxmox.)

Roxor · April 4, 2024, 9:08pm

Yes, you can move your node to a larger disk (or otherwise expand the filesystem beneath it). If you lose a disk, you can’t restore just the identity and have the network rebuild what was lost from your old drive, because they’ve already paid other nodes to hold your old data (from your holdback funds). So the network does rebuild it, just not on your system.

Basically, if you show your node was unreliable and lose the data they sent you, your only option is to start again with a fresh identity. Yes, it means that if it took you 2 years to fill 16 TB, it will take about the same time to fill it again.

pangolin · April 4, 2024, 9:23pm

Storj doesn’t know (and doesn’t care) about your disk health. They can only recognize a failed audit or a failed download request for certain files.

Mitsos · April 4, 2024, 9:55pm

FYI: disks don’t just suddenly die (well, unless their firmware is crap, I won’t mention any brands in order to protect the guilty). They’ll start showing signs of failing LONG before they give up completely.

At the first signs of a failing disk, you stop the node and add your new disk. You ddrescue the old disk to the new disk and start the node up on the new disk. Use ddrescue because it has a few tricks to recover whatever data it can (i.e. reading slower, reading backwards, skipping ahead of damaged sector clusters, etc.).
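As a rough illustration of the ddrescue approach described above, here is a minimal Python sketch that only builds (and prints) a common two-pass GNU ddrescue invocation; it does not execute anything. The device paths `/dev/sdX` and `/dev/sdY` and the mapfile name `rescue.map` are placeholders, and the choice of two passes with three retries is an assumption, not something the poster specified.

```python
import shlex

def ddrescue_passes(source, target, mapfile):
    """Return two ddrescue command lines as argument lists (nothing is run)."""
    # Pass 1: copy everything that reads cleanly. -n skips the slow scraping
    # phase, -f is required when the output is a block device.
    first = ["ddrescue", "-f", "-n", source, target, mapfile]
    # Pass 2: revisit the bad areas recorded in the mapfile, using direct
    # disc access (-d) and up to three retry passes (-r3).
    second = ["ddrescue", "-f", "-d", "-r3", source, target, mapfile]
    return [first, second]

# Placeholder devices: double-check source and target before running for real.
for cmd in ddrescue_passes("/dev/sdX", "/dev/sdY", "rescue.map"):
    print(shlex.join(cmd))
```

The node should stay stopped for the whole copy, and the target must be at least as large as the source when cloning a whole device.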

daki82 · April 5, 2024, 1:57am

There are ways to copy the 1 TB node to the 16 TB drive.
(But two nodes should never run with the same identity.)
It depends on what you want to do with the 1 TB disk.

daki82 · April 5, 2024, 1:58am

You simply can’t predict that.

daki82 · April 5, 2024, 2:10am

Because the rebuild happens this way:
If the number of pieces drops below a certain threshold, repair workers request 29 of them, rebuild the file, generate the missing pieces, and distribute them to other nodes.
This is decided individually for every piece on the drive. Some may simply stay missing (when there are still enough pieces in the network).
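To make the repair logic above concrete, here is a small Python sketch of a threshold-based decision. Only the figure of 29 pieces comes from the post; the repair threshold of 52 and the "lost" case (fewer than 29 pieces left, so nothing can be rebuilt) are assumptions for illustration, not confirmed network settings.

```python
REQUIRED_PIECES = 29   # pieces a repair worker needs to rebuild (from the post)
REPAIR_THRESHOLD = 52  # hypothetical: repair triggers below this many healthy pieces

def repair_action(healthy_pieces):
    """Classify one segment by how many of its pieces are still reachable."""
    if healthy_pieces < REQUIRED_PIECES:
        return "lost"    # assumption: too few pieces remain to rebuild at all
    if healthy_pieces < REPAIR_THRESHOLD:
        return "repair"  # rebuild, then send the new pieces to other nodes
    return "ok"          # enough pieces remain; a missing piece is left alone

# Each segment is judged on its own, so one dead disk triggers repair
# for only some of the pieces it held.
for count in (70, 45, 20):
    print(count, repair_action(count))
```

This is why a failed node's data is not simply sent back to it: pieces are either regenerated on other nodes or not regenerated at all.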

In short: stop the node, copy data and identity to the new disk, and run from there.
(Please use the methods described in the forum.)
(If the data and identity from the old location ever come online again, the node can be disqualified.)
That is the way to go if you want to use the 1 TB disk somewhere else or decommission it.

OR:

Just start a new node (with a new identity), as also described here.

19topun93 · April 5, 2024, 10:56am

Thank you very much for all your answers.

I have two disks in the container: (1) 6 GB of SSD storage and (2) my 16 TB disk. So if the disk is predicted to die, I could transfer the data to a new disk and run the Docker container command again, I think.
If it dies before any transfer to a new disk, I have to start from scratch.

Thank you for your help; I now understand how Storj works. Now I will see how it goes. But it would be very cool to get my electricity costs back.

Last question: is there a “point system” that favors storage on a better internet connection? I currently have 250 Mbit/s down and 100 Mbit/s up, but I could get 1 Gbit/s down and 500 Mbit/s up.

Or is only the recommended spec of 100 Mbit/s up and down relevant?

Roxor · April 5, 2024, 11:20am

Storj isn’t keeping performance stats about the quality of your connection, but every upload/download is a “race” with other nodes. Clients connect to many nodes at once, and once they’ve sent/received what they need, they cancel all other connections. So nodes with lower-latency storage and faster Internet will naturally win a larger percentage of those races.

So speed can help you fill your disk faster and serve more download requests; both result in getting paid a bit more.
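The “race” can be pictured with a tiny Python sketch: the client keeps the transfers that finish first and cancels the rest. The node names, response times, and the number of transfers needed are made up for illustration; the real client logic is more involved.

```python
def race_winners(response_times_ms, needed):
    """Split nodes into those that finish first and those that get cancelled."""
    # Rank nodes from fastest to slowest total response time.
    ranked = sorted(response_times_ms, key=response_times_ms.get)
    return ranked[:needed], ranked[needed:]

# Hypothetical nodes and response times in milliseconds.
nodes = {"node-a": 120, "node-b": 45, "node-c": 300, "node-d": 80}
winners, cancelled = race_winners(nodes, needed=2)
print(winners)    # ['node-b', 'node-d']
print(cancelled)  # ['node-a', 'node-c']
```

A node is only paid for transfers it completes, which is why lower latency translates into slightly higher earnings.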

19topun93 · April 5, 2024, 11:57am

Thank you very much.

snorkel · April 5, 2024, 1:05pm

Bandwidth is not so important; you could be OK even with 10 Mbps. What matters is the latency from your node to clients. Don’t use Wi-Fi to connect your node.
Use the cheapest internet subscription available.
You will need a fixed IP, so either ask your ISP to assign you a fixed WAN IP, or use a DDNS service like No-IP.
Transferring Storj data from one drive to another takes a very long time. Cloning is the fastest, if you can still do it. I remember transferring a Windows node to a Synology with robocopy. It took about 1.5 days per TB.
So don’t get your hopes up for saving the 16 TB one when it fails. Maybe you can, but maybe not. I wouldn’t even waste my time on it if cloning is not an option. I also have 16 TB drives.

pangolin · April 5, 2024, 1:40pm

My latest node transfer (QNAP NAS ZFS to Windows Server NTFS) took about 28 days for only 6 TB. The HDD on both sides was a single Toshiba Enterprise drive, which is one of the fastest HDDs available.

So I guess transferring 16 TB with robocopy or rsync is almost impossible these days.
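For a sense of scale, the two file-level copy rates reported in this thread can be extrapolated to a full 16 TB node. This is a back-of-the-envelope Python sketch that assumes the rate stays constant over the whole copy, which real transfers rarely do.

```python
def days_to_copy(size_tb, days_per_tb):
    """Linear estimate of the copy duration in days."""
    return size_tb * days_per_tb

robocopy_rate = 1.5  # days per TB, from the Windows-to-Synology robocopy move
nas_rate = 28 / 6    # days per TB, from the 6 TB move that took 28 days

print(days_to_copy(16, robocopy_rate))       # 24.0
print(round(days_to_copy(16, nas_rate), 1))  # 74.7
```

Roughly three and a half weeks in the better case and over ten weeks in the worse one, during which a failing source disk has to keep working.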