Storage node performance improvement ideas

BrightSilence · July 8, 2019, 10:30pm

You’re saying contradictory things. If the risk to the SNO is doubled and, let’s say, 25% use this option, that means the churn rate of nodes goes up by 25% across the whole network. This leads to two things: 1. 25% more repair traffic is required, which costs money. 2. A higher RS ratio is needed to guarantee the same data reliability. A higher RS ratio means more node storage is required per stored GB, which makes storing data more expensive. Lower reliability of all, or a substantial share of, individual nodes directly impacts the cost of the network as a whole if you want to keep the same reliability network-wide.
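
To make the arithmetic above concrete, here is a minimal sketch. The base failure rate and the k-of-n values are made-up illustration numbers, not Storj's actual parameters, and it assumes "RS ratio" refers to the n/k expansion factor of a Reed-Solomon scheme.

```python
def network_failure_rate(base_rate, fraction_opting_in, risk_multiplier):
    """Average per-node failure rate when a fraction of nodes carries higher risk."""
    normal = (1.0 - fraction_opting_in) * base_rate
    risky = fraction_opting_in * base_rate * risk_multiplier
    return normal + risky


def expansion_factor(k, n):
    """Raw bytes stored on nodes per stored byte for a k-of-n erasure code."""
    return n / k


base = 0.02  # hypothetical: 2% of nodes lost per month
mixed = network_failure_rate(base, fraction_opting_in=0.25, risk_multiplier=2.0)
relative_increase = mixed / base - 1.0  # extra churn relative to the baseline

# Hypothetical codes: raising n at fixed k buys reliability with more storage.
before = expansion_factor(k=20, n=40)
after = expansion_factor(k=20, n=50)

print(f"churn increase: {relative_increase:.0%}")
print(f"expansion factor: {before} -> {after}")
```

With a quarter of the nodes at double the risk, the weighted average comes out 25% above the baseline, which is where the 25% extra repair traffic comes from.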

Storgeez · July 8, 2019, 10:52pm

The only thing I said was to provide an additional mount point, and that it would (logically) double the SNO’s risk of losing data to a storage failure. There is nothing contradictory in that.

BrightSilence · July 8, 2019, 10:55pm

You also said it wouldn’t hurt the network. I refer back to my previous post to show how that is contradictory with the statement that it doubles the risk for individual nodes.

Storgeez · July 8, 2019, 11:44pm

I didn’t say that, Storj said that.

littleskunk · July 9, 2019, 2:38pm

Thank you for this list. I have created issues for that. As soon as they are implemented we can review the results.

Krey · July 12, 2019, 5:45pm

Moving the DB to an SSD actually lowers the risk if you add it to a mirror with an HDD partition. Performance stays at SSD level and reliability goes up.

Odmin · July 12, 2019, 5:48pm

I’d like to draw your attention to this request. I think it is very simple and important.

Storgeez · July 12, 2019, 8:42pm

You would have to use two SSDs together; you cannot combine an SSD and an HDD. But it would fix the DB performance issue for those who can afford it.

Eioz · July 12, 2019, 9:28pm

Please add a cache to your network, so that there are multiple tiers. For example, two 200TB arrays with SSD. Alternatively, add storage tiers based on an IO speed test (IOPS test) run during or after identity certificate generation, and then assign the storage tier. Storj love

Krey · July 12, 2019, 10:10pm

Depends on the file system. ZFS works very well with unbalanced mirrors.

littleskunk · July 13, 2019, 3:15am

There is no need to put the database on an SSD. The problem is the database itself, and if we can minimize reads and writes, that will also help us operate storage nodes on slower hardware.

Why do we have a database at all? We are tracking a lot of garbage at the moment. Remove all that and let’s see how far we get with that improvement.

littleskunk · July 13, 2019, 3:29am

@BlackDuck short update on your findings:
The cert table got dropped from the storage nodes with the last release. We will do the same on the satellite side. It should be in the next satellite release.

The used serial table will get much smaller by storing the serials only for 1 hour. If the uplink sends the same serial number after one hour it will get rejected because the created date is too old. So we can drop the serial number early. The satellite will still accept the order even days later. This change should get into the next storage node release.
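
A rough sketch of the idea, not the actual storage node code: the class name, method names, and in-memory dict are invented for illustration. Because any order older than one hour is rejected on age alone, serials older than one hour can be forgotten without opening a replay window.

```python
from datetime import datetime, timedelta

MAX_ORDER_AGE = timedelta(hours=1)  # window taken from the post above


class UsedSerials:
    """Hypothetical in-memory tracker for serial numbers already seen."""

    def __init__(self):
        self._seen = {}  # serial -> creation time of the order

    def __len__(self):
        return len(self._seen)

    def _prune(self, now):
        # Serials past the age limit can never be accepted again, so drop them.
        expired = [s for s, created in self._seen.items()
                   if now - created > MAX_ORDER_AGE]
        for serial in expired:
            del self._seen[serial]

    def accept(self, serial, created_at, now):
        self._prune(now)
        if now - created_at > MAX_ORDER_AGE:
            return False  # too old: rejected without looking at the table
        if serial in self._seen:
            return False  # replay inside the one-hour window
        self._seen[serial] = created_at
        return True


t0 = datetime(2019, 7, 13, 3, 0)
serials = UsedSerials()
first = serials.accept("serial-a", t0, now=t0)
replay = serials.accept("serial-a", t0, now=t0 + timedelta(minutes=10))
late = serials.accept("serial-a", t0, now=t0 + timedelta(hours=2))
```

Here `first` is accepted, `replay` is rejected as a duplicate, and `late` is rejected by age even though the serial has already been pruned from the table.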

Bandwidth tracking is now in memory, but the first query still takes too long. We are planning a rollup table similar to the one in the satellite database. I am not sure what we are going to do with the order archive. It will be moved somewhere else; it could be a text file or a second SQLite file. That part is unclear at the moment. I don’t expect that we can finish this for the next storage node release.
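
For readers unfamiliar with rollup tables, a minimal sketch of the general technique using Python's sqlite3 module. The table and column names (`bandwidth_raw`, `bandwidth_hourly`) are hypothetical and not the real storage node schema; the sketch also assumes `before` falls on an hour boundary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bandwidth_raw (satellite TEXT, action TEXT, amount INTEGER, created_at TEXT);
CREATE TABLE bandwidth_hourly (satellite TEXT, action TEXT, hour TEXT, amount INTEGER);
""")


def rollup(conn, before):
    """Fold raw rows older than `before` into hourly sums, then delete them."""
    with conn:  # one transaction: both statements apply, or neither does
        conn.execute("""
            INSERT INTO bandwidth_hourly (satellite, action, hour, amount)
            SELECT satellite, action,
                   strftime('%Y-%m-%d %H:00:00', created_at), SUM(amount)
            FROM bandwidth_raw
            WHERE created_at < ?
            GROUP BY satellite, action, strftime('%Y-%m-%d %H:00:00', created_at)
        """, (before,))
        conn.execute("DELETE FROM bandwidth_raw WHERE created_at < ?", (before,))


def total_bandwidth(conn):
    """Total usage: the small rollup table plus whatever is not rolled up yet."""
    hourly = conn.execute("SELECT COALESCE(SUM(amount), 0) FROM bandwidth_hourly").fetchone()[0]
    raw = conn.execute("SELECT COALESCE(SUM(amount), 0) FROM bandwidth_raw").fetchone()[0]
    return hourly + raw


with conn:
    conn.executemany("INSERT INTO bandwidth_raw VALUES (?, ?, ?, ?)", [
        ("sat1", "get", 100, "2019-07-13 03:10:00"),
        ("sat1", "get", 50, "2019-07-13 03:40:00"),
        ("sat1", "get", 7, "2019-07-13 04:05:00"),
    ])

rollup(conn, "2019-07-13 04:00:00")
hourly_rows = conn.execute("SELECT satellite, action, hour, amount FROM bandwidth_hourly").fetchall()
raw_left = conn.execute("SELECT COUNT(*) FROM bandwidth_raw").fetchone()[0]
total = total_bandwidth(conn)
```

The first query after startup then only has to sum a handful of hourly rows instead of scanning every individual transfer.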

Long-term goal: remove the entire pieceinfo database. If an upload comes in, just write the data to disk, including all metadata. Store only expiration dates in the database. We expect that only a small fraction of the data will have an expiration date. This change will take a bit more time. I don’t have an ETA, and of course plans can change at any time.
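
One way to picture that long-term goal, purely as an illustration: the file layout (length-prefixed JSON header followed by the piece data), the function names, and the `piece_expirations` table are all assumptions made up for this sketch, not the planned on-disk format.

```python
import json
import os
import sqlite3
import tempfile


def store_piece(root, db, piece_id, data, metadata, expires_at=None):
    """Write metadata and data into one file; touch the DB only for expiring pieces."""
    path = os.path.join(root, piece_id)
    header = json.dumps(metadata).encode("utf-8")
    with open(path, "wb") as f:
        f.write(len(header).to_bytes(4, "big"))  # header length prefix
        f.write(header)
        f.write(data)
    if expires_at is not None:
        with db:
            db.execute(
                "INSERT INTO piece_expirations (piece_id, expires_at) VALUES (?, ?)",
                (piece_id, expires_at),
            )
    return path


def load_piece(path):
    """Read the metadata header and the piece data back from the file."""
    with open(path, "rb") as f:
        header_len = int.from_bytes(f.read(4), "big")
        metadata = json.loads(f.read(header_len))
        data = f.read()
    return metadata, data


root = tempfile.mkdtemp()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE piece_expirations (piece_id TEXT PRIMARY KEY, expires_at TEXT)")

p1 = store_piece(root, db, "piece-1", b"hello", {"size": 5})
p2 = store_piece(root, db, "piece-2", b"bye", {"size": 3}, expires_at="2019-08-01T00:00:00Z")
tracked = db.execute("SELECT COUNT(*) FROM piece_expirations").fetchone()[0]
```

Only the second piece ends up in the database, so uploads without an expiration date cost no database write at all.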

Alexey · July 15, 2019, 9:45am

5 posts were split to a new topic: Dashboard uses CPU and RAM when it always running

stanfieldr · July 23, 2019, 4:46am

Not sure if we are using DB transactions or not, but they are an excellent way to combat data corruption, so I just wanted to mention it. Coverage differs between databases, though (for example, Postgres can roll back DDL statements, but MySQL cannot). It looks like SQLite does support transactions.
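
A small sketch of what that protection looks like with SQLite, using Python's built-in sqlite3 module. The `pieces` table is a made-up example and has nothing to do with the real storage node schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pieces (id TEXT PRIMARY KEY, size INTEGER NOT NULL)")
conn.commit()


def store_pieces(conn, rows):
    """Insert all rows atomically: either every row is stored or none is."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany("INSERT INTO pieces (id, size) VALUES (?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


ok = store_pieces(conn, [("a", 10), ("b", 20)])
failed = store_pieces(conn, [("c", 30), ("a", 99)])  # second row duplicates id "a"
count = conn.execute("SELECT COUNT(*) FROM pieces").fetchone()[0]
ids = [row[0] for row in conn.execute("SELECT id FROM pieces ORDER BY id")]
```

The second batch fails on the duplicate key, and the rollback also removes row "c", so the table never ends up half-updated.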

BlackDuck · July 23, 2019, 10:00am

Yes, Storj is using transactions with SQLite.

Pac · December 10, 2019, 12:40pm

Hi there.

I’m wondering something with regard to Storj node performance:
Does it perform better on a 64bit architecture?

I’m asking because in my case it currently runs on 32bit OS: Raspbian, on a RPi 4B.
I noticed it uses quite a lot of CPU whenever there’s network activity (see my related post here: Minimum CPU & Memory requirement for 1.0 Gbps unlimited Fiber Optic internet).

Even though this RPi 4B is quite powerful for what it is, I’m aware it’s still quite a weak system, as expected.
But I was wondering if switching to a 64bit OS would make it more “at ease” with regard to CPU power.

Any opinion/insight on this?
Cheers

Pavmer · December 16, 2019, 6:55pm

What about defragmentation for the SN?
A defragmentation tool running in the background, without putting much load on the array, would allow faster read/write access.

As an option, it can be Diskeeper in automatic mode (THIS is not an ADVERTISEMENT)

P.S. I’m a newbie, and I cobbled my node together with Windows and a 1TB HDD.
I plan to expand the volume further with RAID6.

Alexey · December 23, 2019, 8:17am

A defragmentation tool is integrated into the OS; it makes no sense to duplicate it in storagenode.

jensamberg · December 27, 2019, 11:45am

Would it be an idea to deploy Storj with a Docker Compose file which includes:

storj itself
MariaDB database
Watchtower
The access key to the network could also be passed as env variables, and everything else would be done automatically.

Maybe this would improve read/write of data and make setup easier.