Future changes to Reed Solomon numbers

Stob · October 28, 2020, 12:15pm

In the most recent October 2020 Development Update there was a point mentioning a future change to the RS numbers. I was interested to know:

Is RS the same as erasure coding?
Is the current RS still 29/80?
What has been learned now that the current network has been live for over a year?
What is the expected change to the RS numbers?

littleskunk · October 28, 2020, 3:13pm

Yes

Technically it currently is 29/52/80/110 instead of 29/35/80/110
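For readers following along, here is a minimal sketch of how the four numbers could be modelled. The class and field names are hypothetical (they are not taken from any Storj code), and the meaning attached to each number follows the explanation given later in this thread: 29 pieces are enough to rebuild, repair happens at 52, 80 pieces exist after an upload, and 110 relates to cutting off slow nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RSConfig:
    """Hypothetical container for the four RS numbers quoted in the thread."""
    required: int  # any this many pieces can restore the data (29)
    repair: int    # repair is triggered at this many remaining pieces (52)
    success: int   # pieces that exist at the end of an upload (80)
    total: int     # upper bound used when cutting off slow nodes (110)

    def status(self, healthy_pieces: int) -> str:
        """Classify a segment by how many of its pieces are still reachable."""
        if healthy_pieces < self.required:
            return "lost"
        if healthy_pieces <= self.repair:
            return "needs repair"
        return "healthy"


CURRENT = RSConfig(required=29, repair=52, success=80, total=110)

for count in (80, 60, 52, 29, 28):
    print(count, CURRENT.status(count))
```

The gap between `repair` and `required` is the safety margin: a segment that reaches 52 pieces can still lose 23 more before it becomes unrecoverable.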

29 is bad for performance reasons because it affects the piece size. Storage nodes are currently forced to store pieces with an odd size that doesn’t naturally fit the hard drive sector size. We are going to change that to 16 or 32 instead. That should give us better performance on the storage node side.
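The arithmetic behind the sector-size point can be sketched as follows. The 64 MiB segment size and the 4096-byte sector size are assumptions chosen for illustration, and the real piece size also depends on encryption and encoding details that this sketch ignores.

```python
SEGMENT_SIZE = 64 * 1024 * 1024  # assumed segment size in bytes (illustrative)
SECTOR_SIZE = 4096               # assumed drive sector size in bytes


def piece_size(segment_size: int, k: int) -> int:
    """Bytes per piece when a segment is split into k equal shares, rounded up."""
    return -(-segment_size // k)  # ceiling division


def fits_sectors(size: int, sector: int = SECTOR_SIZE) -> bool:
    """True if a piece of this size fills a whole number of sectors."""
    return size % sector == 0


for k in (29, 16, 32):
    size = piece_size(SEGMENT_SIZE, k)
    print(f"k={k}: piece size {size} bytes, sector aligned: {fits_sectors(size)}")
```

Because 29 is odd and the segment size is a power of two, the division never comes out even, whereas 16 and 32 divide it into pieces that are themselves powers of two.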

In terms of repair traffic, we noticed that we are in a good spot. We haven’t lost a single file. The first set of numbers was selected with a conservative intent in mind. For the next iteration, we are looking for numbers that would theoretically reduce the durability (from 100% to 99.9999…) but that should give us some advantages in terms of storage expansion factor, storage node payouts, and performance. We just need to be careful not to get too aggressive. Ideally we keep the durability very close to 100% and get all the other benefits at the same time.
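The trade-off between expansion factor and durability can be illustrated with a rough model. This is only a sketch: it assumes every piece is lost independently with the same probability `p_fail` before the next repair, which is a simplification, and the value 0.3 is made up for the example rather than measured on the network.

```python
from math import comb


def expansion_factor(k: int, n: int) -> float:
    """Stored bytes per uploaded byte when any k of n pieces restore the data."""
    return n / k


def loss_probability(k: int, n: int, p_fail: float) -> float:
    """Probability that fewer than k of n pieces survive.

    Assumes each piece fails independently with probability p_fail.
    """
    p_ok = 1 - p_fail
    return sum(comb(n, s) * p_ok**s * p_fail**(n - s) for s in range(k))


P_FAIL = 0.3  # assumed per-piece loss probability, for illustration only

# 29/80 is the state right after upload, 29/52 the state at the repair threshold.
for k, n in ((29, 80), (29, 52)):
    print(f"{k}/{n}: expansion {expansion_factor(k, n):.2f}, "
          f"loss probability {loss_probability(k, n, P_FAIL):.3e}")
```

Lowering `n` relative to `k` shrinks the expansion factor and raises the loss probability, which is the balance described above.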

We are still working on that. We have a few candidates that we want to test on saltlake or europe north first. We want to test how expensive the repair traffic would be, the durability of files with a large enough set of data, and, last but not least, upload and download performance.

Stob · October 28, 2020, 3:38pm

Thank you. Very informative. One follow-up question…

Does this mean an upload is split into 29 pieces, 80 piece copies are stored on storage nodes, a repair process is run if the number of pieces on the network drops to 52, and 110 is the maximum number of pieces if storage nodes come back online?

edit - reading this back I probably just need to read more about erasure coding.

jocelyn · October 28, 2020, 3:54pm

Hi @Stob, this is a great thread. Thanks for starting it. We definitely spend a lot of time looking at all different kinds of systems and practices, because we’re always seeking to improve, so our approach changes over time, as @littleskunk points out.

If you’re interested in hearing our CTO talk about Reed Solomon, we touch on it briefly in this podcast episode below. There are also some good posts on our blog that are easily digestible.

jocelyn · October 28, 2020, 4:01pm

I’m curious to hear how other people in the community weigh tradeoffs when dealing with technical issues and improvements. Maybe I should spin that off as a thread.

littleskunk · October 28, 2020, 4:36pm

You are very close. At the end of an upload 80 pieces exist. Any 29 of them can be used to restore the file. So we are not talking about copies. All 80 pieces are different. Everything else is as you said. We repair at 52. The 110 is for cutting off the slow nodes.
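The "any 29 of 80" property can be demonstrated with a toy Reed-Solomon style code. This is a minimal sketch, not the production encoder: it works over the small prime field GF(257) using polynomial interpolation, whereas real implementations typically use GF(2^8) and far faster algorithms. All function names are made up for the example.

```python
import random

P = 257  # small prime field for illustration; piece values may reach 256


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Turn k data symbols into n distinct pieces (x, y); the first k carry the data itself."""
    points = list(enumerate(data))  # data symbols sit at x = 0..k-1
    return [(x, lagrange_eval(points, x)) for x in range(n)]


def decode(pieces, k):
    """Rebuild the k data symbols from any k distinct pieces."""
    points = pieces[:k]
    return [lagrange_eval(points, x) for x in range(k)]


K, N = 29, 80
data = [random.randrange(256) for _ in range(K)]
pieces = encode(data, N)

survivors = random.sample(pieces, K)  # pretend 51 pieces went offline
assert decode(survivors, K) == data
print("restored from", len(survivors), "of", len(pieces), "pieces")
```

Each piece is a different point on the same polynomial, which is why none of them is a copy and why any 29 of them pin the polynomial down.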
