Technically it currently is 29/52/80/110 instead of 29/35/80/110
29 is bad for performance because it affects the piece size. Storage nodes are currently forced to store pieces with an odd size that doesn’t naturally fit the hard drive sector size. We are going to change that to 16 or 32 instead, which should give us better performance on the storage node side.
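To make the piece size point concrete, here is a rough sketch of the arithmetic. The 64 MiB segment size and the 4096-byte sector size are assumptions chosen for illustration, and the calculation ignores padding and encryption overhead, so treat the numbers as indicative only.

```python
# Illustrative assumptions, not confirmed values from this thread.
SEGMENT_SIZE = 64 * 1024 * 1024  # assumed segment size in bytes
SECTOR_SIZE = 4096               # assumed hard drive sector size in bytes


def piece_size(segment_size: int, k: int) -> int:
    """Bytes per piece when a segment is split into k pieces, rounded up."""
    return -(-segment_size // k)  # ceiling division


def is_sector_aligned(size: int, sector: int = SECTOR_SIZE) -> bool:
    """True if the piece fills a whole number of sectors."""
    return size % sector == 0


if __name__ == "__main__":
    for k in (29, 16, 32):
        size = piece_size(SEGMENT_SIZE, k)
        print(k, size, is_sector_aligned(size))
```

Under these assumptions, splitting by 29 leaves a piece that does not divide evenly into sectors, while 16 and 32 both do.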
In terms of repair traffic, we noticed that we are in a good spot. We haven’t lost a single file. The first set of numbers was selected with a conservative intent in mind. For the next iteration, we are looking for numbers that would theoretically reduce the durability (from 100% to 99.9999…) but that should give us some advantages in terms of storage expansion factor, storage node payouts, and performance. We just need to be careful not to get too aggressive. Ideally we keep the durability very close to 100% and get all the other benefits at the same time.
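The trade-off between expansion factor and durability can be sketched with a simple model. This is only a back-of-the-envelope illustration: it assumes every piece is lost independently with the same probability `p_loss` before the next repair, which is a simplification, and `p_loss` itself is a hypothetical input rather than a measured value.

```python
from math import comb


def expansion_factor(k: int, n: int) -> float:
    """Stored bytes per uploaded byte when any k of n pieces restore the file."""
    return n / k


def loss_probability(available: int, k: int, p_loss: float) -> float:
    """Probability that fewer than k of `available` pieces survive.

    Assumes independent piece loss with probability p_loss (a simplification).
    """
    return sum(
        comb(available, i) * (1 - p_loss) ** i * p_loss ** (available - i)
        for i in range(k)
    )


if __name__ == "__main__":
    # Current numbers from this thread: 29 needed, 80 stored, repair at 52.
    print(expansion_factor(29, 80))
    # Hypothetical p_loss, chosen only to exercise the function.
    print(loss_probability(52, 29, 0.05))
```

Lowering the number of stored pieces relative to the required pieces reduces the expansion factor, and the same function shows how much the modelled loss probability moves in return.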
We are still working on that. We have a few candidates that we want to test on saltlake or europe north first. We want to test how expensive the repair traffic would be, how durable files are with a large enough set of data, and, last but not least, upload and download performance.
Thank you. Very informative. One follow-up question…
Does this mean an upload is split into 29 pieces, 80 piece copies are stored on storage nodes, a repair process is run if the number of pieces on the network drops to 52, and 110 is the maximum number of pieces if storage nodes come back online?
edit - reading this back I probably just need to read more about erasure coding.
Hi @Stob, this is a great thread. Thanks for starting it. We definitely spend a lot of time looking at all different kinds of systems and practices, because we’re always seeking to improve, so our approach changes over time, as @littleskunk points out.
If you’re interested in hearing our CTO talk about Reed Solomon, we touch on it briefly in this podcast episode below. There are also some good posts on our blog that are easily digestible.
You are very close. At the end of an upload 80 pieces exist. Any 29 of them can be used to restore the file. So we are not talking about copies. All 80 pieces are different. Everything else is as you said. We repair at 52. The 110 is for cutting off the slow nodes.
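The four numbers can be read as thresholds, roughly as in the sketch below. The class and function names are made up for illustration and this is not the actual satellite or uplink code. It assumes an upload starts transfers to 110 nodes and keeps the first 80 that finish, which is one way to read "cutting off the slow nodes".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RSScheme:
    """Hypothetical container for the 29/52/80/110 numbers."""
    minimum: int = 29   # any 29 distinct pieces restore the file
    repair: int = 52    # repair starts when pieces drop to this count
    success: int = 80   # pieces that exist once an upload completes
    total: int = 110    # uploads started; the slowest are cut off


def fastest_nodes(latencies: list[float], scheme: RSScheme) -> list[int]:
    """Indices of the nodes whose uploads are kept (the fastest `success`)."""
    ranked = sorted(range(len(latencies)), key=lambda i: latencies[i])
    return ranked[: scheme.success]


def segment_status(healthy_pieces: int, scheme: RSScheme) -> str:
    """Classify a segment by how many of its pieces are still available."""
    if healthy_pieces < scheme.minimum:
        return "lost"
    if healthy_pieces <= scheme.repair:
        return "repair"
    return "healthy"


if __name__ == "__main__":
    scheme = RSScheme()
    for count in (80, 52, 29, 28):
        print(count, segment_status(count, scheme))
```

With these thresholds, a segment with 80 pieces is healthy, one with 52 is queued for repair, and only a segment with fewer than 29 pieces can no longer be restored.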