I needed an excuse to restart my personal blog, so I’ve published what I hope is a useful write-up on the impact of bloom filters on Storj’s system. It’s primarily aimed at software architects rather than regular node operators though. Any feedback is welcome!
Excellent blog post. It's very concise and explains BFs in simple words. My only meh moment was that the page is not “big screen” friendly. I have to scroll when the text could be “justified”. It also doesn’t help that my mouse’s wheel has been acting up.
Thank you for your contribution and the time you put into this post.
Yeah, went with default style to just get something done, not a fan myself… Will need to figure out something better.
Your post might get linked from other articles and referenced by fellow SNOs while explaining things. So I would recommend fixing it soon, and monetizing the page “if” you want.
Ouh wow! You linked my post there too! I’m flattered!
Nice reading at bedtime later.
Nice post.
One addition that might be of interest. The filter functions of the bloom filter are randomized, so even though the false positive rate is 10%, across multiple GC passes the amount of garbage is actually lower than 10%. You can do a simulation to see the impact of multiple GC passes under various conditions: bloom filter simulation · GitHub
Yep! This was already mentioned in the Probabilistic Cleanup section:
If an obsolete piece is not collected during one cycle, it is highly likely to be collected in the next.
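To put a rough number on that, here is a minimal Python sketch. It is not the linked simulation and not Storj's code. It assumes every GC pass uses an independently seeded filter with the same false positive rate, and that no new garbage arrives between passes. It models only the false positive rate, not an actual bloom filter, and the names `surviving_fraction` and `simulate` are made up for illustration.

```python
import random


def surviving_fraction(fp_rate: float, passes: int) -> float:
    """Expected fraction of garbage still on disk after `passes` GC runs.

    Assumes each run is an independent trial: a garbage piece survives a
    run only if it is a false positive in that run's filter.
    """
    return fp_rate ** passes


def simulate(num_garbage: int, fp_rate: float, passes: int, seed: int = 0) -> list:
    """Monte Carlo version: count the garbage pieces left after each pass."""
    rng = random.Random(seed)
    remaining = num_garbage
    history = []
    for _pass in range(passes):
        # Each remaining garbage piece is kept only if it happens to be a
        # false positive this time, which occurs with probability fp_rate.
        remaining = sum(1 for _piece in range(remaining) if rng.random() < fp_rate)
        history.append(remaining)
    return history


if __name__ == "__main__":
    for k in range(1, 4):
        print(k, "passes, expected surviving fraction:", surviving_fraction(0.1, k))
    print("simulated survivors per pass:", simulate(100_000, 0.1, 3))
```

Under these assumptions the leftover garbage shrinks by roughly a factor of ten per pass. On a real node new garbage keeps arriving, so the steady state sits somewhere above that idealized curve.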