How bloom filters work

I see, I guess I don’t get how bloom filters work ^^’
Alright, then I guess it’s not such a problem :slight_smile:

The bloom filter is created to always match all pieces that should be on the node. The node then checks which pieces don’t match the bloom filter and removes them. (though for reasons of efficiency, the size of the bloom filter is reduced by making it match about 10% of what’s not supposed to be on the node as well. So with every run about 10% of garbage stays behind to be cleaned up the next time)
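
Roughly, in code terms, one retain pass looks something like this (a minimal Python sketch of the idea described above, not Storj’s actual code; the names `may_contain` and `move_to_trash` are invented for illustration):

```python
def run_garbage_collection(local_piece_ids, bloom_filter, move_to_trash):
    """Sketch of one GC pass: the filter matches every piece the node SHOULD
    have (plus roughly 10% of garbage as false positives); anything that does
    not match is definitely unwanted and can be moved to trash."""
    kept = trashed = 0
    for piece_id in local_piece_ids:
        if bloom_filter.may_contain(piece_id):
            kept += 1                # a real piece, or surviving garbage (~10%)
        else:
            move_to_trash(piece_id)  # cannot be a wanted piece, safe to remove
            trashed += 1
    return kept, trashed
```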

Does this mean it takes 10 bloom filter “passes” for removing all orphan pieces?

:thinking: Hmm, that doesn’t add up… I guess it’s more like “the size of the bloom filter is reduced by making it match about 90% of what’s not supposed to be on the node”


As a side note: does this rely on databases? If you lost all databases (that happened to me once; it doesn’t kill the node, but stats are lost for the current month), would bloom filters still work?

It doesn’t rely on databases. The bloom filter tells the node what not to delete. So if 10% of garbage matches the bloom filter, the remaining 90% gets deleted.
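
To put rough numbers on a single pass (plain arithmetic; the 100 GB figure below is made up purely for illustration):

```python
garbage_gb = 100.0            # hypothetical amount of garbage before a GC run
false_positive_rate = 0.10    # ~10% of garbage still matches the filter

trashed_gb = garbage_gb * (1 - false_positive_rate)  # ~90 GB moved to trash
left_gb = garbage_gb * false_positive_rate           # ~10 GB survives this run
print(trashed_gb, left_gb)                           # 90.0 10.0
```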

It is likely that false positives (ie. garbage that is not removed) will be the same, or very similar, across runs, so no, unfortunately not.

I’m having a hard time understanding all that. Besides, if it doesn’t rely on databases, it would mean the node would have to browse all pieces to find those that should be deleted, which would generate crazy amounts of IO. But that’s not what’s happening…

I’m confused :confused:

But let’s get back on topic: it does answer my initial question about what happens to “foreign” pieces that could have been added intentionally among STORJ pieces: they would eventually get deleted by the garbage collector! That’s good enough for me :slight_smile:
Cheers.

Here’s a simple explanation of Bloom filters; maybe it will help. Note that the following is not an exact description of Bloom filters as implemented in Storj, but rather a generic description of Bloom filters with some guesses as to how it applies to the garbage collection process.

Imagine you have data about English words. Each file is an entry about one word: its definition, example usages, etymology, etc. My local dictionary lists 102401 words like yodeling, lofts, particularly, fatigue, standing, contour, etc., and a single node may keep data about tens of words at a time, which is already a long list. Imagine calling a SNO over the phone and telling them 30 words.

Instead, you call your SNO and tell them that they are supposed to have data only about words that use only the letters “abcdeghiklmnoprstuvy”. So now you need to tell them at most 26 letters, regardless of the number of words the SNO is expected to store! This is your Bloom filter. yodeling is fine, standing is also fine—these are words you expect them to have. But if, let’s say, the SNO had the word fatigue, then, as it has the letter f, the SNO can remove it, because you no longer expect that SNO to have any words with f. On the other hand, your SNO will think they need to keep the words particularly and contour. Last Friday you tried to call them to say you want those removed, but their phone was turned off.

Now, after a week, the SNO is requested to add three new words and remove two old ones. The list of letters is now “abcdeghiklmopqrstuvy”. contour no longer qualifies (the letter n disappeared), but particularly still fits. As the list of letters didn’t change much, the false positives will likely be mostly the same.
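
In case it helps, here’s that letter analogy as a short runnable sketch (purely illustrative, nothing Storj-specific):

```python
allowed = set("abcdeghiklmnoprstuvy")   # the "filter" the satellite phoned in

def may_keep(word):
    # A word passes the filter only if every one of its letters is allowed.
    return set(word) <= allowed

print(may_keep("yodeling"))      # True  - genuinely expected on the node
print(may_keep("fatigue"))       # False - contains 'f', safe to delete
print(may_keep("particularly"))  # True  - false positive, garbage that survives
print(may_keep("contour"))       # True  - false positive... for now

allowed = set("abcdeghiklmopqrstuvy")   # next week's filter: 'n' gone, 'q' added
print(may_keep("contour"))       # False - finally collected
print(may_keep("particularly"))  # True  - still a false positive
```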

Bloom filters are like that, except instead of letters in a word they use bins formed from cryptographic hashes, so you can make more bins (not just 26 letters; more like thousands or even millions of different letters—imagine Chinese characters ;-)). The more bins, the harder it is to make mistakes, but also the more data to send every time, so it’s a trade-off. Storj chose the trade-off so that the rate of mistakes is about 10%. Besides, your node is expected to stay online most of the time, so if there’s garbage, it’s only because the node was offline at the time the satellite tried to tell the node to remove it.
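
A real Bloom filter is the same trick with hash-derived bins instead of letters. Here’s a minimal generic sketch (my own toy code, not Storj’s implementation; the class name and parameters are made up):

```python
import hashlib

class TinyBloomFilter:
    """Toy Bloom filter: m bins (bits), k hash functions."""

    def __init__(self, num_bins, num_hashes, seed=b""):
        self.m = num_bins
        self.k = num_hashes
        self.seed = seed
        self.bits = bytearray((num_bins + 7) // 8)

    def _bins(self, item: bytes):
        # Derive k bin indices from a cryptographic hash, salted per index.
        for i in range(self.k):
            digest = hashlib.sha256(self.seed + bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes):
        for b in self._bins(item):
            self.bits[b // 8] |= 1 << (b % 8)

    def may_contain(self, item: bytes) -> bool:
        # False means "definitely not added"; True means "probably added".
        return all(self.bits[b // 8] & (1 << (b % 8)) for b in self._bins(item))
```

The important property is the asymmetry: may_contain can wrongly answer “probably yes” (that’s the surviving garbage), but it can never wrongly answer “no”, so a piece the satellite wants kept is never deleted.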

What is specifically relevant to this thread is that if the SNO missed the request to delete contour, this word will only get garbage-collected when the filter loses at least one of its letters. So if the SNO were expected to keep tour and constraint for the next ten years, they would never learn that contour was to be removed.

There are some tricks to quickly compute which bins are required, and the node can store some information on which bins correspond to which files, so that removal is fast. There’s also the trick of using multiple hashes instead of a single one. But I don’t know enough details of Storj’s implementation of the Bloom filters, so can’t comment on that.
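
For the curious, the textbook sizing formulas make that trade-off concrete; a quick sketch (generic Bloom filter math, not necessarily the exact parameters Storj picked):

```python
import math

def bloom_parameters(n_items, false_positive_rate):
    """Classic sizing: bits m = -n*ln(p)/ln(2)^2, hashes k = (m/n)*ln(2)."""
    m = math.ceil(-n_items * math.log(false_positive_rate) / math.log(2) ** 2)
    k = max(1, round(m / n_items * math.log(2)))
    return m, k

# For a node holding a million pieces at the ~10% mistake rate mentioned above:
bits, hashes = bloom_parameters(1_000_000, 0.10)
print(bits, hashes)   # ~4.8 million bits (about 0.6 MB) and 3 hash functions
```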

This is a great explanation of bloom filters; if we ever get a Storj community wiki, this would be a good addition. I appreciate how you broke out the whole example as a demonstration.

Not true. The process uses a different seed each time specifically to prevent this.
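
A tiny sketch of why the per-run seed matters (generic hashing, not Storj’s actual seeding scheme): with a different seed, a different small fraction of garbage slips through each run, so repeated runs eventually collect almost all of it.

```python
import hashlib

def bin_for(item: bytes, seed: bytes, num_bins: int) -> int:
    # One hashed bin per item; changing the seed changes which items collide.
    digest = hashlib.sha256(seed + item).digest()
    return int.from_bytes(digest[:8], "big") % num_bins

wanted = [b"piece-1", b"piece-2"]                         # pieces to keep
garbage = [f"garbage-{i}".encode() for i in range(1000)]  # pieces to remove

survivors = {}
for seed in (b"run-A", b"run-B"):
    occupied = {bin_for(p, seed, 32) for p in wanted}
    survivors[seed] = {g for g in garbage if bin_for(g, seed, 32) in occupied}

# Each run a small fraction of garbage survives as false positives, but with a
# different seed it is mostly *different* garbage, so repeated runs clean it up.
print(len(survivors[b"run-A"]), len(survivors[b"run-B"]),
      len(survivors[b"run-A"] & survivors[b"run-B"]))
```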

That’s exactly what happens. But it only needs to check against filenames. It depends on the file system and hardware, but on my slowest node hardware this process can take hours. On the fastest, which is SSD accelerated, it takes a few minutes at most. If you turn on debug logging you can see the retain process running and going through the files which are moved to trash.

That was great @Toyoo, thanks a bunch! :+1:

Are we 100% sure that’s how it works? I do have a lot of garbage (2 to 15 GB, kinda constantly) on my nodes although they are online as much as possible (99.85+%). Even the one on my 7200 RPM CMR drive has some data in the trash (~3GB currently).

How often do satellites send bloom filters? I mean, one of my nodes takes ages (almost 30 hours) to browse all files whenever it updates. It surprises me that I did not regularly notice it doing crazy amounts of IO because of the garbage collection. Besides, that sounds like a terrible thing to do to our disks! Updates are already quite heavy on them, but if the GC does this regularly as well, they are goners! :sweat_smile:

In a perfect world, that’s how it works. In my experience slower nodes may accept pieces just a little too late, which can cause completely finished transfers to end up in garbage collection. It’s a fallback mechanism, which means it’s kind of hard to determine where data that was left behind came from.

I don’t know exactly. At least once a week, maybe more often. But since it only has to read file names, it can be a lot faster than the file walker when you start the node. I only notice it taking any significant time on my node which runs on a Drobo, connected over USB2, using NTFS on a Linux host. That’s kind of a worst-case scenario. But even though all of that is a bit of a mess, I think it’s the use of NTFS that has the biggest impact. Since NTFS doesn’t keep all file metadata in one place, but rather stores it along with the files, it requires a lot more effort to compare filenames than other file systems, which store that stuff neatly together.

Ah right, it’s not every 12 hours or so… alright.

What else does a node do when starting up then? I thought it simply had to browse file names & folder structures. Does it actually read all files’ content?

Don’t know exactly, but it at least deals with file sizes as well. We’re getting a little (extremely) off topic though. And I think the main subject here is important. So better stop the distractions.

It’s sent every 5 days

Oh, good to know! I thought this approach would be too expensive for nodes and satellites, as it essentially means you can’t update bloom filters computed last time, but if it is not, great!

Thanks @Alexey for splitting the topic!

And thanks for that addition. :slight_smile: I wasn’t sure exactly and was too lazy to look it up, but that sounds exactly like what I’m seeing.

So… where exactly are these bloom filters stored on nodes when they are received, if not in a db? I imagine it’s a list of text? So it’s a text file?
Second: why do we need to move pieces to trash? Can’t they just be marked by GC for deletion after 7 days, in the metadata, in inodes, etc., somewhere in the filesystem, with an expiration date and time, without moving the files?

It is probably because Storj does not want to pay SNOs for the deleted data.

It’s not the goal.
The pieces are moved to the trash to have the ability to recover them if the bloom filter was wrong and deleted too many pieces, or if there were other similar issues which could potentially affect customers.

But it is free storage for Storj.