When is a Bloom filter useful?

I understand what makes color filters an attractive data structure; however, it’s hard for me to understand when you can use them, because you still have to perform an expensive operation that you are trying to avoid in order to be sure that you have not found a false positive. Because of this, they will not add much overhead at all? For example, a wikipedia article for color filters suggests that they can be used to synchronize data. I see how great it would be the first time the flowering filter is empty, but I’ll say that you haven’t changed anything and you’ll sync your data again. Now every search for a flower filter will indicate that the file has already been copied, but shouldn't we have to reform the slow search task that we tried to avoid to really make sure that it is correct?

+5
source share
2 answers

Basically, you use Bloom filters to avoid the long and difficult task of proving that an element does not exist in the data structure. It is almost always difficult to determine if something is missing than if it exists, so the filter helps to keep losses by looking for something that you will not find in any case. This does not always work, but when you do this, you get huge benefits.

+5
source

Bloom filters are very effective in the case of membership requests, that is, to determine whether an item belongs to a collection. The number of elements in the set does not affect query performance.

0
source

All Articles