I understand what makes color filters an attractive data structure; however, it’s hard for me to understand when you can use them, because you still have to perform an expensive operation that you are trying to avoid in order to be sure that you have not found a false positive. Because of this, they will not add much overhead at all? For example, a wikipedia article for color filters suggests that they can be used to synchronize data. I see how great it would be the first time the flowering filter is empty, but I’ll say that you haven’t changed anything and you’ll sync your data again. Now every search for a flower filter will indicate that the file has already been copied, but shouldn't we have to reform the slow search task that we tried to avoid to really make sure that it is correct?
source
share