Is there any HyperLogLog framework for multiple multisets?

HyperLogLog estimates the cardinality of a multiset. Is it possible to extend it to handle several multisets? For example, instead of supporting only a cardinality() query, it would support a cardinality(multiset_id) query. I am trying to avoid keeping a dictionary with one HyperLogLog per multiset_id.

Is there any other way (data structure) to achieve this?

1 answer

The following idea can help when you have a fairly large number of multisets with high variance in their cardinalities; that is, some are large and some are small. It does not require knowing in advance which ones will be small and which will be large.

You can build a linear probabilistic counter with a small change. The original data structure has a boolean value at each position. Here, each position would itself be a classic set. Instead of setting a bit on an insert(element) op, you insert id into the set at the position the element falls into, on an insert(element, id) op.

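A minimal Python sketch of the modified counter described above may make the idea concrete. The class name MultiSetLinearCounter, the num_positions parameter, the use of SHA-256 to pick a position, and the parameter name multiset_id (standing in for id) are all assumptions made for illustration. The cardinality step is also an assumption: it applies the standard linear counting estimate, m * ln(m / empty), to the positions that do not contain the given id.

```python
import hashlib
import math


class MultiSetLinearCounter:
    """Linear probabilistic counter whose positions hold sets of ids, not bits."""

    def __init__(self, num_positions=4096):
        self.m = num_positions
        # One (initially empty) set of multiset ids per position.
        self.positions = [set() for _ in range(num_positions)]

    def _position(self, element):
        # Hypothetical hashing choice: any well-mixed hash would do.
        digest = hashlib.sha256(repr(element).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.m

    def insert(self, element, multiset_id):
        # Instead of setting a bit, record the id at the element's position.
        self.positions[self._position(element)].add(multiset_id)

    def cardinality(self, multiset_id):
        # Assumed estimator: standard linear counting over the positions
        # that this id has NOT touched.
        empty = sum(1 for ids in self.positions if multiset_id not in ids)
        if empty == 0:
            return float("inf")  # saturated; more positions are needed
        return self.m * math.log(self.m / empty)


counter = MultiSetLinearCounter()
for i in range(1000):
    counter.insert(f"x{i}", "big")
    counter.insert(f"x{i}", "big")  # duplicates do not change the estimate
for i in range(10):
    counter.insert(f"y{i}", "small")

big_estimate = counter.cardinality("big")
small_estimate = counter.cardinality("small")
```

In this sketch a small multiset only adds its id to a handful of positions, so its cost stays small, while the estimate for a large multiset degrades once most positions contain its id.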

YMMV.
