How to check (non-trivial) equivalence of lists of numbers quickly?

I have a list of integers, for example 1,2,2,3,4,1. I need to be able to check equivalence (==) between different lists.

However, I do not mean a plain number-by-number comparison. Each of these lists actually denotes a partition of a set: the position in the list is the index of an element, and the number is the index of the group that element belongs to. In the example above, elements 0 and 5 are in the same group, elements 1 and 2 are in the same group, and elements 3 and 4 are each in their own group. The actual group index does not matter, only the grouping.

I need to be able to check equivalence in this sense, so, for example, the previous list would be equivalent to 5,3,3,2,9,5, since they have the same grouping.

My current approach reduces each array to a normal form. I find all the numbers that have the same value as the first number and set them all to 0. Then I continue along the list until I find a new number, find all the numbers with that value, and set them all to 1. I continue in this way.

In my example, both lists reduce to 0,1,1,2,3,0, and then of course I can use a simple comparison to check that they are equivalent.

However, this is rather slow, since I have to make several linear passes over the list. So, to cut to the chase, is there a more efficient way to reduce these lists to this normal form?

Or, more generally, can I avoid this reduction altogether and compare the arrays in a different, possibly more efficient, way?

Implementation details

  • These arrays are actually implemented as bitsets to save space, so I really do have to iterate over the entire list every time, since no rb_tree-esque hashing takes place.
  • Large numbers of these arrays will be stored in an STL unordered_set, so the hashing requirement has to be taken into account.

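You can build a mapping from the values of one list to the values of the other (for example with std::map) while walking both lists in parallel. The lists are equivalent if the mapping stays consistent and one-to-one. For example:

1,2,2,3,4,1
5,3,3,2,9,5

This gives 1 -> 5, 2 -> 3, 3 -> 2, 4 -> 9, which is consistent. A counter-example:

5,3,3,2,9,5
1,2,2,3,2,1

Here 5 -> 1, 3 -> 2, 2 -> 3, 9 -> 2, so 2 is the target of both 3 and 9; the mapping is not one-to-one, and the lists are not equivalent.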
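The two-list check illustrated by these examples can be sketched in C++ as follows. This is a minimal sketch, not a definitive implementation: the function name same_grouping is hypothetical, the lists are assumed to be std::vector<int>, and a second map is kept for the reverse direction so that two different labels cannot map to the same target.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Returns true if `a` and `b` describe the same grouping, i.e. there is a
// one-to-one mapping between the group labels of `a` and those of `b`.
// (same_grouping is a hypothetical name used only for this sketch.)
bool same_grouping(const std::vector<int>& a, const std::vector<int>& b) {
  if (a.size() != b.size()) return false;

  std::map<int, int> a_to_b;  // label in a -> label in b
  std::map<int, int> b_to_a;  // label in b -> label in a

  for (std::size_t i = 0; i < a.size(); ++i) {
    // emplace only inserts if the key is absent; either way the returned
    // iterator points at the entry stored for that key.
    auto fwd = a_to_b.emplace(a[i], b[i]).first;
    auto bwd = b_to_a.emplace(b[i], a[i]).first;
    // Both directions must agree with the current pair of labels.
    if (fwd->second != b[i] || bwd->second != a[i]) return false;
  }
  return true;
}
```

With std::map each lookup costs O(log K) for K distinct labels, so the whole check is O(N log K); it can also stop at the first mismatch, which a full normalization cannot do.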



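With N elements and K distinct groups, the normal form can be computed in a single pass: O(N) on average with a hash map, or O(N log K) with an ordered map.

For example, with a hash map: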

#include <cstddef>
#include <unordered_map>
#include <vector>

// `array` is the input list of group indices
std::unordered_map<std::size_t, std::size_t> map;

std::vector<std::size_t> signature;
signature.reserve(array.size());

for (std::size_t i : array) {
  // insert only inserts if the key is not already present;
  // it returns std::pair<iterator, bool> with the iterator pointing
  // to the pair {key: i, value: index}
  std::size_t index = map.insert({i, map.size()}).first->second;
  signature.push_back(index);
}
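To connect this with the unordered_set requirement from the question, the loop above can be wrapped in a function and the resulting signature used as the set key. The sketch below is only one possible arrangement: normalize, Signature, SignatureHash and insert_grouping are hypothetical names, the input is assumed to be a std::vector<std::size_t> rather than the bitset representation mentioned in the question, and the hash-combining formula is a common idiom rather than anything taken from the answer.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using Signature = std::vector<std::size_t>;

// Relabels groups by order of first appearance,
// e.g. 5,3,3,2,9,5 -> 0,1,1,2,3,0.
Signature normalize(const std::vector<std::size_t>& array) {
  std::unordered_map<std::size_t, std::size_t> map;
  Signature signature;
  signature.reserve(array.size());
  for (std::size_t value : array) {
    // A new label gets the next free index; a known label keeps its index.
    signature.push_back(map.insert({value, map.size()}).first->second);
  }
  return signature;
}

// Hypothetical hash functor: folds the element hashes into one value.
struct SignatureHash {
  std::size_t operator()(const Signature& s) const {
    std::size_t seed = s.size();
    for (std::size_t v : s) {
      seed ^= std::hash<std::size_t>{}(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }
    return seed;
  }
};

// Stores the normal form of `array`; returns true if this grouping
// had not been seen before.
bool insert_grouping(std::unordered_set<Signature, SignatureHash>& seen,
                     const std::vector<std::size_t>& array) {
  return seen.insert(normalize(array)).second;
}
```

Because equivalent lists share one normal form, the default operator== on std::vector is enough for the set's equality check; only the hash has to be supplied.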



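A hash-like signature can be built from positions rather than labels: for each group, in order of first appearance, write down the (1-based) positions of its members.

Example: [1 2 2 3 4 1] hashes to 162345 (positions 1 and 6, then 2 and 3, then 4, then 5).

On its own this does not show where one group ends and the next begins, so append the size of each group after its positions. For example,

[1 2 2 3 4 1] → 1622324151 (positions 1 and 6, size 2; positions 2 and 3, size 2; position 4, size 1; position 5, size 1)

[5 5 5 9 9 9] → 12334563

[1 2 3 4 5 6] → 112131415161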
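The three arrow examples above are consistent with a signature that lists, for each group in order of first appearance, the 1-based positions of its members followed by the group's size. A minimal C++ sketch under that reading follows; the function name position_signature and the use of std::vector<int> and std::string are assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Builds the position-based signature: for each group, in order of first
// appearance, its 1-based member positions followed by the group size.
// (position_signature is a hypothetical name used only for this sketch.)
std::string position_signature(const std::vector<int>& list) {
  std::map<int, std::size_t> slot;               // label -> index into groups
  std::vector<std::vector<std::size_t>> groups;  // member positions per group

  for (std::size_t i = 0; i < list.size(); ++i) {
    // emplace inserts only for a label not seen before.
    auto result = slot.emplace(list[i], groups.size());
    if (result.second) groups.emplace_back();
    groups[result.first->second].push_back(i + 1);
  }

  std::string out;
  for (const auto& positions : groups) {
    for (std::size_t p : positions) out += std::to_string(p);
    out += std::to_string(positions.size());
  }
  return out;
}
```

Plain digit concatenation stops being unambiguous once positions or sizes need more than one digit, so for lists longer than nine elements a separator between the numbers would be needed.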


If you know the maximum possible "group" index, you can do something like this (pseudocode, but you should get the idea):

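// mapping and reverse start out all zero; group indices are 1..maxGroup,
// so 0 means "not mapped yet"
for i = 0; i < listLength; i++
    if !mapping[list1[i]]
        mapping[list1[i]] = list2[i]
    if !reverse[list2[i]]
        reverse[list2[i]] = list1[i]
    // both directions must agree, otherwise two groups would be merged
    if mapping[list1[i]] != list2[i] or reverse[list2[i]] != list1[i]
        return false
return true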
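A runnable C++ version of this idea, using dense lookup tables and checking the mapping in both directions, might look like the sketch below. The function name same_grouping_bounded and the max_group parameter are hypothetical; every label is assumed to lie in the range 0 to max_group, and a sentinel value outside that range marks unused table entries so that 0 remains usable as a label.

```cpp
#include <cstddef>
#include <vector>

// Compares two groupings whose labels all lie in [0, max_group].
// (same_grouping_bounded and max_group are hypothetical names.)
bool same_grouping_bounded(const std::vector<std::size_t>& list1,
                           const std::vector<std::size_t>& list2,
                           std::size_t max_group) {
  if (list1.size() != list2.size()) return false;

  const std::size_t unset = max_group + 1;  // no valid label equals this
  std::vector<std::size_t> forward(max_group + 1, unset);   // list1 -> list2
  std::vector<std::size_t> backward(max_group + 1, unset);  // list2 -> list1

  for (std::size_t i = 0; i < list1.size(); ++i) {
    const std::size_t a = list1[i];
    const std::size_t b = list2[i];
    // at() throws std::out_of_range if a label violates the precondition.
    if (forward.at(a) == unset) forward[a] = b;
    if (backward.at(b) == unset) backward[b] = a;
    // A mismatch in either direction means the groupings differ.
    if (forward[a] != b || backward[b] != a) return false;
  }
  return true;
}
```

The tables make every lookup O(1), at the cost of memory proportional to max_group, so this variant is mainly attractive when the label range is small compared with the number of lists being compared.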