Data structure for storing unique elements

I'm looking for a data structure that should preferably perform equal O (1)? for any number of elements when adding / removing / extracting elements.

Here are some additional guidelines.

  • Retrieving items should not include slow keys() Items
  • must always be unique and specific
  • the order of the elements is not significant
  • Adding or removing an element should not include iterating over other elements.
  • spaces in the extracted list of elements are valid and can be represented with undef value

Please suggest a better solution than

 sub uniqArrayFactory { my $members = []; my $seen = {}; my $gaps = []; return sub { my (%arg) = @_; return $members if $arg{members}; my $m; if (defined ($m = $arg{del})) { return if !$seen->{$m}; ${ $seen->{$m} } = undef; push @$gaps, delete($seen->{$m}); } elsif (defined ($m = $arg{add})) { return if $seen->{$m}; if (@$gaps) { $seen->{$m} = pop @$gaps; ${ $seen->{$m} } = $m; } else { push @$members, $m; $seen->{$m} = \( $members->[-1] ); } } return $m; }; } 

UPDATE (use)

 my $fa = uniqArrayFactory(); $fa->(add => 10); $fa->(del => 10); my $members = $fa->(mebers => 1); 
+6
source share
2 answers

keys and each surprisingly slow. But if you store each element as a hash value and use values , things get faster. FROM

 use strict; use warnings; use Benchmark qw(:all); my $i; my $fa; my %hash; my %compare = ( uarray => sub { $fa->(add => $i++); my $memb = $fa->(members => 1); for my $v (@$memb) { next if !defined $v; } }, hash => sub { $hash{ $i } = $i; for my $v (values %hash) {} $i++; }, ); $i = 0; $fa = uniqArrayFactory(); %hash = (); cmpthese(10000, \%compare); sub uniqArrayFactory { my $members = []; my $seen = {}; my $gaps = []; return sub { my (%arg) = @_; return $members if exists $arg{members}; my $m; if (defined ($m = $arg{del})) { return if !$seen->{$m}; ${ $seen->{$m} } = undef; push @$gaps, delete($seen->{$m}); } elsif (defined ($m = $arg{add})) { return if $seen->{$m}; if (@$gaps) { $seen->{$m} = pop @$gaps; ${ $seen->{$m} } = $m; } else { push @$members, $m; $seen->{$m} = \( $members->[-1] ); } } return $m; }; } 

I get:

  Rate hash uarray hash 3205/s -- -6% uarray 3401/s 6% -- 
+2
source

Ironically, perhaps Tie::IxHash , which was motivated by the desire to get the hash keys in that order, is as close as you are going to get to what you want.

In the Tie::IxHash keys and values ​​are stored in array references. keys returns a copy of the key set, but something like (tied %hash)->[1] will give you direct access to it.

Removing elements in Tie::IxHash is O (n). A possible workaround for this would be to replace the undef values ​​rather than deleting them. That is, preferring

 $ixhash{$obsolete_key} = undef; 

to

 delete $ixhash{$obsolete_key}; 

Or, if you could combine your deletions - if you could organize your code so that you would usually call delete several keys at the same time and between other hash operations - then there is room for improvement on Tie::IxHash .

+1
source

All Articles