What is the time complexity of the Array # uniq method in ruby?

Can someone tell me which algorithm internally uses ruby ​​to remove duplicates from the ruby ​​array using the Array#uniq method?

+7
source share
5 answers

From docs :

 static VALUE rb_ary_uniq(VALUE ary) { VALUE hash, uniq, v; long i; if (RARRAY_LEN(ary) <= 1) return rb_ary_dup(ary); if (rb_block_given_p()) { hash = ary_make_hash_by(ary); uniq = ary_new(rb_obj_class(ary), RHASH_SIZE(hash)); st_foreach(RHASH_TBL(hash), push_value, uniq); } else { hash = ary_make_hash(ary); uniq = ary_new(rb_obj_class(ary), RHASH_SIZE(hash)); for (i=0; i<RARRAY_LEN(ary); i++) { st_data_t vv = (st_data_t)(v = rb_ary_elt(ary, i)); if (st_delete(RHASH_TBL(hash), &vv, 0)) { rb_ary_push(uniq, v); } } } ary_recycle_hash(hash); return uniq; 

It has O(N) complexity

+5
source

Amortized O (n) as it uses hash inside .

+3
source

It depends on what "internal" you say. There are 7 Ruby-ready implementations in current use, and the Ruby language specification does not prescribe any particular algorithm. So it really depends on the implementation.

For example, this Rubinius implementation uses :

 Rubinius.check_frozen if block_given? im = Rubinius::IdentityMap.from(self, &block) else im = Rubinius::IdentityMap.from(self) end return if im.size == size array = im.to_array @tuple = array.tuple @start = array.start @total = array.total self 

And this is one from JRuby :

 RubyHash hash = makeHash(); if (realLength == hash.size()) return makeShared(); RubyArray result = new RubyArray(context.runtime, getMetaClass(), hash.size()); int j = 0; try { for (int i = 0; i < realLength; i++) { IRubyObject v = elt(i); if (hash.fastDelete(v)) result.values[j++] = v; } } catch (ArrayIndexOutOfBoundsException aioob) { concurrentModification(); } result.realLength = j; return result; 
+3
source

It compares elements using their hash (provided by the Object#hash method) then compares hashes with Object#eql?.

Source: https://github.com/ruby/ruby/blob/trunk/array.c#L3976

+1
source

The complexity of time is linear time, that is, O (n), because it uses Hash for internal implementation of the algorithm.

+1
source

All Articles