What is the purpose of SizeHint in Iterator :: unzip?

From the Rust unzip standard library:

 fn unzip<A, B, FromA, FromB>(self) -> (FromA, FromB) where FromA: Default + Extend<A>, FromB: Default + Extend<B>, Self: Sized + Iterator<Item=(A, B)>, { struct SizeHint<A>(usize, Option<usize>, marker::PhantomData<A>); impl<A> Iterator for SizeHint<A> { type Item = A; fn next(&mut self) -> Option<A> { None } fn size_hint(&self) -> (usize, Option<usize>) { (self.0, self.1) } } let (lo, hi) = self.size_hint(); let mut ts: FromA = Default::default(); let mut us: FromB = Default::default(); ts.extend(SizeHint(lo, hi, marker::PhantomData)); us.extend(SizeHint(lo, hi, marker::PhantomData)); for (t, u) in self { ts.extend(Some(t)); us.extend(Some(u)); } (ts, us) } 

These two lines:

 ts.extend(SizeHint(lo, hi, marker::PhantomData)); us.extend(SizeHint(lo, hi, marker::PhantomData)); 

don't actually extend ts or us anything, as the next SizeHint method returns None . What is the purpose of this?

+7
rust
source share
2 answers

This is a cool trick. Providing a hint of this size, it gives ts and us ability to reserve space for extend calls in a loop. According to the documentation

size_hint() is primarily intended to be used for optimization, such as reserving space for iterator elements, but should not be trusted, for example, omitting border checks in unsafe code. An incorrect implementation of size_hint() should not lead to memory security violations.

Note that creating SizeHint necessary because the extend loop for the call is done using the Some value ( Optional implements the Iterator trait), and size_hint for a Some to (1, Some(1)) . This does not help with pre-placement.

But looking at the Vec code, this will have no effect (neither in HashMap and VecDeque ). Other implementations of extend may vary.

Executing ts.extend(SizeHint(lo, hi, marker::PhantomData)); does not call resize since next returns None . Maybe someone should write a patch.

 impl<T> Vec<T> { fn extend_desugared<I: Iterator<Item = T>>(&mut self, mut iterator: I) { // This function should be the moral equivalent of: // // for item in iterator { // self.push(item); // } while let Some(element) = iterator.next() { let len = self.len(); if len == self.capacity() { let (lower, _) = iterator.size_hint(); self.reserve(lower.saturating_add(1)); } unsafe { ptr::write(self.get_unchecked_mut(len), element); // NB can't overflow since we would have had to alloc the address space self.set_len(len + 1); } } } } 
+5
source share

This is a dubious hack!

It implements an iterator with a fake (oversized) size in order to induce the collection in production to ultimately reserve a suitable container in front.

A cool trick, but it does this by implementing a size hint where the estimated lower bound is greater than the actual number of elements produced (0). If the lower bound is not known, the iterator should return the lower bound of 0. This implementation may be very buggy for this reason, and the Extend impl collection may respond with an error as a result (but, of course, not without memory).

+4
source share

All Articles