(Re) Using std :: algorithms with non-standard containers

Question

(Re) Using std :: algorithms with non-standard containers

I have a container type "column":

struct MyColumnType { // Data: Each row represents a member of an object. vector<double> a; // All vectors are guaranteed to have always vector<string> b; // the same length. vector<int> c; void copy(int from_pos, int to_pos); // The column type provides an interface void swap(int pos_a, int pos_b); // for copying, swapping, ... void push_back(); // And for resizing the container. void pop_back(); void insert(int pos); void remove(int pos); // The interface can be extended/modified if required };

Application:

 // If table is a constructed container with elements stored // To acces the members of the object stored at the 4th position: table.a[4] = 4.0; table.b[4] = "4th"; table.c[4] = 4;

Question: How to create a standardized random access iterator (and, probably, the necessary proxy reference type) for this container type?

I want to use std::algorithms for random access iterators with my type, for example. sort (note: a user-defined functor, for example, lambda, will be provided to sort the comparison).

In particular, an iterator should be provided with an interface similar to

 struct { double& a; string& b; int& c; };

Note 0: Allowed C ++ 11 / C ++ 14.

Note 1: There is an old paper http://hci.iwr.uni-heidelberg.de/vigra/documents/DataAccessors.ps where a similar attempt is made. However, I was not able to get their approach to work with something. Requirements such as defaultConstructible are hard to satisfy with a proxy-type approach (why std::sort requires that default types be constructive and not replaceable, not understandable).

Note 2: I cannot do the following:

 struct MyType { double a; string b; int c; }; std::vector<MyType> v;

and then use std::algorithm .

Motivation:. The cache line is usually 64 bytes, i.e. 8 doubles. In this simple structure, if you iterate over doubles, you pollute the cache line with the line a int. In other cases, you can only get 1 double pass in the cache line. That is, you end up using the 1 / 8th memory bandwidth. If you need to repeat a pair of doubles, this simple solution improves the performance of your application by 6-7 times. And no, I cannot allow this.

Bonus: the answer should be as general as possible. Think of adding / removing fields to a container type, like adding / removing members to a structure. You do not want to change a lot of code every time you add a new member.

+8

c ++ iterator algorithm proxy

gnzlbg Jun 01 '13 at 15:18

source share

2 answers

If you are really concerned about performance and want to sort your container using std::sort , use overloading that allows you to create a custom comparison object:

 template <class RandomAccessIterator, class Compare> void sort (RandomAccessIterator first, RandomAccessIterator last, Compare comp);

.. and sort the array of indices in the container. Here's how:

In your container you will need the following members:

 struct MyColumnType { ... int size() const; // swaps columns void swap(int l, int r); // returns true if column l is less than column r bool less(int l, int r) const; ... };

Then define the following comparison object:

 struct MyColumnTypeLess { const MyColumnType* container; MyColumnTypeLess(const MyColumnType* container) : container(container) { } bool operator()(int l, int r) const { return container->less(l, r); } };

And use it to sort the array of indices:

 void sortMyColumnType(MyColumnType& container) { std::vector<int> indices; indices.reserve(container.size()); // fill with [0, n) for(int i = 0; i != container.size(); ++i) { indices.push_back(i); } // sort the indices std::sort(indices.begin(), indices.end(), MyColumnTypeLess(&container)); }

The less element in the container controls the sort order:

 bool MyColumnType::less(int l, int r) const { // sort first by a, then b, then c return a[l] != a[r] ? a[l] < a[r] : b[l] != b[r] ? b[l] < b[r] : c[l] < c[r]; }

A sorted array of indices can be used in subsequent algorithms - you can avoid copying the actual data until you need it.

All std algorithms that work with RandomAccessIterators have overloads that allow you to specify custom comparison objects, so they can also be used with this technique.

0

willj Jun 03 '13 at 21:46

source share

dyp · Accepted Answer · 2013-06-03 21:05

I think something like this might be standard. It uses some of the features of C ++ 11 to simplify the syntax, but can also be modified to match C ++ 03 AFAIK.

Tested and works with clang ++ 3.2

Prelude:

 #include <vector> #include <string> #include <utility> // for std::swap #include <iterator> using std::vector; using std::string; // didn't want to insert all those types as nested classes of MyColumnType namespace MyColumnType_iterator { struct all_copy; struct all_reference; struct all_iterator; } // just provided `begin` and `end` member functions struct MyColumnType { // Data: Each row represents a member of an object. vector<double> a; // All vectors are guaranteed to have always vector<string> b; // the same length. vector<int> c; void copy(int from_pos, int to_pos); // The column type provides an itface void swap(int pos_a, int pos_b); // for copying, swapping, ... void push_back(); // And for resizing the container. void pop_back(); void insert(int pos); void remove(int pos); // The interface can be extended/modified if required using iterator = MyColumnType_iterator::all_iterator; iterator begin(); iterator end(); };

Iteration classes: a value_type ( all_copy ), a reference type ( all_reference ) and iterator type ( all_iterator ). Iteration is performed by saving and updating three iterators (one for each vector ). However, I do not know if this is the most effective option.

How it works: std::iterator_traits defines several related types for an iterator: [Iterator.traits] / 1

iterator_traits<Iterator>::difference_type
iterator_traits<Iterator>::value_type
iterator_traits<Iterator>::iterator_category
defined as the type of iterator difference, value type, and iterator category, respectively. Also types
iterator_traits<Iterator>::reference
iterator_traits<Iterator>::pointer
iterators iterators must be defined as links, i.e. for an iterator object a, of the same type as type *a and a-> , respectively

Therefore, you can enter struct ( all_reference ), storing the three links as a reference . This type is the return value *a , where a is an iterator type (possibly const -qualified). There must be a different value_type , because some standard library algorithms, such as sort , may want to create a local variable that temporarily stores *a (by copying or moving to a local variable). In this case, all_copy provides this functionality.

You are not required to use it ( all_copy ) in your own loops, where this may affect performance.

 namespace MyColumnType_iterator { struct all_copy; struct all_reference { double& a; string& b; int& c; all_reference() = delete; // not required for std::sort, but stream output is simpler to write // with this all_reference(all_reference const&) = default; all_reference(double& pa, string& pb, int& pc) : a{pa} , b{pb} , c{pc} {} // MoveConstructible required for std::sort all_reference(all_reference&& other) = default; // MoveAssignable required for std::sort all_reference& operator= (all_reference&& other) { a = std::move(other.a); b = std::move(other.b); c = std::move(other.c); return *this; } // swappable required for std::sort friend void swap(all_reference p0, all_reference p1) { std::swap(p0.a, p1.a); std::swap(p0.b, p1.b); std::swap(p0.c, p1.c); } all_reference& operator= (all_copy const& p) = default; all_reference& operator= (all_copy&& p) = default; // strict total ordering required for std::sort friend bool operator< (all_reference const& lhs, all_reference const& rhs); friend bool operator< (all_reference const& lhs, all_copy const& rhs); friend bool operator< (all_copy const& lhs, all_reference const& rhs); }; struct all_copy { double a; string b; int c; all_copy(all_reference const& p) : a{pa} , b{pb} , c{pc} {} all_copy(all_reference&& p) : a{ std::move(pa) } , b{ std::move(pb) } , c{ std::move(pc) } {} };

There must be a comparison function for std::sort . For some reason, we must provide all three.

  bool operator< (all_reference const& lhs, all_reference const& rhs) { return lhs.c < rhs.c; } bool operator< (all_reference const& lhs, all_copy const& rhs) { return lhs.c < rhs.c; } bool operator< (all_copy const& lhs, all_reference const& rhs) { return lhs.c < rhs.c; }

Now the iterator class:

  struct all_iterator : public std::iterator < std::random_access_iterator_tag, all_copy > { //+ specific to implementation private: using ItA = std::vector<double>::iterator; using ItB = std::vector<std::string>::iterator; using ItC = std::vector<int>::iterator; ItA iA; ItB iB; ItC iC; public: all_iterator(ItA a, ItB b, ItC c) : iA(a) , iB(b) , iC(c) {} //- specific to implementation //+ for iterator_traits using reference = all_reference; using pointer = all_reference; //- for iterator_traits //+ iterator requirement [iterator.iterators]/1 all_iterator(all_iterator const&) = default; // CopyConstructible all_iterator& operator=(all_iterator const&) = default; // CopyAssignable ~all_iterator() = default; // Destructible void swap(all_iterator& other) // lvalues are swappable { std::swap(iA, other.iA); std::swap(iB, other.iB); std::swap(iC, other.iC); } //- iterator requirements [iterator.iterators]/1 //+ iterator requirement [iterator.iterators]/2 all_reference operator*() { return {*iA, *iB, *iC}; } all_iterator& operator++() { ++iA; ++iB; ++iC; return *this; } //- iterator requirement [iterator.iterators]/2 //+ input iterator requirements [input.iterators]/1 bool operator==(all_iterator const& other) const // EqualityComparable { return iA == other.iA; // should be sufficient (?) } //- input iterator requirements [input.iterators]/1 //+ input iterator requirements [input.iterators]/2 bool operator!=(all_iterator const& other) const // "UnEqualityComparable" { return iA != other.iA; // should be sufficient (?) } all_reference const operator*() const // *a { return {*iA, *iB, *iC}; } all_reference operator->() // a->m { return {*iA, *iB, *iC}; } all_reference const operator->() const // a->m { return {*iA, *iB, *iC}; } // ++r already satisfied all_iterator operator++(int) // *++r { all_iterator temp(*this); ++(*this); return temp; } //- input iterator requirements [input.iterators]/2 //+ output iterator requirements [output.iterators]/1 // *r = o already satisfied // ++r already satisfied // r++ already satisfied // *r++ = o already satisfied //- output iterator requirements [output.iterators]/1 //+ forward iterator requirements [forward.iterators]/1 all_iterator() = default; // DefaultConstructible // r++ already satisfied // *r++ already satisfied // multi-pass must be guaranteed //- forward iterator requirements [forward.iterators]/1 //+ bidirectional iterator requirements [bidirectional.iterators]/1 all_iterator& operator--() // --r { --iA; --iB; --iC; return *this; } all_iterator operator--(int) // r-- { all_iterator temp(*this); --(*this); return temp; } // *r-- already satisfied //- bidirectional iterator requirements [bidirectional.iterators]/1 //+ random access iterator requirements [random.access.iterators]/1 all_iterator& operator+=(difference_type p) // r += n { iA += p; iB += p; iC += p; return *this; } all_iterator operator+(difference_type p) const // a + n { all_iterator temp(*this); temp += p; return temp; } // doesn't have to be a friend function, but this way, // we can define it here friend all_iterator operator+(difference_type p, all_iterator temp) // n + a { temp += p; return temp; } all_iterator& operator-=(difference_type p) // r -= n { iA -= p; iB -= p; iC -= p; return *this; } all_iterator operator-(difference_type p) const // a - n { all_iterator temp(*this); temp -= p; return temp; } difference_type operator-(all_iterator const& p) // b - a { return iA - p.iA; // should be sufficient (?) } all_reference operator[](difference_type p) // a[n] { return *(*this + p); } all_reference const operator[](difference_type p) const // a[n] { return *(*this + p); } bool operator<(all_iterator const& p) const // a < b { return iA < p.iA; // should be sufficient (?) } bool operator>(all_iterator const& p) const // a > b { return iA > p.iA; // should be sufficient (?) } bool operator>=(all_iterator const& p) const // a >= b { return iA >= p.iA; // should be sufficient (?) } bool operator<=(all_iterator const& p) const // a >= b { return iA <= p.iA; // should be sufficient (?) } //- random access iterator requirements [random.access.iterators]/1 }; }//- namespace MyColumnType_iterator MyColumnType::iterator MyColumnType::begin() { return { a.begin(), b.begin(), c.begin() }; } MyColumnType::iterator MyColumnType::end() { return { a.end(), b.end(), c.end() }; }

Usage example:

 #include <iostream> #include <cstddef> #include <algorithm> namespace MyColumnType_iterator { template < typename char_type, typename char_traits > std::basic_ostream < char_type, char_traits >& operator<< (std::basic_ostream < char_type, char_traits >& o, std::iterator_traits<MyColumnType::iterator>::reference p) { return o << pa << ";" << pb << ";" << pc; } } int main() { using std::cout; MyColumnType mct = { {1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1} , {"j", "i", "h", "g", "f", "e", "d", "c", "b", "a"} , {10, 9, 8, 7, 6, 5, 4, 3, 2, 1} }; using ref = std::iterator_traits<MyColumnType::iterator>::reference; std::copy(mct.begin(), mct.end(), std::ostream_iterator<ref>(cout, ", ")); std::cout << std::endl; std::sort(mct.begin(), mct.end()); std::copy(mct.begin(), mct.end(), std::ostream_iterator<ref>(cout, ", ")); std::cout << std::endl; }

Output:

one; j; 10, 0.9, 1.9, 0.8, h, 8, 0.7, g, 7, 0.6, f, 6, 0.5, e, 5, 0.4, d, 4, 0.3; c ;, 0.2; b; 2, 0.1; a; one,
0,1, a, 1, 0.2, b, 2, 0.3, c 3, 0.4, d 4, 0.5, e 5, 0.6, f 6, 0.7, g, 7, 0.8, h, 8, 0.9; i; 9, 1; j; 10,

(Re) Using std :: algorithms with non-standard containers

More articles: