The fastest way to reset values โ€‹โ€‹in each structural element of a vector?

I really like this question , except that instead of vector<int> I have vector<struct myType> .

If I want to reset (or, for that matter, set some value) myType.myVar for each element of the vector, then what is the most efficient method?

Now I repeat:

 for(int i=0; i<myVec.size(); i++) myVec.at(i).myVar = 0; 

But since vectors are guaranteed to be kept in contact, of course, the best way?

+8
c ++ performance vector
source share
6 answers

A reset will have to intersect each element of the vector, so it will need at least O (n) complexity. Your current algorithm accepts O (n).

In this particular case, you can use operator[] instead of at (which may throw an exception). But I doubt that this is a bottleneck in your application.

In this post, you should probably use std::fill :

 std::fill(myVec.begin(), myVec.end(), 0); 

But if you do not want to go to the byte level and set a piece of memory to 0 , which will not only cause you headaches, but in most cases will lose portability, there is nothing to improve.

+7
source share

Instead of the code below

 for(int i=0; i<myVec.size(); i++) myVec.at(i).myVar = 0; 

do it like this:

 size_t sz = myVec.size(); for(int i=0; i<sz; ++i) myVec[i].myVar = 0; 

How the "at" method internally checks to see if the pointer is out of range or not. But as your loop index takes care of (myVec.size ()), you can avoid extra checking. Otherwise, this is the fastest way to do this.

EDIT

In addition to this, we can save the size () of the vector before executing the for loop. This would ensure that there is no further call to the size () method in the for loop.

+6
source share

One of the fastest ways would be to unwind the loop and break the speed limit created by the usual for loops, which cause large "strong" money leaks. In your case, since it's runtime, there is no way to apply template metaprogramming, so variation on the old Duff device will do the trick

 #include <iostream> #include <vector> using namespace std; struct myStruct { int a; double b; }; int main() { std::vector<myStruct> mV(20); double val(20); // the new value you want to reset to int n = (mV.size()+7) / 8; // 8 is the size of the block unroll auto to = mV.begin(); // switch(mV.size()%8) { case 0: do { (*to++).b = val; case 7: (*to++).b = val; case 6: (*to++).b = val; case 5: (*to++).b = val; case 4: (*to++).b = val; case 3: (*to++).b = val; case 2: (*to++).b = val; case 1: (*to++).b = val; } while (--n>0); } // just printing to verify that the value is set for (auto i : mV) std::cout << ib << std::endl; return 0; } 

Here I choose to execute an 8-block unwind, until the value of (say) b of the myStruct structure is myStruct . The block size can be changed, and the loops will be effectively deployed. Remember that this is the basic method in memcpy and one of the optimizations (in the general case, the loop unfolding) that the compiler will try (in fact, they are pretty good at this, so we could also let them do their work).

+2
source share

In addition to what was said earlier, you should think that if you turn on optimization, the compiler will most likely perform a loop-round, which will make the loop faster.

+1
source share

Also, pre increment ++i accepts fewer instructions than post increment i++ . Explanation here

+1
source share

Beware of wasting a lot of time thinking about the optimization details that the compiler will simply take care of you.

Here are four implementations of what I understand as OP, as well as code generated using gcc 4.8 with --std=c++11 -O3 -S

Ads:

 #include <algorithm> #include <vector> struct T { int irrelevant; int relevant; double trailing; }; 

Explicit loop implementations, roughly from the answers and comments provided by the OP. Both produce identical machine code besides labels.

  .cfi_startproc movq (%rdi), %rsi void clear_relevant(std::vector<T>* vecp) { movq 8(%rdi), %rcx for(unsigned i=0; i<vecp->size(); i++) { xorl %edx, %edx vecp->at(i).relevant = 0; xorl %eax, %eax } subq %rsi, %rcx } sarq $4, %rcx testq %rcx, %rcx je .L1 .p2align 4,,10 .p2align 3 .L5: void clear_relevant2(std::vector<T>* vecp) { salq $4, %rdx std::vector<T>& vec = *vecp; addl $1, %eax auto s = vec.size(); movl $0, 4(%rsi,%rdx) for (unsigned i = 0; i < s; ++i) { movl %eax, %edx vec[i].relevant = 0; cmpq %rcx, %rdx } jb .L5 } .L1: rep ret .cfi_endproc 

Two other versions, one using std::for_each and the other using the syntax for range. There is a slight difference in the code for the two versions (except for the labels):

  .cfi_startproc movq 8(%rdi), %rdx movq (%rdi), %rax cmpq %rax, %rdx je .L17 void clear_relevant3(std::vector<T>* vecp) { .p2align 4,,10 for (auto& p : *vecp) p.relevant = 0; .p2align 3 } .L21: movl $0, 4(%rax) addq $16, %rax cmpq %rax, %rdx jne .L21 .L17: rep ret .cfi_endproc .cfi_startproc movq 8(%rdi), %rdx movq (%rdi), %rax cmpq %rdx, %rax void clear_relevant4(std::vector<T>* vecp) { je .L12 std::for_each(vecp->begin(), vecp->end(), .p2align 4,,10 [](T& o){o.relevant=0;}); .p2align 3 } .L16: movl $0, 4(%rax) addq $16, %rax cmpq %rax, %rdx jne .L16 .L12: rep ret .cfi_endproc 
+1
source share

All Articles