Hey, so your question is:
"When a function returns an instance of a class by value and the result is assigned to a reference to const, does this prevent the copy constructor from being called?"
Setting aside the lifetime of the temporary, since that is not what you asked, we can understand what is happening by looking at the assembly. I'm using clang, llvm 7.0.2.
Here is the standard case first: return by value, nothing unusual.
Test a
    class MyClass
    {
    public:
        MyClass();
        MyClass(const MyClass & source);
        long int m_tmp;
    };

    MyClass createMyClass();

    int main()
    {
        const MyClass myClass = createMyClass();
        return 0;
    }
If I compile with "-O0 -S -fno-elide-constructors", I get this.
    _main:
        pushq   %rbp                    # Boiler plate
        movq    %rsp, %rbp              # Boiler plate
        subq    $32, %rsp               # Reserve 32 bytes for stack frame
        leaq    -24(%rbp), %rdi         # arg0 = &___temp_items = rdi = rbp-24
        movl    $0, -4(%rbp)            # *(rbp-4) = 0, clang's -O0 slot for main's return value
        callq   __Z13createMyClassv     # createMyClass(arg0)
        leaq    -16(%rbp), %rdi         # arg0 = &myClass
        leaq    -24(%rbp), %rsi         # arg1 = &___temp_items
        callq   __ZN7MyClassC1ERKS_     # MyClass::MyClass(arg0, arg1)
        xorl    %eax, %eax              # eax = 0, the return value for main
        addq    $32, %rsp               # Pop stack frame
        popq    %rbp                    # Boiler plate
        retq

We are only looking at the calling code here. The implementation of createMyClass is not of interest; it is compiled elsewhere. So createMyClass constructs the object inside the temporary, and the temporary is then copied into myClass.
Simples.
What about the const ref version?
Test b
    class MyClass
    {
    public:
        MyClass();
        MyClass(const MyClass & source);
        long int m_tmp;
    };

    MyClass createMyClass();

    int main()
    {
        const MyClass & myClass = createMyClass();
        return 0;
    }
The same compiler options.
    _main:                              # Boiler plate
        pushq   %rbp                    # Boiler plate
        movq    %rsp, %rbp              # Boiler plate
        subq    $32, %rsp               # Reserve 32 bytes for the stack frame
        leaq    -24(%rbp), %rdi         # arg0 = &___temp_items = rdi = rbp-24
        movl    $0, -4(%rbp)            # *(rbp-4) = 0, clang's -O0 slot for main's return value
        callq   __Z13createMyClassv     # createMyClass(arg0)
        xorl    %eax, %eax              # eax = 0, the return value for main
        leaq    -24(%rbp), %rdi         # rdi = &___temp_items
        movq    %rdi, -16(%rbp)         # &myClass = rdi = &___temp_items
        addq    $32, %rsp               # Pop stack frame
        popq    %rbp                    # Boiler plate
        retq

There is no copy constructor call, so this is more optimal, right?
What happens if we drop "-fno-elide-constructors" for both versions, keeping -O0?
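To make the difference between the two unelided listings concrete, here is a hand-lowered sketch of what the caller does in each case. It is only an illustration under assumptions: createMyClassLowered, copiesTestA, copiesTestB and the static copies counter are hypothetical names that are not in the original code, and the constructor bodies are made up so the example links on its own.

```cpp
#include <cassert>
#include <new>

struct MyClass {
    MyClass() : m_tmp(42) {}
    MyClass(const MyClass& source) : m_tmp(source.m_tmp) { ++copies; }
    long int m_tmp;
    static int copies;  // instrumentation only, not part of the original class
};
int MyClass::copies = 0;

// Hypothetical lowered form of `MyClass createMyClass()`: the caller passes
// the address of the result storage (arg0 in the listings) and the callee
// constructs the object there.
MyClass* createMyClassLowered(void* result) {
    return new (result) MyClass();
}

// Test a without elision: construct into a temporary, then copy-construct
// myClass from it. Returns the number of copy constructor calls.
int copiesTestA() {
    MyClass::copies = 0;
    alignas(MyClass) unsigned char tempStorage[sizeof(MyClass)];
    alignas(MyClass) unsigned char myClassStorage[sizeof(MyClass)];
    MyClass* temp = createMyClassLowered(tempStorage);
    const MyClass* myClass = new (myClassStorage) MyClass(*temp);
    assert(myClass->m_tmp == 42);
    return MyClass::copies;
}

// Test b: the reference is just the temporary's address, nothing is copied.
int copiesTestB() {
    MyClass::copies = 0;
    alignas(MyClass) unsigned char tempStorage[sizeof(MyClass)];
    const MyClass& myClass = *createMyClassLowered(tempStorage);
    assert(myClass.m_tmp == 42);
    return MyClass::copies;
}
```

Calling copiesTestA() and copiesTestB() from a main of your own mirrors the two listings: the first performs one explicit copy construction, the second only stores an address.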
Test a
    _main:
        pushq   %rbp                    # Boiler plate
        movq    %rsp, %rbp              # Boiler plate
        subq    $16, %rsp               # Reserve 16 bytes for the stack frame
        leaq    -16(%rbp), %rdi         # arg0 = &myClass = rdi = rbp-16
        movl    $0, -4(%rbp)            # *(rbp-4) = 0, clang's -O0 slot for main's return value
        callq   __Z13createMyClassv     # createMyClass(arg0)
        xorl    %eax, %eax              # eax = 0, return value for main
        addq    $16, %rsp               # Pop stack frame
        popq    %rbp                    # Boiler plate
        retq
Clang removed the copy constructor call.
Test b
    _main:                              # Boiler plate
        pushq   %rbp                    # Boiler plate
        movq    %rsp, %rbp              # Boiler plate
        subq    $32, %rsp               # Reserve 32 bytes for the stack frame
        leaq    -24(%rbp), %rdi         # arg0 = &___temp_items = rdi = rbp-24
        movl    $0, -4(%rbp)            # *(rbp-4) = 0, clang's -O0 slot for main's return value
        callq   __Z13createMyClassv     # createMyClass(arg0)
        xorl    %eax, %eax              # eax = 0, return value for main
        leaq    -24(%rbp), %rdi         # rdi = &___temp_items
        movq    %rdi, -16(%rbp)         # &myClass = rdi
        addq    $32, %rsp               # Pop stack frame
        popq    %rbp                    # Boiler plate
        retq

Test b (assigning to a const reference) is the same as before. It now has more instructions than Test a.
What if we set the optimization to -O1?
    _main:
        pushq   %rbp                    # Boiler plate
        movq    %rsp, %rbp              # Boiler plate
        subq    $16, %rsp               # Reserve 16 bytes for the stack frame
        leaq    -8(%rbp), %rdi          # arg0 = &___temp_items = rdi = rbp-8
        callq   __Z13createMyClassv     # createMyClass(arg0)
        xorl    %eax, %eax              # eax = 0, return value for main
        addq    $16, %rsp               # Pop stack frame
        popq    %rbp                    # Boiler plate
        retq

Both source files turn into this when compiled with -O1. They produce the same assembly. This is also true for -O4.
The compiler does not know the contents of createMyClass, so it cannot optimize any further than this.
With the compiler that I use, you do not get a performance boost from assigning to a const ref.
I imagine the situation is similar for g++ and the Intel compiler, although it is always good to check.
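If you would rather check another compiler without reading assembly, one option is to count copy constructor calls at run time. The sketch below is an assumption-laden illustration, not the code measured above: Counted, createCounted and the two helper functions are hypothetical names, and the factory is defined in the same file, unlike createMyClass in the tests. With copy elision left enabled (the default), I would expect both helpers to report the same count; building with -fno-elide-constructors in a pre-C++17 mode should make them differ.

```cpp
struct Counted {
    Counted() : m_tmp(7) {}
    Counted(const Counted& source) : m_tmp(source.m_tmp) { ++copies; }
    long int m_tmp;
    static int copies;  // number of copy constructor calls so far
};
int Counted::copies = 0;

// Hypothetical stand-in for createMyClass(), returning by value.
Counted createCounted() { return Counted(); }

// Mirrors Test a: initialise a const object from the returned value.
int copiesWhenAssignedByValue() {
    Counted::copies = 0;
    const Counted c = createCounted();
    (void)c;  // silence unused-variable warnings
    return Counted::copies;
}

// Mirrors Test b: bind the returned temporary to a const reference.
int copiesWhenBoundToConstRef() {
    Counted::copies = 0;
    const Counted& c = createCounted();
    (void)c;  // silence unused-variable warnings
    return Counted::copies;
}
```

Printing the two return values from a main of your own under different flags gives a quick way to compare compilers.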