You seem to be afraid that elision will fail: that auto x = some_func(); will result in an extra move over auto&& x = some_func(); when some_func() returns a temporary object.
You should not be.
If elision fails, it means your compiler is incompetent or compiling with overtly hostile settings. And you cannot survive hostile settings or incompetent compilers: an incompetent compiler can turn a += b on integers into for (int i = 0; i < abs(b); ++i) { if (b > 0) ++a; else --a; } without violating the standard one iota.
Elision is a core language feature. Do not write bad code just because you do not trust it to happen.
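For illustration, a minimal sketch (the type and function names are mine): since C++17, initializing from a prvalue is guaranteed elision, so this compiles even for a type that can be neither copied nor moved:

```cpp
struct NoCopyNoMove {
    NoCopyNoMove() = default;
    NoCopyNoMove(const NoCopyNoMove&) = delete;
    NoCopyNoMove(NoCopyNoMove&&) = delete;
    int value = 42;
};

// Guaranteed elision since C++17: the prvalue is constructed
// directly in the caller, so no copy or move is ever needed.
NoCopyNoMove make() { return NoCopyNoMove{}; }

int main() {
    auto x = make();  // compiles: there is no extra move to "save"
    return x.value;
}
```

auto&& x = make(); would compile too, but it buys nothing: it binds to the same materialized object and merely gives x a reference type.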
You should capture by reference if you want a reference to the data provided by the function, and not an independent stable copy of that data. If you do not understand the lifetime of a function's return value, capturing by reference is simply unsafe.
Even if you know the data will be stable, more time is spent maintaining code than writing it: the person reading your code should be able to see immediately that your assumption holds. And non-local bugs are bad: a seemingly harmless change to the function you are calling should not break your code.
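A minimal sketch of that non-local-bug problem, with made-up names (current_status and refresh are hypothetical, not any real API):

```cpp
#include <string>

// Hypothetical service with an internal buffer (names are made up).
std::string& current_status() {
    static std::string status = "ok";
    return status;
}

void refresh() { current_status() = "rebuilding"; }

int main() {
    auto copy = current_status();        // independent, stable copy
    const auto& ref = current_status();  // aliases the internal buffer

    refresh();  // a seemingly harmless change elsewhere...

    // `copy` still says "ok"; `ref` now silently says "rebuilding".
    return copy == ref ? 1 : 0;
}
```

The by-value copy is immune to whatever other code does later; the reference silently changed meaning.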
Bottom line: take things by value unless you have a good reason not to.
Taking things by value makes it easier for you and the compiler to reason about your code. It increases locality.
When you have a good reason not to, take things by reference.
Depending on the context, that good reason may not need to be very strong. But it should never be based on assuming an incompetent compiler.
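As a sketch of the by-value default (Person is a made-up example), the usual take-by-value-then-move pattern:

```cpp
#include <string>
#include <utility>

// Taking the parameter by value: callers passing an lvalue pay one
// copy, callers passing a temporary pay nothing extra, and the body
// is trivial to reason about.
class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }

private:
    std::string name_;  // an independent copy the object owns
};

int main() {
    std::string n = "Ada";
    Person a(n);               // one copy into the parameter, then a move
    Person b("Grace Hopper");  // the temporary is moved, never copied
    return a.name().size() + b.name().size() > 0 ? 0 : 1;
}
```

The signature alone tells the reader the object keeps its own copy; no lifetime analysis of the caller is needed.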
Premature pessimization should be avoided, but this would be premature optimization. Taking (or storing) references instead of values should be something you do when and if you find a performance problem. Write clean, clear code. Push complexity into tightly written types, and keep the external interface clean and simple. Take things by value, because values are easy to reason about.
Optimization is fungible. By making more of your code simpler, you make your own work easier (and yourself more productive). Then, when you identify the spots where performance actually matters, you can spend effort there to make the code faster.
A great example is foundational code: foundational code (code that is used everywhere) quickly becomes a global drag on performance if it is not written with both performance and ease of use in mind. There, you want to hide the complexity inside the type and expose a simple, easy-to-use external interface that does not require the user to understand the guts.
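As a hedged illustration (SharedText is hypothetical, not a real library type), one way foundational code can bury its complexity behind a simple, value-like interface:

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical foundational type: an immutable string whose copies
// are cheap because they share one buffer. Callers get simple value
// semantics; the reference counting is hidden inside.
class SharedText {
public:
    SharedText(std::string s)
        : data_(std::make_shared<const std::string>(std::move(s))) {}

    const std::string& str() const { return *data_; }

private:
    std::shared_ptr<const std::string> data_;  // the guts, hidden
};

// Client code can pass it by value freely without knowing any of that.
std::size_t total_length(SharedText a, SharedText b) {
    return a.str().size() + b.str().size();
}

int main() {
    SharedText t("foundational");
    return total_length(t, t) == 24 ? 0 : 1;
}
```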
But some code in some random function? Use values, the easiest containers to use with the friendliest big-O for the most expensive operation you perform, and the simplest interfaces. Vectors if reasonable (avoid premature pessimization), but don't sweat a few maps.
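A small sketch of that container advice (the data is made up): reach for a vector by default, and accept a map where it reads better:

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

int main() {
    // Default choice: a vector, sorted once, binary-searched after.
    std::vector<int> ids = {7, 3, 9, 1};
    std::sort(ids.begin(), ids.end());                          // O(n log n), once
    bool found = std::binary_search(ids.begin(), ids.end(), 9); // O(log n)

    // But if a map reads more clearly, a few of them won't hurt.
    std::map<std::string, int> counts;
    ++counts["hits"];  // O(log n) per update

    return (found && counts["hits"] == 1) ? 0 : 1;
}
```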
Find the 1%-10% of your code that takes 90%-99% of the time, and make it fast. If the rest of your code has good big-O performance (so it won't get terribly slower on data sets larger than the ones you test with), you'll be in good shape. Then start testing with absurdly large data sets and find the slow parts.
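As a crude stand-in for a real profiler, a minimal timing sketch (the workload is an assumed placeholder, just to show the idea of watching how cost scales with input size):

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <numeric>
#include <vector>

int main() {
    // Grow the input and watch how the hot path scales; a real
    // profiler does this better, this only shows the idea.
    for (std::size_t n : {1000u, 10000u, 100000u}) {
        std::vector<long> data(n);
        std::iota(data.begin(), data.end(), 0L);

        auto start = std::chrono::steady_clock::now();
        long sum = std::accumulate(data.begin(), data.end(), 0L);
        auto stop = std::chrono::steady_clock::now();

        auto us = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
        std::printf("n=%zu sum=%ld took %lld us\n",
                    n, sum, static_cast<long long>(us.count()));
    }
    return 0;
}
```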