Copying classes between ranges that may overlap

In C, we have the memcpy and memmove functions for copying data efficiently. The former has undefined behavior if the source and destination regions overlap, but the latter is guaranteed to handle the overlap "as expected", presumably by noting the direction of the overlap and (if necessary) choosing a different algorithm.

Both functions are available in C++ (as std::memcpy and std::memmove), but they do not work for non-trivial classes. Instead, we get std::copy and std::copy_backward. Each of them works if the source and destination ranges do not overlap; in addition, each is guaranteed to work for one "direction" of overlap.
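As a rough illustration of the two "directions", here is a minimal sketch that shifts elements inside a single std::vector<std::string>. The function names shift_left_demo and shift_right_demo are made up for this example and do not come from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical demo: the destination starts BEFORE the source, so copying
// front to back with std::copy never overwrites an element not yet read.
std::vector<std::string> shift_left_demo() {
    std::vector<std::string> v{"a", "b", "c", "d", "e"};
    // source [1, 5) -> destination starting at index 0
    std::copy(v.begin() + 1, v.end(), v.begin());
    return v;
}

// Hypothetical demo: the destination starts INSIDE the source, so copying
// back to front with std::copy_backward is the safe direction.
std::vector<std::string> shift_right_demo() {
    std::vector<std::string> v{"a", "b", "c", "d", "e"};
    // source [0, 4) -> destination ending at index 5
    std::copy_backward(v.begin(), v.end() - 1, v.end());
    return v;
}
```

Swapping the two algorithms in either function would break the precondition of the algorithm used, which is exactly the choice the caller has to make by hand.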

What can we use if we want to copy from one range to another and we do not know at compile time whether the ranges overlap, or in which direction the overlap occurs? We do not seem to have an option. For general iterators it can be difficult to determine whether ranges overlap, so I understand why no solution is provided in that case, but what about pointers? Ideally, there would be a function like:

    template<class T>
    T * copy_either_direction(const T * inputBegin, const T * inputEnd, T * outputBegin)
    {
        if ("outputBegin ∈ [inputBegin, inputEnd)") {
            outputBegin += (inputEnd - inputBegin);
            std::copy_backward(inputBegin, inputEnd, outputBegin);
            return outputBegin;
        } else {
            return std::copy(inputBegin, inputEnd, outputBegin);
        }
    }

(A similar function with T * replaced by std::vector<T>::iterator would also be nice. It would be even better if it were guaranteed to work when inputBegin == outputBegin, but that is a separate gripe of mine.)

Unfortunately, I don't see a reasonable way to write the condition in the if statement, because comparing pointers into separate blocks of memory with the built-in relational operators does not give a reliable result (it is unspecified in C++ and undefined in C). On the other hand, the implementation clearly has its own way of doing this, since std::memmove essentially requires one. Thus, any implementation could provide such a function, filling a need that the programmer simply cannot. Since std::memmove was considered useful, why not copy_either_direction? Is there a solution that I am missing?

+6
3 answers

memmove works because it takes pointers to contiguous bytes, so the ranges of the two blocks to be copied are well defined. copy and move take iterators, which do not necessarily point into contiguous ranges. For example, a list iterator may jump around in memory; there is no range that the code can look at, and no meaningful notion of overlap.

0

I recently learned that std::less is specialized for pointers in a way that provides a total order, presumably so that pointers can be stored in std::set and related classes. Assuming this order must agree with the built-in comparison whenever the latter is defined, I think the following will work:

    #include <algorithm>   // std::copy, std::copy_backward
    #include <functional>  // std::less

    template<class T>
    T * copy_either_direction(const T * inputBegin, const T * inputEnd, T * outputBegin)
    {
        // std::less gives a total order even for unrelated pointers.
        if (std::less<const T *>()(inputBegin, outputBegin)) {
            // Output starts after input: copy back to front.
            outputBegin += (inputEnd - inputBegin);
            std::copy_backward(inputBegin, inputEnd, outputBegin);
            return outputBegin;
        } else {
            return std::copy(inputBegin, inputEnd, outputBegin);
        }
    }
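To see how this behaves for both directions of overlap, here is a small sketch that restates the function so the block compiles on its own and adds a hypothetical helper, shift_with, that is not part of the answer. It assumes, as the answer does, that the std::less order agrees with the built-in comparison inside a single array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Restated from the answer so that this block is self-contained.
template <class T>
T* copy_either_direction(const T* inputBegin, const T* inputEnd, T* outputBegin) {
    if (std::less<const T*>()(inputBegin, outputBegin)) {
        // Output lies after input: copy back to front.
        outputBegin += (inputEnd - inputBegin);
        std::copy_backward(inputBegin, inputEnd, outputBegin);
        return outputBegin;
    }
    // Output lies before input: copy front to back.
    return std::copy(inputBegin, inputEnd, outputBegin);
}

// Hypothetical helper: copies `count` strings from index `from` to index `to`
// within one vector and returns the result.
std::vector<std::string> shift_with(std::size_t from, std::size_t to, std::size_t count) {
    std::vector<std::string> v{"a", "b", "c", "d", "e"};
    std::string* base = v.data();
    copy_either_direction<std::string>(base + from, base + from + count, base + to);
    return v;
}
```

Calling shift_with(0, 1, 4) takes the copy_backward branch and shift_with(1, 0, 4) takes the copy branch. The case from == to is deliberately left out, since the question itself notes that it is not guaranteed to work.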
0

What can we use if we want to copy from one region to another, and we do not know at compile time if the ranges can overlap or in which direction the overlap can occur?

This is not a coherent concept.

After the copy operation, you will have two objects. Each object occupies its own separate and distinct region of memory. You cannot have objects that overlap in this way (you can have subobjects, but an object's type cannot be its own subobject). Therefore, it is impossible to copy an object on top of part of itself.

Moving an object on top of itself is not logically consistent either. Why? Because moving is a fiction in C++; after the move, you still have two perfectly functional objects. A move operation is simply a destructive copy that steals resources owned by the other object. That object still exists, and it is still a viable object.

And since the object cannot intersect with another object, this is again impossible.

Trivially copyable types get around this because they are just blocks of bits, without destructors or specialized copy operations. Therefore, their lifetime rules are not as strict as those of other types. A type that is not trivially copyable cannot do this, for the reasons given below.

Experience with memmove suggests that there could be a solution in this case (and possibly also for iterators into contiguous containers).

This is neither possible nor generally desirable for types that are not trivially copyable in C++.

The rules of trivial copyability are that the type has no non-trivial copy/move constructors or assignment operators, and no non-trivial destructor. A trivial copy/move constructor or assignment is nothing more than a memcpy, and a trivial destructor does nothing. These rules therefore effectively guarantee that the type is nothing more than a "block of bits". And one "block of bits" is no different from another, so copying it through memmove is a legal construct.
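The rule can be checked at compile time with std::is_trivially_copyable. The following is only a sketch: Point and move_bits are names invented for this example, and the std::string assertion relies on every mainstream standard library giving it a non-trivial destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical "block of bits" type: no user-provided copy, move, or destructor.
struct Point {
    int x;
    int y;
};

static_assert(std::is_trivially_copyable<Point>::value,
              "Point may be copied with memmove");
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string manages a resource and may not");

// Hypothetical wrapper: overlap-safe copy that refuses to compile for
// types that are not trivially copyable.
template <class T>
void move_bits(T* dest, const T* src, std::size_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memmove requires a trivially copyable type");
    std::memmove(dest, src, count * sizeof(T));
}
```

With this wrapper, move_bits on an array of Point compiles and handles overlap in either direction, while move_bits on an array of std::string is rejected by the static_assert.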

If the type has a real destructor, then the type maintains some invariant that takes real work to uphold. It may free a pointer, close a file handle, or whatever. Given that, it makes no sense to copy the bits, because you would then have two objects that refer to the same pointer or file handle. This is bad because a class usually wants to control how such a resource is handled.

This problem cannot be solved unless the class itself takes part in the copy operation. Different classes manage their internals in different ways. In fact, this is the whole purpose of objects having copy constructors and assignment operators: the class itself can decide how to keep its own state sane.

And it doesn't even have to be a pointer or a file. Each instance of a class may have a unique identifier; such a value is generated during construction and is never copied (new copies get new values). If you violate this invariant with memmove, you leave your program in an indeterminate state, because you will have code that expects such identifiers to be unique.
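A minimal sketch of such a class might look like the following. The name Tracked and the counter scheme are assumptions made for illustration, not something taken from the answer.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class whose invariant is "every live object has its own id".
class Tracked {
public:
    Tracked() : id_(next_id()) {}
    // A copy is a new object, so it draws a fresh id instead of copying one.
    Tracked(const Tracked&) : id_(next_id()) {}
    // Assignment leaves the target's own id untouched.
    Tracked& operator=(const Tracked&) { return *this; }
    int id() const { return id_; }

private:
    static int next_id() {
        static int counter = 0;  // not thread-safe; enough for a sketch
        return ++counter;
    }
    int id_;
};

// The user-provided copy operations make the type non-trivially copyable,
// which is what rules out memmove for it.
static_assert(!std::is_trivially_copyable<Tracked>::value,
              "Tracked must be copied through its own copy operations");
```

Copying the bytes of one Tracked over another would duplicate id_ behind the class's back, which is the kind of broken invariant described above.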

Thus, using memmove on types that are not trivially copyable yields undefined behavior.

-1
