Why does std::pair expose member variables?

From http://www.cplusplus.com/reference/utility/pair/ we know that std::pair has two member variables, first and second.

Why did the STL designers decide to expose the two member variables, first and second, instead of offering getFirst() and getSecond()?

+52
c++ encapsulation stl
Jun 15 '16 at 12:17
7 answers

For the original C++03 std::pair, accessor functions would have served no purpose.

As of C++11 and later (we are now at C++14, with C++17 approaching fast), std::pair is in effect a special case of std::tuple, where std::tuple can have any number of items. It therefore makes sense to have a parameterized getter, since it would be impractical to invent and standardize an arbitrary number of element names. That is why you can also use std::get with std::pair.
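To make the parameterized getter concrete, here is a minimal sketch (C++11 or later). The helper sum_first_two and the function pair_access_example are hypothetical names used only for illustration; std::get itself is the standard accessor for both types.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical helper: adds the first two elements of any tuple-like object
// by index, so the same code works for std::pair and std::tuple.
template <typename TupleLike>
int sum_first_two(const TupleLike& t) {
    return std::get<0>(t) + std::get<1>(t);
}

// Shows that the named members and the indexed access refer to the same objects.
inline void pair_access_example() {
    std::pair<int, int> p{1, 2};
    std::get<0>(p) = 10;                    // writes to p.first
    assert(p.first == 10);
    assert(&std::get<1>(p) == &p.second);   // same object, two spellings

    std::tuple<int, int, std::string> t{3, 4, "extra"};
    assert(sum_first_two(p) == 12);         // pair: 10 + 2
    assert(sum_first_two(t) == 7);          // tuple: 3 + 4
}
```

Generic code can use the indexed form, while code that knows it holds a pair can keep using first and second.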

So the reasons for the design are historical: the current std::pair is the end result of an evolution towards more generality.




In other news:

Regarding

"... As far as I know, it would be better to encapsulate the two member variables above and give getFirst(); and getSecond()"

no, that is rubbish.

It is like saying that a hammer is always better, whether you are driving in nails, fastening screws, or trimming a piece of wood. Especially in the last case, a hammer is simply not a useful tool. Hammers can be very useful, but that does not mean they are "better" in general: that is just nonsense.

+91
Jun 15 '16 at 12:28

Getters and setters are usually useful if you think that getting or setting a value requires additional logic (such as changing some internal state). That logic can easily be added to the method. In this case, std::pair is used only to provide two data values. No more, no less. Adding the verbosity of getters and setters would therefore be pointless.

+29
Jun 15 '16 at 12:30

The reason is that no real invariant needs to be imposed on the data structure, since std::pair models a general-purpose container for two elements. In other words, an object of type std::pair<T, U> is considered valid for any possible first and second element of type T and U, respectively. Likewise, subsequent mutations of its elements' values cannot affect the validity of the std::pair as such.

Alex Stepanov (the author of the STL) explicitly presents this general design principle in his course Efficient Programming with Components, when commenting on the singleton container (that is, a container of one element).

Thus, although the principle itself may be a source of debate, it is the reason behind the shape of std::pair.

+11
Jun 15 '16 at 16:57

Getters and setters are useful if you believe that abstraction is needed to insulate users from design choices and from changes to those choices, now or in the future.

A typical example for "now" is that the setter/getter may have logic to validate and/or compute the value: for example, use a setter for a phone number instead of exposing the field directly, so that you can check the format; or use a getter for a collection, so that the getter can hand the caller a read-only view of the member (the collection).
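A minimal sketch of the phone-number case follows. The class Contact, its member names, and the validation rule (an optional leading '+' followed by digits only) are assumptions made for illustration, not a real phone-number format.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical class; the validation rule is an assumption for this sketch.
class Contact {
public:
    // Accepts an optional leading '+' followed by at least one digit.
    // Returns false and leaves the stored number unchanged otherwise.
    bool set_phone(const std::string& number) {
        const std::size_t start = (!number.empty() && number[0] == '+') ? 1 : 0;
        if (start == number.size()) return false;  // empty string, or just "+"
        for (std::size_t i = start; i < number.size(); ++i) {
            if (!std::isdigit(static_cast<unsigned char>(number[i]))) return false;
        }
        phone_ = number;
        return true;
    }

    const std::string& phone() const { return phone_; }

private:
    std::string phone_;
};

// A rejected value never reaches the field.
inline void contact_example() {
    Contact c;
    assert(c.set_phone("+4912345"));
    assert(!c.set_phone("12-34"));
    assert(c.phone() == "+4912345");
}
```

A public data member could not reject "12-34"; the setter can, which is exactly the kind of logic std::pair has no use for.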

The canonical (albeit bad) example for "future changes" is Point: should you expose x and y, or getX() and getY()? The usual answer is to use getters/setters, because at some point in the future you might want to change the internal representation from Cartesian to polar, and you do not want your users to be affected by (or to depend on) that design decision.
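The Point argument can be sketched as follows: a hypothetical Point that keeps the Cartesian getters getX() and getY() while storing polar coordinates internally. The member names and the tolerance helper nearly_equal are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical Point: the public interface is Cartesian, while the stored
// representation is polar (radius and angle).
class Point {
public:
    Point(double x, double y) : r_(std::hypot(x, y)), theta_(std::atan2(y, x)) {}
    double getX() const { return r_ * std::cos(theta_); }
    double getY() const { return r_ * std::sin(theta_); }

private:
    double r_;
    double theta_;
};

// Floating-point round trips are not exact, so compare with a tolerance.
inline bool nearly_equal(double a, double b) { return std::fabs(a - b) < 1e-9; }

inline void point_example() {
    Point p(3.0, 4.0);
    assert(nearly_equal(p.getX(), 3.0));
    assert(nearly_equal(p.getY(), 4.0));
}
```

Callers written against getX() and getY() would not notice the change of representation, which is the flexibility the usual answer has in mind.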

In the case of std::pair, the intent is that this class represents, now and forever, two and exactly two values (of arbitrary type), stored directly, and provides their values on demand. That's it. And so the design uses direct member access rather than going through a getter/setter.

+9
Jun 15 '16 at 22:34

It could be argued that std::pair would be better off having accessor functions to reach its members! In particular, for degenerate cases of std::pair there could be an advantage. For example, if at least one of the types is an empty, non-final class, the objects could be smaller (the empty class could be made a base class, which would not need an address of its own).
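The size argument can be sketched like this. CompressedPair and Empty are hypothetical names, not standard library types; the sketch assumes the first type is an empty, non-final class and relies on the empty base optimization as mainstream compilers apply it.

```cpp
#include <cassert>
#include <utility>

struct Empty {};  // an empty, non-final class, e.g. a stateless function object

// Hypothetical pair-like type: the first element is stored as a base class,
// so it can only be reached through functions, not through a data member.
template <typename T, typename U>
class CompressedPair : private T {
public:
    CompressedPair(const T& t, const U& u) : T(t), second_(u) {}
    T& first() { return *this; }
    const T& first() const { return *this; }
    U& second() { return second_; }
    const U& second() const { return second_; }

private:
    U second_;
};

// std::pair stores both members as separate objects, so even an empty first
// member occupies storage next to the int.
static_assert(sizeof(std::pair<Empty, int>) > sizeof(int),
              "an empty data member still takes up space");

inline void compressed_pair_example() {
    CompressedPair<Empty, int> cp(Empty{}, 42);
    assert(cp.second() == 42);
    // With the empty base optimization the base adds no storage, so the
    // accessor-based type is smaller than the std::pair equivalent.
    assert(sizeof(cp) < sizeof(std::pair<Empty, int>));
}
```

On typical 64-bit implementations the accessor-based version is the size of an int, while std::pair<Empty, int> is twice that because of padding.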

At the time std::pair was invented, these special cases were not considered (and I am not sure whether the empty base optimization was already allowed in the draft working paper at that time). From a semantic point of view there is not much reason to have accessor functions, however: clearly, the accessors would need to return a mutable reference for non-const objects. As a result, the accessors would not provide any form of encapsulation.

On the other hand, accessor functions make it harder for the optimizer to see what is going on, for example because additional sequence points are introduced. I could imagine that Meng Lee and Alexander Stepanov even measured whether there is a difference (and I, too). Even if they did not, providing direct access to the members is certainly no slower than going through an accessor function, while the opposite is not necessarily true.

I was not part of the decision, and the C++ standard does not give a rationale, but I would guess that making the members public data members was a deliberate decision.

+8
Jun 15 '16 at 16:38

The main purpose of getters and setters is to gain control over access. That is, if you expose "first" as a variable, any code can read it and (if it is not const) write it without telling the class it belongs to. In some cases this can cause serious problems.

For example, say you have a class that represents the number of passengers on a boat. You store the number of passengers as an integer. If you expose that number as a bare variable, outside functions can write to it. That could leave you in a situation where there are actually 10 passengers, but someone changed the variable (perhaps by accident) to 50. This is a case for a getter for the number of passengers (but not a setter, which would present the same problem).
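A minimal sketch of the boat case: the count is readable from outside but can only change through operations the class controls. The class Boat and its member names are hypothetical.

```cpp
#include <cassert>

// Hypothetical Boat class: a getter for the passenger count, but no setter.
class Boat {
public:
    int passengers() const { return passengers_; }

    void board() { ++passengers_; }

    // Returns false if nobody is on board, so the count never goes negative.
    bool disembark() {
        if (passengers_ == 0) return false;
        --passengers_;
        return true;
    }

private:
    int passengers_ = 0;
};

inline void boat_example() {
    Boat b;
    b.board();
    b.board();
    assert(b.passengers() == 2);
    // There is no way to write "b.passengers_ = 50" from outside the class.
}
```

The count can no longer drift away from what actually happened on the boat, because every change goes through board() or disembark().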

An example for both getters and setters would be a class representing a mathematical vector in which you want to cache certain information about the vector. Say you want to store the length. In that case, changing vec.x would likely change the length. So not only do you need to wrap x in a getter, you must also provide a setter for x that knows to update the cached length of the vector. (Of course, most real math libraries do not cache these values and therefore expose the variables.)
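The cached-length vector could look roughly like this. Vec2 and its member names are hypothetical, and the two-dimensional case is chosen only to keep the sketch short.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2D vector that caches its length. Every write goes through a
// setter so that the cached value cannot become stale.
class Vec2 {
public:
    Vec2(double x, double y) : x_(x), y_(y), length_(std::hypot(x, y)) {}

    double x() const { return x_; }
    double y() const { return y_; }
    double length() const { return length_; }  // no recomputation on read

    void set_x(double x) { x_ = x; update_length(); }
    void set_y(double y) { y_ = y; update_length(); }

private:
    void update_length() { length_ = std::hypot(x_, y_); }

    double x_;
    double y_;
    double length_;
};

inline void vec2_example() {
    Vec2 v(3.0, 4.0);
    assert(std::fabs(v.length() - 5.0) < 1e-9);
    v.set_x(0.0);  // the setter refreshes the cache
    assert(std::fabs(v.length() - 4.0) < 1e-9);
}
```

If x_ were public, a direct write would leave length_ out of date, which is the situation the setter exists to prevent.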

So the question you should ask yourself when deciding whether to use them is this: does this class actually need to control, or be notified of, changes to this variable?

The answer for something like std::pair is a flat no. There is no case for controlling access to the members of a class whose sole purpose is to contain those members. The pair certainly does not need to know whether those variables have been touched, given that they are its only two members and it therefore has no state to update should either one change. The pair is ignorant of what it actually contains and what that content means, so tracking what it contains is not worth the effort.

Depending on the compiler and how it is configured, getters and setters can introduce overhead. That probably does not matter in most cases, but if you put them on something as fundamental as std::pair, it would be a non-trivial concern. Adding them would therefore need to be justified, which, as I just explained, it cannot be.

+4
Jun 15 '16 at 21:30

I was shocked by the number of comments that show no basic understanding of object-oriented design (does this prove that C++ is not an OO language?). Yes, the design of std::pair has some historical traits, but that does not make a bad design good, and it should not be used as an excuse to deny that fact. Before I rant about it, let me answer some of the questions in the comments:

Don't you think that int should also have a setter and getter

Yes, from a design point of view we should use accessors, because we lose nothing by doing so but gain additional flexibility. Some newer algorithms may want to pack extra bits into the key/values, and you cannot encode/decode them without accessors.

Why wrap something in a getter if there is no logic in the getter?

How do you know that there will be no logic in the getter/setter? A good design should not limit the possible implementations based on an assumption. It should offer as much flexibility as possible. Remember that the design of std::pair also dictates the design of the iterators: by having users access the member variables directly, the iterator has to return structures that actually store the key and value together. That turns out to be a big limitation. There are algorithms that need to keep them separate. There are algorithms that do not store the keys/values explicitly at all. Now they have to copy the data during iteration.

Contrary to popular belief, having objects that do nothing but store member variables using getters and setters is not a “way to do it”

Another wild guess.

OK, I will stop here.

To answer the original question: std::pair chose to expose member variables because whoever designed it did not recognize and/or prioritize the importance of a flexible contract. Evidently they had a very narrow view of how the key-value pairs in a map/hash table should be implemented, and to make it worse, they let that narrow view of the implementation leak upward and compromise the design. For example, what if I want to implement a replacement for std::unordered_map that stores keys and values in separate arrays, based on an open addressing scheme with linear probing? This can greatly improve cache performance for pairs with small keys and large values, since you do not need to jump over the space occupied by the values to probe the keys. Had std::pair chosen accessors, it would be trivial to write an STL-style iterator for this. But now it is simply impossible to achieve without incurring additional data copying.
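The layout described above can be sketched as follows. SplitStorage and at_index are hypothetical names, hashing and probing are omitted, and int keys with std::string values are assumptions chosen for brevity. The point is only that no std::pair<const Key, T> object is stored anywhere, whereas the iterators of std::unordered_map dereference to a reference to exactly such a stored object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container fragment: keys and values live in separate arrays.
struct SplitStorage {
    std::vector<int> keys;
    std::vector<std::string> values;

    // There is no stored std::pair<const int, std::string> to refer to, so
    // element access can only hand out a temporary pair of references.
    std::pair<const int&, std::string&> at_index(std::size_t i) {
        return std::pair<const int&, std::string&>(keys[i], values[i]);
    }
};

inline void split_storage_example() {
    SplitStorage s{std::vector<int>{1, 2},
                   std::vector<std::string>{"one", "two"}};
    auto entry = s.at_index(1);
    entry.second = "deux";           // writes through to s.values[1]
    assert(entry.first == 2);
    assert(s.values[1] == "deux");
}
```

The returned object is a proxy with a different type from the stored-pair reference that code written against std::unordered_map expects, so such a container is not a drop-in replacement.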

I noticed that they also mandated open hashing (i.e. closed addressing) for the implementation of std::unordered_map. That is not only strange from a design point of view (why would you want to restrict how things are implemented?), but also rather dumb in terms of implementation. Chained hash tables using linked lists are arguably the slowest of all categories. Search the web and you will easily find that std::unordered_map often ends up at the bottom of hash table benchmarks. It even tends to be slower than Java's HashMap (I do not know how they managed to lag behind in this case, since HashMap is also a chained hash table). The old excuse is that chained hash tables tend to perform better as load_factor approaches 1, which is completely invalid because 1) there are plenty of techniques in the open addressing family to deal with this problem (ever heard of hopscotch hashing or robin-hood hashing? The latter has been around for 30-odd years); 2) a chained hash table adds the overhead of a pointer (a good 8 bytes on 64-bit machines) to every entry, so when we say that the load_factor of an unordered_map approaches 1, that is not 100% memory usage! We should take that into account and compare the performance of unordered_map with alternatives at the same memory usage. And it turns out that alternatives like Google Dense HashMap are 3-4 times faster than std::unordered_map.

Why is this relevant? Because, interestingly, mandating open hashing makes the design of std::pair look less bad, now that we do not need the flexibility of an alternative storage structure. Moreover, the presence of std::pair makes it almost impossible to adopt newer/better algorithms to write a drop-in replacement for std::unordered_map. Sometimes you wonder whether they did it on purpose, so that the poor design of std::pair and the pedestrian implementation of std::unordered_map can survive longer. Of course I am joking, so whoever wrote those, do not be offended. In fact, people using Java or Python (OK, I admit Python is a stretch) would want to thank you for making them feel good about being "as fast as C++".

+1
Apr 12 '17 at 5:12


