Insertion order in sets (when parsing {})

Someone asked here why when placing 1 and True in set , only 1 supported.

This, of course, is because 1==True . But in what cases is 1 saved and in what cases is true saved?

We'll see:

passing list to construct set instead of using the notation set :

 >>> set([True,1]) {True} >>> set([1,True]) {1} 

it seems logical: set iterates in the internal list and does not add the second element because it is equal to the first element (note that set([True,1]) cannot give 1 because set cannot know what is inside the list. It may not even be list , but repeatable)

Now using the notation set :

 >>> {True,1} {1} >>> {1,True} {True} 

It seems that in this case the list of elements is processed in the reverse order (tested on Python 2.7 and Python 3.4).

But is it guaranteed? Or just an implementation detail?

+11
python set python-internals
01 Sep '17 at 16:17
source share
3 answers

The order in which elements are inserted in the installation literal is not guaranteed by the language specification. However, Python 3.6 has been modified to have the expected left-right order of evaluation. For complete information about this change, issue , as well as commit , which made changes to the insertion order.




To describe the change in more detail, constructing the literal set {True, 1} launches the operation code BUILD_SET (with oparg equal to 2) after the first push of the pointers to True and 1 to the internal virtual machine stack.

In Python 3.4, BUILD_SET uses the following loop to insert elements into a set (note that in our case, oparg is 2):

 while (--oparg >= 0) { PyObject *item = POP(); if (err == 0) err = PySet_Add(set, item); Py_DECREF(item); 

Since 1 was added to the last stack, it first slipped out and became the first object inserted into the set.

In newer versions of Python (e.g. 3.6), opcode BUILD_SET uses PEEK instead of POP :

 for (i = oparg; i > 0; i--) { PyObject *item = PEEK(i); if (err == 0) err = PySet_Add(set, item); Py_DECREF(item); 

PEEK(i) pushes the item i th onto the stack, so for {True, 1} the True object is added to the set in the first place.

+5
Sep 01 '17 at 18:54 on
source share

The order on the left and right on the display shows the documentation :

its elements are evaluated from left to right and added to the given object

Example:

 >>> def f(i, seq="left to right".split()): print(seq[i]) >>> {f(0), f(1), f(2)} left to right {None} 

Therefore, {1, True} effective:

 >>> S = set() >>> S.add(1) >>> S.add(True) # no effect according to docs >>> S {1} 

A set can contain only one of True or 1 , since they are duplicates in terms of set :

 >>> hash(1) == hash(True) True >>> 1 == True True 

On Python 3, it is guaranteed that 1 == True . See Is `False == 0 and True == 1 in Python an implementation detail or is it guaranteed by the language? :

Booleans: they represent true values ​​False and True [...] Boolean values ​​behave like values ​​0 and 1, respectively, in almost all contexts, the exception is that when converting to a string, the string is "False" or "True", respectively.

If {1, True} prints {True} , then this is an error ("the evaluation order of set_display does not match the documented behavior" ). The output should be the same as set([1, True]) . It works as expected ( {1} ) in recent versions of pypy, jython and cpython 2.7.13+, 3.5.3+.

+1
Sep 01 '17 at 20:14
source share

From one of the latest versions, dict preserves order as a side effect of implementation details. In 3.7 this behavior can be guaranteed. Perhaps this also affected the set of literals.

Python 3.6.2:

 >>> {True,1} {True} >>> {1,True} {1} >>> set([True,1]) {True} >>> set([1,True]) {1} 
0
01 Sep '17 at 16:20
source share



All Articles