Python reference data links, list of duplicate links

Question

Let's say I have two lists:

    >>> l1=[1,2,3,4]
    >>> l2=[11,12,13,14]

I can put these lists in a tuple or a dictionary, and it appears that they all refer to the original list:

    >>> t=(l1,l2)
    >>> d={'l1':l1, 'l2':l2}
    >>> id(l1)==id(d['l1'])==id(t[0])
    True
    >>> l1 is d['l1'] is t[0]
    True

Since they are references, I can change l1, and the data seen through the tuple and the dictionary changes accordingly:

    >>> l1.append(5)
    >>> l1
    [1, 2, 3, 4, 5]
    >>> t
    ([1, 2, 3, 4, 5], [11, 12, 13, 14])
    >>> d
    {'l2': [11, 12, 13, 14], 'l1': [1, 2, 3, 4, 5]}

The same holds if I append through the reference in the dictionary d or through the mutable reference in the tuple t:

    >>> d['l1'].append(6)
    >>> t[0].append(7)
    >>> d
    {'l2': [11, 12, 13, 14], 'l1': [1, 2, 3, 4, 5, 6, 7]}
    >>> l1
    [1, 2, 3, 4, 5, 6, 7]

If I now set l1 to a new list, the reference count of the original list drops:

    >>> import sys
    >>> sys.getrefcount(l1)
    4
    >>> sys.getrefcount(t[0])
    4
    >>> l1=['new','list']
    >>> l1 is d['l1'] is t[0]
    False
    >>> sys.getrefcount(l1)
    2
    >>> sys.getrefcount(t[0])
    3

And appending to or changing l1 no longer changes d['l1'] or t[0], since l1 is now a new reference. The notion of indirect references is covered fairly well in the Python docs, but not completely.

My questions:

1. Is a mutable object always a reference? Can you always assume that modifying it modifies the original (unless you specifically make a copy with the l2=l1[:] kind of idiom)?
2. Can I assemble a list of all the same references in Python? I.e., some function f(l1) that returns ['l1', 'd', 't'] if they all refer to the same list?
3. It is my assumption that, no matter what, the data will remain valid as long as there is some reference to it.

For example:

    l=[1,2,3]     # create a list of three integers and bind a reference to it
    l2=l          # create a second reference to the same object
    l=[4,5,6]     # create a new list of 3 ints; the original, now referenced
                  # by l2, is unchanged and unmoved

+7

python reference data-structures

dawg Dec 29 '10 at 19:36

6 answers

Each variable in Python is a reference.

With lists, you are focusing on the results of the append() method and losing sight of the bigger picture of Python data structures. There are other methods on lists, and there are advantages and consequences to how you construct a list. It is helpful to think of a list as a view onto the other objects it refers to. Lists do not "contain" anything other than the rules and means of accessing the data referred to by the objects within them.

The list.append(x) method, in particular, is equivalent to l[len(l):]=[x]

So:

    >>> l1=list(range(3))
    >>> l2=list(range(20,23))
    >>> l3=list(range(30,33))
    >>> l1[len(l1):]=[l2]   # equivalent to 'append' for subscriptable sequences
    >>> l1[len(l1):]=l3     # same as 'extend'
    >>> l1
    [0, 1, 2, [20, 21, 22], 30, 31, 32]
    >>> len(l1)
    7
    >>> l1.index(30)
    4
    >>> l1.index(20)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: list.index(x): x not in list
    >>> 20 in l1
    False
    >>> 30 in l1
    True

By wrapping l2 in list brackets, as in l1[len(l1):]=[l2], or by calling l1.append(l2), you store a single reference to l2. If you change l2, the change also shows through l1. That entry counts as one element of the list: a reference to the appended sequence.

Without the brackets, as in l1[len(l1):]=l3, each element of the sequence is copied into l1 individually.

If you use other common list methods, such as l.index(something), or the in operator, you will not find elements inside the nested references. l.sort() will not sort properly either. These are "shallow" operations on the object, and by using l1[len(l1):]=[l2] you are creating a nested data structure.

If you use l1[len(l1):]=l3 , you create a true (shallow) copy of the elements in l3 .
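A minimal sketch of the difference between the two slice assignments, using the hypothetical names base, nested and flat (none of them come from the session above). It shows that the appended list is stored by reference, while the extended list is unpacked element by element:

```python
# Hypothetical names, chosen for illustration only.
nested = [20, 21, 22]
flat = [30, 31, 32]

base = [0, 1, 2]
base[len(base):] = [nested]   # same effect as base.append(nested)
base[len(base):] = flat       # same effect as base.extend(flat)

# `nested` is stored as one element, by reference: later mutations show through.
nested.append(23)

# `flat` was unpacked: base holds its elements, not the list itself,
# so growing `flat` afterwards does not affect base.
flat.append(33)

print(base)
```

After both mutations, base[3] is still the very same object as nested, while the 33 appended to flat never reaches base.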

These are fairly fundamental Python idioms, and most of the time they "do the right thing." However, you may get unexpected results, for example:

    >>> m=[[None]*2]*3
    >>> m
    [[None, None], [None, None], [None, None]]
    >>> m[0][1]=33
    >>> m
    [[None, 33], [None, 33], [None, 33]]   # probably not what was intended...
    >>> m[0] is m[1] is m[2]               # same object, that's why they all changed
    True

Some Python newbies try to create a multidimensional list with something like m=[[None]*2]*3. The first sequence replication works as expected; it creates two copies of None. The second one is the problem: it creates three copies of the reference to the first list. So assigning m[0][1]=33 modifies the one inner list that m refers to, and every reference to that list then shows the change.

Compare with:

    >>> m=[[None]*2,[None]*2,[None]*2]
    >>> m
    [[None, None], [None, None], [None, None]]
    >>> m[0][1]=33
    >>> m
    [[None, 33], [None, None], [None, None]]

You can also use nested list comprehensions to do the same:

    >>> m=[[ None for i in range(2)] for j in range(3)]
    >>> m
    [[None, None], [None, None], [None, None]]
    >>> m[0][1]=44
    >>> m
    [[None, 44], [None, None], [None, None]]
    >>> m[0] is m[1] is m[2]   # three different lists....
    False

For lists and references, Fredrik Lundh has this text as a good introduction.

Regarding your specific questions:

1) In Python, everything is a label on, or a reference to, an object. There is no "original" (a C++ concept), and there is no distinction between a "reference", a pointer, and the actual data (a C/Perl concept).

2) Fredrik Lundh has a great analogy in reference to a question similar to this:

The same way as you get the name of that cat you found on your porch: the cat (object) itself cannot tell you its name, and it doesn't really care, so the only way to find out what it's called is to ask all your neighbours (namespaces) if it's their cat (object)...
....and don't be surprised if you find that it's known by many names, or no name at all!

You can find this list with some effort, but why? Just call it whatever you are going to call it, like a found cat.

3) True.

+3

the wolf Dec 30 '10 at 5:58

1- Is a mutable object always a reference? Can you always assume that modifying it modifies the original (unless you specifically make a copy with the l2 = l1[:] kind of idiom)?

Yes. In fact, immutable objects are always references too. You just cannot change them, so you never notice it.
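As a small illustration of that point (the variable names here are hypothetical), the sketch below binds two names to one tuple and one list. Augmented assignment on the tuple has to build a new object and rebind the name, whereas on the list it mutates the shared object in place:

```python
t1 = (1, 2)
t2 = t1                  # a second name for the same tuple object
same_before = t2 is t1   # both names refer to one object

t1 += (3,)               # tuples cannot change in place: a new tuple is built
                         # and the name t1 is rebound to it
same_after = t2 is t1    # the names now refer to different objects

l1 = [1, 2]
l2 = l1
l1 += [3]                # lists mutate in place, so l2 sees the change

print(same_before, same_after, t2, l2)
```

The tuple t2 keeps its old value only because no operation can alter it, not because it was stored by value.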

2 - Can I assemble a list of all the same references in Python? I.e., some function f(l1) that returns ['l1', 'd', 't'] if they all refer to the same list?

This is odd, but it can be done.

You can compare objects for "sameness" with the is operator, as in l1 is t[0].

And you can get all the referring objects with the gc.get_referrers function in the garbage collector (gc) module. You can then check which of these referrers points to your object with the is operator. So yes, it can be done. I just don't think it would be a good idea. Most likely, the is operator alone is enough for what you need.

3- It is my assumption that, no matter what, the data will remain valid as long as there is some reference to it.
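The approach described above can be sketched as follows. The helper name referring_containers is hypothetical, and the result relies on CPython's gc module tracking the containers involved, so treat it as an illustration rather than a general-purpose tool:

```python
import gc

def referring_containers(target, candidates):
    """Return the candidates that gc reports as referring directly to target."""
    referrers = gc.get_referrers(target)
    # Compare by identity with `is`; equality would also match look-alike containers.
    return [c for c in candidates if any(c is r for r in referrers)]

l1 = [1, 2, 3, 4]
l2 = [11, 12, 13, 14]
t = (l1, l2)
d = {'l1': l1, 'l2': l2}
unrelated = {'x': [1, 2, 3, 4]}   # equal contents, but a different list object

found = referring_containers(l1, [t, d, unrelated])
print(len(found))
```

Here found holds t and d but not unrelated, because only the first two actually refer to the list bound to l1.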

Yes.

0

jsbueno Dec 29 '10 at 20:15

Is a mutable object always a reference? Can you always assume that modifying it modifies the original (unless you specifically make a copy with the l2 = l1[:] kind of idiom)?

Python has reference semantics: variables do not store values, as they do in C++, but instead label them. The concept of "original" is flawed: if two variables label the same value, it is completely irrelevant which one "came first". It does not matter whether the object is mutable or not (except that immutable objects will not make it as easy to tell what is going on behind the scenes). To make copies in a more general-purpose way, try the copy module.
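A short sketch of the copy module mentioned above, with hypothetical variable names. It contrasts a plain assignment (no copy at all), copy.copy (new outer container, shared contents) and copy.deepcopy (everything duplicated):

```python
import copy

original = {'l1': [1, 2, 3], 'l2': [11, 12, 13]}

alias = original                 # no copy: another label on the same dict
shallow = copy.copy(original)    # new dict, but the inner lists are shared
deep = copy.deepcopy(original)   # new dict and new inner lists

original['l1'].append(4)

print(alias['l1'], shallow['l1'], deep['l1'])
```

The append is visible through alias and shallow, which still share the inner list, but not through deep.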

Can I assemble a list of all the same references in Python? I.e., some function f(l1) that returns ['l1', 'd', 't'] if they all refer to the same list?

Not easily. See aaronasterling's answer for more details. You can also try something like k for k, v in locals().items() if v is the_object, but you will also have to search globals(), you will miss some things, and it may cause problems because the names "k" and "v" themselves end up in the namespace being searched (I have not tested it).
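One way to sidestep the issue with the loop variables is to put the search in a function, so its own names never land in the namespace being scanned. This is only a sketch: names_bound_to is a hypothetical helper, and the dictionary scope stands in for globals() or locals():

```python
def names_bound_to(obj, namespace):
    """Return the sorted names in namespace whose value is obj (identity, not equality)."""
    # list(...) takes a snapshot so the mapping can safely be a live namespace.
    return sorted(name for name, value in list(namespace.items()) if value is obj)

l1 = [1, 2, 3, 4]
alias = l1
lookalike = [1, 2, 3, 4]   # equal to l1, but a different object

# Stand-in for globals(); an explicit dict keeps the example self-contained.
scope = {'l1': l1, 'alias': alias, 'lookalike': lookalike}
names = names_bound_to(l1, scope)
print(names)
```

Calling names_bound_to(l1, globals()) works the same way at module level. It only finds names bound directly to the object, so containers such as a tuple or dict that hold the list are not reported.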

It is my assumption that in spite of everything, the data will remain valid as long as there is some kind of link to it.

That's right.

0

Karl Knechtel Dec 29 '10 at 20:20

1. "... the object is a reference ..." is nonsense. References are not objects. Variables, member fields, slots in lists and sets, etc. hold references, and these references point to objects. There can be any number of references to an object (in implementations without refcounting, even none, temporarily, i.e. until the GC kicks in). Anyone who has a reference to an object can invoke its methods, access its members, etc.; this is true for all objects. Of course, only mutable objects can be changed this way, which is why you usually don't care about it for immutable ones.
2. Yes, as others have shown. But this should not be necessary unless you are debugging the GC or hunting a serious memory leak in your code. Why do you think you need it?
3. Python has automatic memory management, so yes. As long as there is a reference to an object, it will not be deleted (it may, however, stay alive for a while after it becomes unreachable, because of cyclic references and the fact that GCs only run once in a while).
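The lifetime rule in the last point can be observed with a weak reference, which watches an object without keeping it alive. This is a sketch assuming CPython behaviour; Node is a hypothetical class, used because plain lists cannot be weakly referenced:

```python
import gc
import weakref

class Node:
    """Hypothetical class for illustration; it can hold a reference to another Node."""
    def __init__(self):
        self.other = None

a = Node()
probe = weakref.ref(a)     # observes the object without keeping it alive
keeper = a                 # a second strong reference
del a
alive_while_referenced = probe() is not None   # still reachable through keeper

# Build a reference cycle, then drop the last outside references.
b = Node()
keeper.other = b
b.other = keeper
cycle_probe = weakref.ref(b)
del keeper, b
gc.collect()               # cycles are reclaimed by the collector, not by refcounts alone
collected = probe() is None and cycle_probe() is None

print(alive_while_referenced, collected)
```

The explicit gc.collect() call is there because the two objects still refer to each other after the names are deleted, which is the "may stay alive for a while" case described above.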

0

delnan Dec 29 '10 at 20:20


 1a. Is a mutable object always a reference?

There is no difference between mutable and immutable objects. Seeing the variable names as references is helpful for people with a C background (but implies that they can be dereferenced, which they cannot).

 1b. Can you always assume that modifying it modifies the original

Note that it is not "the original". It is the same object. b = a means that b and a are now names for the same object.

 1c. (Unless you specifically make a copy with l2=l1[:] kind of idiom)?

That's right, because then it is no longer the same object. (Although the entries in the new list will be the same objects as those in the original list.)
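A brief sketch of answers 1b and 1c together, with hypothetical names: b is the same object as a, while c is a new outer list whose entries are still the same inner objects:

```python
a = [[1, 2], [3, 4]]
b = a        # same object under a second name
c = a[:]     # new outer list whose entries are the same inner lists

a.append([5, 6])      # changes the outer list: visible through b, not through c
a[0].append(99)       # changes a shared inner list: visible through c as well

print(len(b), len(c), c[0])
```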

 2. Can I assemble a list of all the same references in Python?

Yes, it is possible, but you will never need it, so it will be a waste of energy. :)

 3. It is my assumption that no matter what, the data will remain valid so long as there is some reference to it.

Yes, the object will not be garbage collected as long as you hold a reference to it. (The use of the word "valid" here seems wrong, but I assume that this is what you mean.)

0

Lennart Regebro Dec 30 '10 at 6:12

aaronasterling · Accepted Answer · 2010-12-29T20:12:04+0000

1) Modifying a mutable object through a reference will always modify the "original". Honestly, this betrays a misunderstanding of references. The newer reference is just as much the "original" as any other reference. As long as both names point to the same object, modifying the object through either name will be reflected when it is accessed through the other name.

2) Not quite the way you want. gc.get_referrers returns all objects that refer to a given object.

    >>> l = [1, 2]
    >>> d = {0: l}
    >>> t = (l, )
    >>> import gc
    >>> import pprint
    >>> pprint.pprint(gc.get_referrers(l))
    [{'__builtins__': <module '__builtin__' (built-in)>,
      '__doc__': None,
      '__name__': '__main__',
      '__package__': None,
      'd': {0: [1, 2]},
      'gc': <module 'gc' (built-in)>,
      'l': [1, 2],
      'pprint': <module 'pprint' from '/usr/lib/python2.6/pprint.pyc'>,
      't': ([1, 2],)},   # This is globals()
     {0: [1, 2]},        # This is d
     ([1, 2],)]          # this is t

Note that the actual object referred to by l is not included in the returned list, because it does not contain a reference to itself. globals() is returned because it does contain a reference to the original list.

3) If by valid you mean "will not be garbage collected", then this is correct, barring a highly unlikely bug. It would be a pretty sorry garbage collector that "stole" your data.
