Python name mangling

Question

In other languages, a general guideline that helps produce better code is to always make everything as hidden as possible. If in doubt about whether a variable should be private or protected, it is better to go with private.

Does the same hold true for Python? Should I use two leading underscores on everything at first, and only make things less hidden (a single underscore) as I need them?

If the convention is to use only one underscore, I would also like to know the rationale.

Here is a comment I left on JBernardo's answer. It explains why I asked this question and also why I would like to know why Python differs from other languages:

I come from languages that train you to think that everything should be only as public as needed and no more. The reasoning is that this reduces dependencies and makes the code safer to change. The Python way of doing things in reverse - starting from public and moving towards hidden - is odd to me.

+85

python naming-conventions

Paul Manta Sep 17 '11 at 18:07

11 answers

I would not say that this practice produces better code. Visibility modifiers only distract you from the task at hand and, as a side effect, force your interface to be used as you intended. Generally speaking, enforcing visibility prevents programmers from messing things up if they have not read the documentation properly.

A far better solution is the route that Python encourages: your classes and variables should be well documented and their behavior clear. The source should be available. This is a far more extensible and reliable way to write code.

My strategy in Python is this:

Just write the damn thing, and make no assumptions about how your data should be protected. This assumes that you are writing to create the ideal interfaces for your problems.
Use a leading underscore for things that probably will not be used externally and are not part of the normal client-code interface.
Use a double underscore only for things that are purely a convenience inside the class, or that would cause considerable damage if accidentally exposed.

Above all, it should be clear what everything does. Document it if someone else will be using it. Document it if you want it to be useful in a year's time.

As a side note, you should actually be going with protected in those other languages: you never know how your class might be inherited later or what it might be used for. It is best to lock down only those variables that you are sure cannot or should not be used by external code.

+15

Matt Joiner Sep 17 '11 at 18:25

First - what is name mangling?

Name mangling is invoked when you are in a class definition and use __any_name or __any_name_, that is, two (or more) leading underscores and at most one trailing underscore.

    class Demo:
        __any_name = "__any_name"
        __any_other_name_ = "__any_other_name_"

And now:

    >>> [n for n in dir(Demo) if 'any' in n]
    ['_Demo__any_name', '_Demo__any_other_name_']
    >>> Demo._Demo__any_name
    '__any_name'
    >>> Demo._Demo__any_other_name_
    '__any_other_name_'

When in doubt, do what?

The ostensible use is to prevent subclassers from using an attribute that the class uses.

The potential value is in avoiding name collisions with subclassers who want to override behavior, so that the parent class's functionality keeps working as expected. However, the example in the Python documentation is not Liskov substitutable, and no examples come to mind where I have found this useful.
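
To make the collision scenario concrete, here is a minimal sketch. The classes `Counter` and `LoggingCounter` are hypothetical and are not taken from the Python documentation; they only illustrate a parent and a subclass that happen to pick the same double-underscore name.

```python
class Counter:
    """Hypothetical base class that keeps its state in a mangled attribute."""

    def __init__(self):
        self.__count = 0            # stored on the instance as _Counter__count

    def increment(self):
        self.__count += 1
        return self.__count


class LoggingCounter(Counter):
    """Hypothetical subclass that happens to reuse the same attribute name."""

    def __init__(self):
        super().__init__()
        self.__count = "unused"     # stored as _LoggingCounter__count


c = LoggingCounter()
first = c.increment()               # the parent's counter was not overwritten
names = sorted(vars(c))             # both mangled names live side by side
print(first, names)
```

With a single underscore (`_count`) in both classes, the subclass assignment would have replaced the parent's integer with a string and `increment()` would fail.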

The downside is that it increases the cognitive load of reading and understanding a code base, especially when debugging, where you see the double-underscore name in the source and a mangled name in the debugger.

My personal approach is to intentionally avoid it. I work on a very large code base. The rare uses of it stick out like a sore thumb and do not seem justified.

You do need to be aware of it so that you recognize it when you see it.

PEP 8

PEP 8, the Python standard library style guide, currently says (abridged):

There is some controversy about the use of __names.
If your class is intended to be subclassed, and you have attributes that you do not want subclasses to use, consider naming them with double leading underscores and no trailing underscores.
Note that only the simple class name is used in the mangled name, so if a subclass chooses both the same class name and attribute name, you can still get name collisions.
Name mangling can make certain uses, such as debugging and __getattr__(), less convenient. However, the name mangling algorithm is well documented and easy to perform manually.
Not everyone likes name mangling. Try to balance the need to avoid accidental name clashes with potential use by advanced callers.
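
The caveat about identical class names can be reproduced in a few lines. This is only a sketch with a hypothetical `Widget` class; the alias `BaseWidget` exists solely so that the subclass can reuse the parent's simple name.

```python
class Widget:
    def __init__(self):
        self.__state = "base"        # mangled to _Widget__state

    def base_state(self):
        return self.__state          # reads _Widget__state


BaseWidget = Widget                  # keep a reference to the parent class


class Widget(BaseWidget):            # same simple class name as the parent
    def __init__(self):
        super().__init__()
        self.__state = "subclass"    # also mangled to _Widget__state


w = Widget()
clobbered = w.base_state()           # the parent now sees the subclass's value
print(clobbered, list(vars(w)))
```

Because both classes mangle to `_Widget__state`, the instance ends up with a single attribute and the parent's value is overwritten.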

How does it work?

If you prepend two underscores (without ending in a double underscore) to a name in a class definition, the name will be mangled: an underscore followed by the class name is prepended to it on the object:

    >>> class Foo(object):
    ...     __foobar = None
    ...     _foobaz = None
    ...     __fooquux__ = None
    ...
    >>> [name for name in dir(Foo) if 'foo' in name]
    ['_Foo__foobar', '__fooquux__', '_foobaz']

Note that names are only mangled when the class definition is parsed:

    >>> Foo.__test = None
    >>> Foo.__test
    >>> Foo._Foo__test
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: type object 'Foo' has no attribute '_Foo__test'

Also, those new to Python sometimes have trouble understanding what is going on when they cannot manually access a name they see in a class definition. This is not a strong reason against it, but it is something to consider if you have a learning audience.
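
As a small illustration of the point that mangling is tied to the class definition, the sketch below (using a hypothetical method `set_hidden`) contrasts a name written inside a method with a name passed as a string and a name assigned from outside the class.

```python
class Foo:
    def set_hidden(self, value):
        self.__hidden = value        # written inside the class body: mangled


foo = Foo()
foo.set_hidden(1)
setattr(foo, "__plain", 2)           # a string passed to setattr is never mangled
Foo.__late = 3                       # assigned outside the class body: not mangled

instance_names = sorted(vars(foo))
has_late = "__late" in vars(Foo)
print(instance_names, has_late)
```

Only the name that appears literally inside the class body is rewritten; the other two are stored exactly as spelled.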

One underscore?

If the convention is to use only one underscore, I would also like to know the rationale.

When my intention is for users to keep their hands off an attribute, I tend to use only one underscore, but that is because in my mental model subclassers would have access to the name (which they always have, since they can easily spot the mangled name anyway).

If I were reviewing code that uses the __ prefix, I would ask why they are invoking name mangling and whether they could do just as well with a single underscore, keeping in mind that if subclassers choose the same names for the class and the class attribute, there will be a name collision in spite of this.

+14

Aaron Hall Jan 20 '16 at

You should not start with private data and make it public as necessary. Rather, you should start by figuring out the interface of your object. That is, you should start by figuring out what the world sees (the public stuff), and then figure out what private stuff is necessary for that to happen.

Other languages make it difficult to make private what was once public. That is, I will break a lot of code if I make my variable private or protected. But with properties in Python this is not the case. Rather, I can maintain the same interface even when rearranging the internal data.
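
A rough sketch of what "same interface, rearranged internals" can look like. The `Timer` classes and the `extend` function are hypothetical; the second version stores seconds internally but still exposes `minutes` as if it were a plain attribute.

```python
class Timer:
    """First version: a plain public attribute."""

    def __init__(self, minutes):
        self.minutes = minutes


class TimerV2:
    """Later version: stores seconds internally, same public interface."""

    def __init__(self, minutes):
        self.minutes = minutes           # goes through the setter below

    @property
    def minutes(self):
        return self._seconds // 60

    @minutes.setter
    def minutes(self, value):
        self._seconds = value * 60


def extend(timer):
    """Client code written against the first version."""
    timer.minutes = timer.minutes + 5
    return timer.minutes


old_result = extend(Timer(10))
new_result = extend(TimerV2(10))
print(old_result, new_result)
```

The client function never changes, even though the data it touches moved from a public attribute to a `_seconds` field behind a property.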

The difference between _ and __ is that Python actually makes an attempt to enforce the latter. Of course, it does not try very hard, but it does make access more difficult. A single _ merely tells other programmers what the intention is; they are free to ignore it at their peril. But ignoring that rule is sometimes useful. Examples include debugging, temporary hacks, and working with third-party code that was not intended to be used the way you use it.
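
The difference in "enforcement" can be seen in a short sketch (the `Config` class and its attribute names are made up for illustration):

```python
class Config:
    def __init__(self):
        self._cache = {}             # convention only; nothing is rewritten
        self.__secret = "s3"         # mangled to _Config__secret


cfg = Config()
single = cfg._cache                  # works; the underscore is just a hint

try:
    cfg.__secret                     # not mangled outside the class, so the lookup fails
    blocked = False
except AttributeError:
    blocked = True

workaround = cfg._Config__secret     # still reachable if you spell out the mangled name
print(single, blocked, workaround)
```

So the double underscore gets in the way of casual access, but anyone who knows the mangling rule can still reach the attribute.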

+9

Winston Ewert Sep 17 '11 at 18:55

There are already a lot of good answers to this, but I am going to offer another one. It is also partly a response to people who keep saying that the double underscore is not private (it really is).

If you look at Java/C#, both of them have private/protected/public. All of these are compile-time constructs. They are only enforced at compile time. If you use reflection in Java/C#, you can easily access a private method.

Now, every time you call a function in Python, you are inherently using reflection. These two pieces of code are the same in Python:

    lst = []
    lst.append(1)
    getattr(lst, 'append')(1)

The dot syntax is just syntactic sugar for the latter piece of code, mostly because using getattr is already ugly with only one function call. It only gets worse from there.

So there cannot be a Java/C# version of private, since Python does not compile the code. Java and C# cannot check whether a function is private or public at runtime, because that information is gone (and the runtime has no knowledge of where the function is being called from).

With that in mind, the name mangling of the double underscore makes the most sense as a way of achieving "private-ness". When a function is called from the 'self' instance and it notices that the name starts with '__', it just performs the name mangling right there. It is just more syntactic sugar. That syntactic sugar allows the equivalent of 'private' in a language that only uses reflection for data member access.
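
One way to look at this rewriting is to inspect a compiled method. The sketch below assumes CPython, where `__code__.co_names` is an implementation detail listing the names a function's bytecode refers to; the `Account` class is hypothetical.

```python
class Account:
    def __init__(self):
        self.__balance = 0

    def balance(self):
        return self.__balance


# Names referenced by the compiled body of balance() (CPython-specific).
compiled_names = Account.balance.__code__.co_names
print(compiled_names)
```

The name stored in the compiled method is already the mangled one, so by the time the attribute lookup runs it is an ordinary lookup of `_Account__balance`.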

Disclaimer: I have never heard anybody from the Python development team say anything like this. The real reason for the lack of "private" is cultural, but you will also notice that most scripting/interpreted languages have no private. A strictly enforceable private is not practical at anything except compile time.

+6

Jonathan Sternberg Sep 18 '11 at 3:24 a.m.

First: why do you want to hide your data? Why is that so important?

Most of the time you do not really want to do it, but you do it because others do.

If you really, really do not want people to use something, add one underscore in front of it. That's it... Pythonistas know that things with one underscore are not guaranteed to work every time and may change without notice.

That is the way we live, and we are fine with that.

Using two underscores will make your class so bad to subclass that even you will not want to work that way.
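
A short sketch of the kind of subclassing trouble meant here, using hypothetical `Base` and `Child` classes:

```python
class Base:
    def __init__(self):
        self.__items = []            # stored as _Base__items


class Child(Base):
    def add(self, item):
        self.__items.append(item)    # looks up _Child__items, which does not exist


try:
    Child().add(1)
    message = ""
except AttributeError as exc:
    message = str(exc)

print(message)
```

The subclass has to spell out `self._Base__items` to reach the parent's list, which is exactly the awkwardness a single underscore avoids.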

+4

JBernardo Sep 17 '11 at 18:34

The chosen answer does a good job of explaining how properties remove the need for private attributes, but I would also add that functions at the module level remove the need for private methods.

If you turn a method into a function at the module level, you remove the opportunity for subclasses to override it. Moving some functionality to the module level is more Pythonic than trying to hide methods with name mangling.
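
As a sketch of this idea (the helper `_normalize` and the `Tag` classes are hypothetical), a module-level function is looked up by its global name, so a method of the same name in a subclass has no effect on it:

```python
def _normalize(text):
    """Module-level helper; the leading underscore marks it as internal."""
    return text.strip().lower()


class Tag:
    def __init__(self, name):
        self.name = _normalize(name)     # global lookup, not an attribute lookup


class ShoutingTag(Tag):
    def _normalize(self, text):          # does not affect Tag.__init__
        return text.upper()


tag = ShoutingTag("  Python ")
print(tag.name)
```

The base class behaves the same no matter what the subclass defines, without any mangled names.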

+4

Tanner_Wauchope Sep 20 '15 at 23:33

The following code snippet covers the different cases:

two leading underscores (__a)
single leading underscore (_a)
no underscore (a)

    class Test:

        def __init__(self):
            self.__a = 'test1'
            self._a = 'test2'
            self.a = 'test3'

        def change_value(self, value):
            self.__a = value
            return self.__a

Printing all valid attributes of the Test object:

    testObj1 = Test()
    valid_attributes = dir(testObj1)
    print(valid_attributes)
    # Output under Python 2 (Python 3 lists many more dunder names):
    # ['_Test__a', '__doc__', '__init__', '__module__', '_a', 'a', 'change_value']

Here you can see that the name __a has been changed to _Test__a so that this variable is not overridden by any subclass. This concept is known as "name mangling" in Python. You can access it like this:

    testObj2 = Test()
    print(testObj2._Test__a)
    # test1

Similarly, in the case of _a, the underscore only tells the developer that the variable is meant to be internal to the class. The Python interpreter will not do anything if you access it, but doing so is not good practice.

    testObj3 = Test()
    print(testObj3._a)
    # test2

A variable without underscores can be accessed from anywhere, like a public class variable.

    testObj4 = Test()
    print(testObj4.a)
    # test3

Hope the answer helped you :)

+3

Nitish Chauhan Apr 15 '18 at 11:40

At first glance it should be the same as for other languages (by "other" I mean Java or C++), but it is not.

In Java you make private all variables that should not be accessible from outside. In Python you cannot achieve this, since there is no "privateness" (as one of the Python principles says, "We are all adults"). So a double underscore only means "Guys, do not use this field directly". A single underscore carries the same meaning, and at the same time it does not cause any headache when you have to inherit from the class in question (just one example of a possible problem caused by the double underscore).

So I recommend using a single underscore by default for "private" members.

+2

Roman Bodnarchuk Sep 17 '11 at 18:22


"If in doubt about whether a variable should be private or protected, it's better to go with private." - yes, the same holds in Python.

0

Yaroslav Nikitenko 18 . '19 14:02

https://dbader.org/blog/meaning-of-underscores-in-python

It is explained very simply there.

Thank you

-1

Dikshit Kathuria Dec 03 '18 at 11:16

brandizzi · Accepted Answer · 2011-09-17 18:16

When in doubt, leave it "public" - I mean, do not add anything to obscure the name of your attribute. If you have a class with some internal value, do not bother about it. Instead of writing:

    class Stack(object):

        def __init__(self):
            self.__storage = []  # Too uptight

        def push(self, value):
            self.__storage.append(value)

write this by default:

    class Stack(object):

        def __init__(self):
            self.storage = []  # No mangling

        def push(self, value):
            self.storage.append(value)

This is certainly a controversial way of doing things. Python newbies just hate it, and even some old Python hands despise this default, but it is the default anyway, so I really recommend that you follow it even if you feel uncomfortable.

If you really want to send the message "Can't touch this!" to your users, the usual way is to prefix the variable with one underscore. This is just a convention, but people understand it and take extra care when dealing with such things:

    class Stack(object):

        def __init__(self):
            self._storage = []  # This is OK, but Pythonistas tend to be relaxed about it

        def push(self, value):
            self._storage.append(value)

This can also be useful for avoiding a conflict between property names and attribute names:

    class Person(object):

        def __init__(self, name, age):
            self.name = name
            self._age = age if age >= 0 else 0

        @property
        def age(self):
            return self._age

        @age.setter
        def age(self, age):
            if age >= 0:
                self._age = age
            else:
                self._age = 0

What about the double underscore? Well, the double-underscore magic is used mainly to avoid accidental overriding of methods and name conflicts with superclasses' attributes. It can be quite useful if you write a class that is expected to be extended many times.
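
For the "accidental overriding of methods" case, a minimal sketch with hypothetical `Bag` and `PairBag` classes, similar in spirit to the pattern shown in the Python tutorial's section on private variables:

```python
class Bag:
    def __init__(self, items):
        self.items = []
        self.__extend(items)             # always calls Bag.extend

    def extend(self, items):
        self.items.extend(items)

    __extend = extend                    # private alias, stored as _Bag__extend


class PairBag(Bag):
    def extend(self, keys, values):      # different signature from Bag.extend
        self.items.extend(zip(keys, values))


bag = PairBag(["a", "b"])                # Bag.__init__ still works
bag.extend(["k"], ["v"])
print(bag.items)
```

Without the private alias, `Bag.__init__` would call the subclass's two-argument `extend` and fail with a `TypeError`.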

If you want to use it for other purposes, you can, but it is neither usual nor recommended.

EDIT: Why is this so? Well, the usual Python style does not emphasize making things private - on the contrary! There are a lot of reasons for that, most of them controversial... Let us look at some of them.

Python has properties

Most OO languages today use the opposite approach: what should not be used should not be visible, so attributes should be private. Theoretically, this would yield more manageable, less coupled classes, because no one could recklessly change values inside objects.

However, it is not so simple. For example, Java classes have a lot of attributes, plus getters that just get the values and setters that just set the values. You need, say, seven lines of code to declare a single attribute, which a Python programmer would call needlessly complex. Also, in practice, you write this whole lot of code only to end up with what is effectively a public field, since you can change its value using the getters and setters.

So why follow this private-by-default policy? Just make your attributes public by default. Of course, this is problematic in Java, because if you decide to add some validation to your attribute, you would have to change every

    person.age = age;

in your code to, say,

    person.setAge(age);

with setAge() being:

    public void setAge(int age) {
        if (age >= 0) {
            this.age = age;
        } else {
            this.age = 0;
        }
    }

So in Java (and other languages), the default is to use getters and setters anyway, because they may be annoying to write but can save you a lot of time if you find yourself in the situation I have described.

However, you do not need to do this in Python, as Python has properties. If you have this class:

    class Person(object):

        def __init__(self, name, age):
            self.name = name
            self.age = age

and then you decide to validate ages, you do not need to change the person.age = age pieces of your code. Just add a property (as shown below):

    class Person(object):

        def __init__(self, name, age):
            self.name = name
            self._age = age if age >= 0 else 0

        @property
        def age(self):
            return self._age

        @age.setter
        def age(self, age):
            if age >= 0:
                self._age = age
            else:
                self._age = 0

If you can do that and still use person.age = age, why would you add private fields, getters and setters?

(Also see "Python Is Not Java" and this article about the harms of using getters and setters.)

Everything is visible anyway - and trying to hide it just complicates your work

Even in languages that have private attributes, you can access them through some kind of reflection/introspection library. And people do this a lot, in frameworks and to solve urgent needs. The problem is that introspection libraries are just a hard way of doing what you could do with public attributes.

Since Python is a very dynamic language, adding this burden to your classes is simply counterproductive.

The problem is not being able to see - it is being required to see

For a Pythonista, encapsulation is not the inability to see the internals of classes, but the possibility of avoiding looking at them. What I mean is that encapsulation is the property of a component that allows it to be used without the user being concerned about the internal details. If you can use a component without bothering about its implementation, then it is encapsulated (in the opinion of a Python programmer).

Now, if you wrote your class in such a way that you can use it without having to think about implementation details, there is no problem if you want to look inside the class for some reason. The point is: your API should be good, and the rest is details.

Guido said so

Well, this is not controversial: he said so, actually. (Look for "open kimono".)

It's a culture

Yes, there are some reasons, but no critical one. This is mostly a cultural aspect of programming in Python. Frankly, it could be the other way around, but it is not. Also, you could just as easily ask the opposite question: why do some languages use private attributes by default? For the same main reason as for the Python practice: because it is the culture of those languages, and each choice has its advantages and disadvantages.

Since this culture already exists, you are well advised to follow it. Otherwise, you will get annoyed by Python programmers telling you to remove the __ from your code when you ask a question on Stack Overflow :)
