Ensuring __init__ is called only once when a class instance is created by the constructor or __new__

I am trying to understand how new instances of a Python class should be created when the creation process can go either through the constructor or through the __new__ method. In particular, I notice that when using the constructor, the __init__ method is called automatically after __new__, while when __new__ is invoked directly, __init__ is not called automatically. I can force __init__ to be called when __new__ is invoked explicitly by embedding a call to __init__ inside __new__, but then __init__ ends up being called twice when the class is instantiated via the constructor.

For example, consider the following toy class, which stores one internal attribute, a list object called data; it may help to think of it as the start of a vector class.

    class MyClass(object):
        def __new__(cls, *args, **kwargs):
            # object.__new__ is given only cls; Python 3 rejects extra arguments here
            obj = object.__new__(cls)
            obj.__init__(*args, **kwargs)
            return obj

        def __init__(self, data):
            self.data = data

        def __getitem__(self, index):
            return self.__new__(type(self), self.data[index])

        def __repr__(self):
            return repr(self.data)

A new instance of the class can be created either using the constructor (actually not sure if this is the correct terminology in Python), something like

    x = MyClass(range(10))

or through slicing, which, as you can see, triggers a call to __new__ in the __getitem__ method.

    x2 = x[0:2]

In the first case, __init__ is called twice (once through the explicit call inside __new__ and once automatically), and in the second case it is called once. Obviously, I would like __init__ to be called exactly once in either case. Is there a standard way to do this in Python?

Note that in my example, I could get rid of the __new__ method and override __getitem__ as

    def __getitem__(self, index):
        return MyClass(self.data[index])

but that will cause a problem if I later want to inherit from MyClass, because a call like child_instance[0:2] will give me back an instance of MyClass, not of the child class.

3 answers

Firstly, some basic facts about __new__ and __init__ :

  • __new__ is a constructor.
  • __new__ usually returns an instance of cls, its first argument.
  • By returning an instance of cls, __new__ causes Python to call __init__.
  • __init__ is an initializer. It modifies the instance (self) returned by __new__. It does not need to return self.

When MyClass defines:

    def __new__(cls, *args, **kwargs):
        # object.__new__ is given only cls; Python 3 rejects extra arguments here
        obj = object.__new__(cls)
        obj.__init__(*args, **kwargs)
        return obj

MyClass.__init__ is called twice. Once from the call to obj.__init__ explicitly, and the second time because __new__ returned obj , an instance of cls . (Since the first argument to object.__new__ is cls , the returned instance is an instance of MyClass , so obj.__init__ calls MyClass.__init__ , not object.__init__ .)
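
To see the double call concretely, here is a minimal sketch that counts initializer calls. The class name Tracked and its init_calls counter are illustrative names, not part of the question, and object.__new__ is given only cls so that the snippet also runs on Python 3.

```python
class Tracked(object):
    init_calls = 0  # class-level counter, for illustration only

    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls)
        obj.__init__(*args, **kwargs)  # explicit call: first initialization
        return obj                     # returning a cls instance triggers a second one

    def __init__(self, data):
        type(self).init_calls += 1
        self.data = data


Tracked([1, 2, 3])                         # goes through the constructor
calls_via_constructor = Tracked.init_calls

Tracked.init_calls = 0
Tracked.__new__(Tracked, [1, 2, 3])        # bypasses the constructor
calls_via_new = Tracked.init_calls

print(calls_via_constructor, calls_via_new)
```

The constructor path should report two initializations and the direct __new__ path one, which matches the behavior described in the question.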


The Python 2.2.3 release notes contain an interesting comment that sheds light on when to use __new__ and when to use __init__:

The __new__ method is called with the class as its first argument; its responsibility is to return a new instance of that class.

Compare this to __init__: __init__ is called with the instance as its first argument, and it returns nothing; its responsibility is to initialize the instance.

All this is done so that immutable types can preserve their immutability while allowing subclassing.

Immutable types (int, long, float, complex, str, unicode and tuple) have dummy __init__ , while mutable types (dict, list, file, as well as super, classmethod, staticmethod and property) have dummy __new__ .

So, use __new__ to define immutable types and use __init__ to define mutable types. Although you can define both, you do not need to do this.
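
As a rough illustration of that rule of thumb, the sketch below defines one immutable and one mutable type. Point and Bag are hypothetical names chosen for this example; the only assumption is that a tuple's contents have to be supplied when the tuple is created.

```python
class Point(tuple):
    """Hypothetical immutable 2-D point built on tuple."""

    def __new__(cls, x, y):
        # A tuple cannot be filled in later, so the values go to tuple.__new__.
        return super(Point, cls).__new__(cls, (x, y))


class Bag(object):
    """Hypothetical mutable container: __init__ alone is enough."""

    def __init__(self, items):
        self.items = list(items)


p = Point(1, 2)
b = Bag([1, 2])
b.items.append(3)   # mutable state can change after creation
```

Point never defines __init__, and Bag never defines __new__; each type uses only the hook that fits its mutability.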


Thus, since MyClass is mutable, you should only define __init__:

    class MyClass(object):
        def __init__(self, data):
            self.data = data

        def __getitem__(self, index):
            return type(self)(self.data[index])

        def __repr__(self):
            return repr(self.data)

    x = MyClass(range(10))
    x2 = x[0:2]

There are several things that should not be done:

  • Call __init__ from __new__
  • Call __new__ directly in a method

As you have already seen, when you create an object of a class, both __new__ and __init__ are called automatically. Calling them directly breaks this mechanism (although calling __init__ from inside another __init__ is allowed, as in the example below).

You can get an object's class from inside any method through its __class__ attribute, as in the following example:

    class MyClass(object):
        def __new__(cls, *args, **kwargs):
            # Customized __new__ implementation here
            # (the default allocation below is only a placeholder)
            obj = super(MyClass, cls).__new__(cls)
            return obj

        def __init__(self, data):
            # object.__init__ takes no arguments besides the instance
            super(MyClass, self).__init__()
            self.data = data

        def __getitem__(self, index):
            cls = self.__class__
            return cls(self.data[index])

        def __repr__(self):
            return repr(self.data)

    x = MyClass(range(10))
    x2 = x[0:2]

When you create an instance of a class with MyClass(args) , the default instance creation sequence is as follows:

  • MyClass.__new__(MyClass, args) is called to get a new "empty" instance.
  • new_instance.__init__(args) is called (new_instance being the instance returned from the __new__ call above) to initialize the attributes of the new instance [1].
  • new_instance is returned as the result of MyClass(args).

From this it is clear that calling MyClass.__new__ yourself will not cause __init__ to be called, so you end up with an uninitialized instance. It is equally clear that calling __init__ from inside __new__ is also incorrect, since MyClass(args) will then call __init__ twice.
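
The three steps above can be sketched as an ordinary function. This is only an approximation written for illustration: the real logic lives in the interpreter's type.__call__, and the names create and Vec are made up for this example.

```python
def create(cls, *args, **kwargs):
    """Rough stand-in for what calling cls(*args, **kwargs) does."""
    new_instance = cls.__new__(cls, *args, **kwargs)  # step 1: get an "empty" instance
    if isinstance(new_instance, cls):                 # the condition from footnote [1]
        new_instance.__init__(*args, **kwargs)        # step 2: initialize it
    return new_instance                               # step 3: hand it back


class Vec(object):
    def __init__(self, data):
        self.data = data


v = create(Vec, [1, 2, 3])   # behaves like Vec([1, 2, 3]) for this simple class
```

Seen this way, calling __new__ on its own performs only step 1, which is why the instance comes back uninitialized.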

The source of your problem is the following:

I am trying to understand how new instances of a Python class should be created when the creation process can go either through the constructor or through the __new__ method

The creation process should normally never go through the __new__ method on its own. __new__ is one part of the normal instance creation protocol, so you should not expect it to run the whole protocol for you.

One (bad) solution would be to implement this protocol by hand; instead of:

    def __getitem__(self, index):
        return self.__new__(type(self), self.data[index])

you could have:

    def __getitem__(self, index):
        new_item = self.__new__(type(self), self.data[index])
        new_item.__init__(self.data[index])
        return new_item

But really, what you want is not to touch __new__ at all. The default __new__ is fine for your case, and so is the default instance creation protocol, so you should neither implement __new__ nor call it directly.

You want to create a new instance of the class in the usual way, by calling the class. If there is no inheritance and you do not expect there ever will be, just replace self.__new__(type(self), self.data[index]) with MyClass(self.data[index]).

If you think that one day there may be subclasses of MyClass whose slices should be instances of the subclass rather than of MyClass, then you need to get the class of self dynamically and call it. You already know how to do this, because you used it in your program! type(self) returns the type (class) of self, which you can then call just as you would call MyClass directly: type(self)(self.data[index]).
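
The difference between the two approaches shows up as soon as a subclass exists. The following is a small sketch with made-up names (Base, Child, hardcoded_slice, dynamic_slice) that puts both variants side by side.

```python
class Base(object):
    def __init__(self, data):
        self.data = data

    def hardcoded_slice(self, index):
        return Base(self.data[index])        # always builds a Base

    def dynamic_slice(self, index):
        return type(self)(self.data[index])  # builds whatever class self is


class Child(Base):
    pass


c = Child([0, 1, 2, 3])
kind_hardcoded = type(c.hardcoded_slice(slice(0, 2))).__name__
kind_dynamic = type(c.dynamic_slice(slice(0, 2))).__name__
print(kind_hardcoded, kind_dynamic)
```

In both variants __init__ runs exactly once per new object, because the class is simply called in the usual way.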


As an aside, the point of __new__ is to let you customize the process of obtaining a "new" empty instance of the class before it is initialized. Almost all of the time this is completely unnecessary, and the default __new__ is fine.

You only need __new__ in two cases:

  • You have an unusual allocation scheme, in which you might return an existing instance rather than create a truly new one (the only way to actually create a new instance is to delegate to the ultimate default implementation of __new__ anyway).
  • You are implementing a subclass of an immutable built-in type. Since immutable built-in types cannot be modified after creation (because they are immutable), they must be initialized as they are created, rather than afterwards in __init__.
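
A sketch of the first case, under the assumption that one shared instance per name is wanted. Color and its _cache dictionary are hypothetical names for this example; the attribute is set inside __new__ because an __init__, if defined, would run again on every call that returns a cached instance.

```python
class Color(object):
    """Hypothetical class that hands out one shared instance per name."""
    _cache = {}

    def __new__(cls, name):
        key = (cls, name)
        if key not in cls._cache:
            # Only the default __new__ can actually allocate a new object.
            obj = super(Color, cls).__new__(cls)
            obj.name = name
            cls._cache[key] = obj
        return cls._cache[key]


red_a = Color("red")
red_b = Color("red")    # returns the instance created above
blue = Color("blue")
```

The second case, subclassing an immutable built-in, follows the same pattern of passing the value to the parent's __new__ instead of assigning it in __init__.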

As a generalization of the first point, you can make __new__ return whatever you like (not necessarily an instance of the class) to make calling the class behave in some arbitrarily bizarre way. This will almost always be more confusing than helpful.


[1] I believe the protocol is actually slightly more complicated: __init__ is only called on the value returned by __new__ if that value is an instance of the class that was called to start the process. However, it is very unusual for this not to be the case.
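
The condition in footnote [1] can be observed with a deliberately odd class. Odd and its init_calls counter are invented for this sketch; returning None from __new__ is used only to show that __init__ is then skipped, not as a recommended design.

```python
class Odd(object):
    init_calls = 0  # counter for illustration only

    def __new__(cls, value):
        if value is None:
            return None  # not an instance of cls, so __init__ is skipped
        return super(Odd, cls).__new__(cls)

    def __init__(self, value):
        Odd.init_calls += 1
        self.value = value


nothing = Odd(None)
calls_after_none = Odd.init_calls

something = Odd(5)
calls_after_instance = Odd.init_calls
```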
