Python: inadvertently modifies parameters passed to a function

Several times I accidentally modified function input. Since Python has no permalinks, I wonder what coding methods can help me avoid a too-frequent error?

Example:

class Table: def __init__(self, fields, raw_data): # fields is a dictionary with field names as keys, and their types as value # sometimes, we want to delete some of the elements for field_name, data_type in fields.items(): if some_condition(field_name, raw_data): del fields[field_name] # ... # in another module # fields is already initialized here to some dictionary table1 = Table(fields, raw_data1) # fields is corrupted by Table __init__ table2 = Table(fields, raw_data2) 

Of course, the fix is ​​to make a copy of the parameter before I change it:

  def __init__(self, fields, raw_data): fields = copy.copy(fields) # but copy.copy is safer and more generally applicable than .copy # ... 

But so easy to forget.

I half think to make a copy of each argument at the beginning of each function if the argument does not potentially relate to a large dataset, which can be expensive to copy, or if the argument is not intended to be modified. This would almost fix the problem, but it would lead to a significant amount of useless code at the beginning of each function. In addition, this would significantly replace the Python approach for passing parameters by reference, which was supposedly done for some reason.

+7
source share
5 answers

First rule: do not change containers: create new ones.

Therefore, do not change your incoming dictionary, create a new dictionary with a subset of keys.

 self.fields = dict( key, value for key, value in fields.items() if accept_key(key, data) ) 

Such methods, as a rule, are slightly more effective than passing and removing bad elements. More generally, it is often easier to avoid modifying objects and creating new ones instead.

The second general rule: do not change containers after they are turned off.

Normally, you cannot assume that the containers to which you transferred the data made their own copies. As a result, do not try to modify the containers that you specified. Any changes must be made before data transfer. As soon as you transfer the container to someone else, you are no longer its sole owner.

The third general rule: do not modify containers that you have not created.

If you received a container, you do not know who else can use the container. Therefore, do not change it. Either use the unmodified version or call rule1 by creating a new container with the desired changes.

Fourth General Rule: (stolen from Ethan Furman)

Some functions must change the list. This is their job. If so, make it visible in the function name (for example, add and continue list methods).

Putting it all together:

A piece of code should only modify a container if it is the only piece of code with access to that container.

+8
source

Making copies of the parameters "anyway" is a bad idea: you end up paying for it in lousy performance; or you need to track the size of your arguments.

It's better to get a good idea of ​​objects and names and how Python deals with them. A good start is in this article .

The import point is

 def modi_list(alist): alist.append(4) some_list = [1, 2, 3] modi_list(some_list) print(some_list) 

has exactly the same effect as

 some_list = [1, 2, 3] same_list = some_list same_list.append(4) print(some_list) 

because there is no copying of arguments in the function call, no creation pointers are created ... what happens is Python, saying alist = some_list , and then executing the code in the modi_list() function. In other words, Python binds (or assigns) a different name to the same object.

Finally, when you have a function that will change its arguments, and you do not want these changes to be visible outside the function, you can simply make a shallow copy:

 def dont_modi_list(alist): alist = alist[:] # make a shallow copy alist.append(4) 

Now some_list and alist are two different list objects that contain the same objects, so if you just mess around with the list object (insert, delete, rearrange), then you're fine, buf, if you go even deeper and cause changes to the objects in the list, then you will need to do deepcopy() . But it is up to you to track such things and code accordingly.

+3
source

You can use the metaclass as follows:

 import copy, new class MakeACopyOfConstructorArguments(type): def __new__(cls, name, bases, dct): rv = type.__new__(cls, name, bases, dct) old_init = dct.get("__init__") if old_init is not None: cls.__old_init = old_init def new_init(self, *a, **kw): a = copy.deepcopy(a) kw = copy.deepcopy(kw) cls.__old_init(self, *a, **kw) rv.__init__ = new.instancemethod(new_init, rv, cls) return rv class Test(object): __metaclass__ = MakeACopyOfConstructorArguments def __init__(self, li): li[0]=3 print li li = range(3) print li t = Test(li) print li 
+1
source

Python has best practice and is called unit testing.

The main thing is that dynamic languages ​​allow very rapid development even with full unit testing; and unit tests are much tougher than static typing.
As Martin Fowler writes:

The general argument for static types is that it catches errors that are otherwise hard to find. But I found that in the presence of SelfTestingCode, most errors that static types would be found just as easily with tests.

+1
source

As @delnan notes, the easiest solution is to always pass immutable. You can also wrap variables in a custom persistent object.

Python: any way to declare constant parameters?

0
source

All Articles