In the first example, this is clearly a case of the single-reference optimization (in fact there are two references: one held by the local variable and one pushed by LOAD_FAST; unicode_concatenate tries to reduce the count to 1 before passing control to PyUnicode_Append). CPython decides whether it applies with this unicode_modifiable function:
    static int
    unicode_modifiable(PyObject *unicode)
    {
        assert(_PyUnicode_CHECK(unicode));
        if (Py_REFCNT(unicode) != 1)
            return 0;
        if (_PyUnicode_HASH(unicode) != -1)
            return 0;
        if (PyUnicode_CHECK_INTERNED(unicode))
            return 0;
        if (!PyUnicode_CheckExact(unicode))
            return 0;
    #ifdef Py_DEBUG
        assert(!unicode_is_singleton(unicode));
    #endif
        return 1;
    }
But in the second case, the instance data is stored in a Python dict rather than in a simple variable, so things are slightly different.
a.accum_ += 'foo'
actually requires first fetching the value of a.accum_ and storing it on the stack. So the string now has at least two references: one from the instance dictionary and one on the stack, returned by PyObject_GetAttr (used by LOAD_ATTR); DUP_TOP additionally duplicates the reference to a, as the bytecode below shows. Therefore, Python cannot optimize this case, since modifying the string in place would affect the other references.
    >>> class A: pass
    ...
    >>> a = A()
    >>> def func():
    ...     a.str = 'spam'
    ...     print(a.str)
    ...     return '_from_func'
    ...
    >>> a.str = 'foo'
    >>> a.str += func()
    spam
You might expect the resulting value to be 'spam_from_func', but it is different, because Python saved the original value of a.str before func() was called.
    >>> a.str
    'foo_from_func'
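To make the order of evaluation explicit, here is a small sketch that runs the augmented assignment once as written and once unrolled by hand into roughly the same steps. The names text, old_value and suffix are only illustrative, and func takes the object as a parameter here so the script is self-contained.

```python
class A:
    pass


def func(obj):
    # Rebinds the attribute while the augmented assignment is in progress.
    obj.text = 'spam'
    return '_from_func'


# The augmented assignment as written.
a = A()
a.text = 'foo'
a.text += func(a)            # the old value 'foo' is loaded before func runs

# Roughly the same steps, spelled out by hand.
b = A()
b.text = 'foo'
old_value = b.text           # fetch the current value first (LOAD_ATTR)
suffix = func(b)             # the call rebinds b.text to 'spam'
b.text = old_value + suffix  # the store overwrites 'spam' (STORE_ATTR)

print(a.text)
print(b.text)
```

Both objects end up with the same value, because the left-hand operand was already on the stack when func ran.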
Bytecode:
    >>> import dis
    >>> def func_class():
    ...     a = Foo()
    ...     a.accum = ''
    ...     a.accum += 'zzzzz\n'
    ...
    >>> dis.dis(func_class)
      2           0 LOAD_GLOBAL              0 (Foo)
                  3 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
                  6 STORE_FAST               0 (a)

      3           9 LOAD_CONST               1 ('')
                 12 LOAD_FAST                0 (a)
                 15 STORE_ATTR               1 (accum)

      4          18 LOAD_FAST                0 (a)
                 21 DUP_TOP
                 22 LOAD_ATTR                1 (accum)
                 25 LOAD_CONST               2 ('zzzzz\n')
                 28 INPLACE_ADD
                 29 ROT_TWO
                 30 STORE_ATTR               1 (accum)
                 33 LOAD_CONST               0 (None)
                 36 RETURN_VALUE
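If you want to check which store instruction follows the concatenation on your own interpreter, a sketch along these lines may help. The helper store_opcodes and the two sample functions are made up for illustration, and exact opcode names vary between CPython versions (newer releases use BINARY_OP instead of INPLACE_ADD, for example), so the printed lists are not guaranteed to match the listing above.

```python
import dis


def store_opcodes(func):
    """Return the names of the STORE_* instructions in func's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)
            if ins.opname.startswith('STORE_')]


def local_concat(text):
    # The result is stored back into a local variable.
    text += 'zzzzz\n'


def attr_concat(holder):
    # The result is stored back into an instance attribute.
    holder.accum += 'zzzzz\n'


local_stores = store_opcodes(local_concat)
attr_stores = store_opcodes(attr_concat)
print(local_stores)
print(attr_stores)
```

The local-variable version should end in a STORE_FAST-style instruction, while the attribute version ends in STORE_ATTR, which is not among the cases the optimization handles.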
Please note that this optimization was introduced around 2004 (CPython 2.4) so that users would not be hurt by the slowness of a += b or a = a + b. It is therefore mainly intended for simple variables and only works if the next instruction is one of STORE_FAST (local variable), STORE_DEREF (closures) or STORE_NAME. It is not a general solution; the best way to do this in Python is to build a list and combine its elements with str.join. From the documentation:
CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. For performance-sensitive code, it is preferable to use the str.join() method, which assures consistent linear concatenation performance across versions and implementations.
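As a minimal sketch of the str.join approach for the attribute case, the instance can collect the pieces in a list and join them once at the end. The Accumulator class and its method names are hypothetical, not something from the question.

```python
class Accumulator:
    """Collects string pieces in a list and joins them on demand."""

    def __init__(self):
        self._chunks = []

    def add(self, text):
        # Appending to a list is amortized O(1) and copies no string data.
        self._chunks.append(text)

    def value(self):
        # A single join builds the final string in linear time.
        return ''.join(self._chunks)


acc = Accumulator()
for _ in range(3):
    acc.add('zzzzz\n')

result = acc.value()
print(repr(result))
```

This does not depend on reference counts or on which store instruction follows, so it should behave the same across versions and implementations.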