Cython string concatenation is very slow; what else does it do poorly?

I have a large Python code base that we recently started compiling with Cython. Without making any changes to the code, I expected performance to stay about the same, and we planned to optimize the heavier calculations with Cython-specific code after profiling. However, the compiled application became much slower, and the slowdown appears to be everywhere: methods take 10% to 300% longer than before.

I have been playing with test code to find things that Cython does poorly, and string manipulation seems to be one of them. My question is: am I doing something wrong, or is Cython really just bad at some things? Can you help me understand why this is so slow, and what else Cython might do very poorly?

EDIT: Let me try to clarify. I realize that this kind of string concatenation is very bad; I posted it only because the speed difference was so large (maybe a bad idea). The codebase doesn't contain code this awful, but it still slows down a lot, and I am hoping for pointers on which kinds of constructs Cython handles poorly, so I can figure out where to look. I have tried profiling, but it was not particularly useful.

For reference, here is my string manipulation test code. I understand that the code below is horrible and useless, but I'm still shocked by the speed difference.

# pyCode.py
def str1():
    val = ""
    for i in xrange(100000):
        val = str(i)

def str2():
    val = ""
    for i in xrange(100000):
        val += 'a'

def str3():
    val = ""
    for i in xrange(100000):
        val += str(i)

Timing code:

# compare.py
import timeit

pyTimes = {}
cyTimes = {}

# STR1
number=10

setup = "import pyCode"
stmt = "pyCode.str1()"
pyTimes['str1'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

setup = "import cyCode"
stmt = "cyCode.str1()"
cyTimes['str1'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

# STR2
setup = "import pyCode"
stmt = "pyCode.str2()"
pyTimes['str2'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

setup = "import cyCode"
stmt = "cyCode.str2()"
cyTimes['str2'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

# STR3
setup = "import pyCode"
stmt = "pyCode.str3()"
pyTimes['str3'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

setup = "import cyCode"
stmt = "cyCode.str3()"
cyTimes['str3'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

for funcName in sorted(pyTimes.viewkeys()):
    print "PY {} took {}s".format(funcName, pyTimes[funcName])
    print "CY {} took {}s".format(funcName, cyTimes[funcName])

Compiling the Cython module with:

cp pyCode.py cyCode.py
cython cyCode.py
gcc -O2 -fPIC -shared -I$PYTHONHOME/include/python2.7 \
    -fno-strict-aliasing -fno-strict-overflow -o cyCode.so cyCode.c

Resulting Timings

> python compare.py 
PY str1 took 0.1610019207s
CY str1 took 0.104282140732s
PY str2 took 0.0739600658417s
CY str2 took 2.34380102158s
PY str3 took 0.224936962128s
CY str3 took 21.6859738827s

For reference, I tried this with Cython 0.19.1 and 0.23.4. I compiled the C code with gcc 4.8.2 and icc 14.0.2, trying different flags with both.

+4
2 answers

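Worth reading: PEP 8, Programming Recommendations:

Code should be written in a way that does not disadvantage other implementations of Python (PyPy, Jython, IronPython, Cython, Psyco, and such).

For example, do not rely on CPython's efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b. This optimization is fragile even in CPython (it only works for some types) and isn't present at all in implementations that don't use refcounting. In performance sensitive parts of the library, the ''.join() form should be used instead. This will ensure that concatenation occurs in linear time across various implementations.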

Reference: https://www.python.org/dev/peps/pep-0008/#programming-recommendations
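The two forms that PEP 8 contrasts can be compared directly. The following is a minimal sketch in plain Python; the function names concat_loop, join_loop and compare are made up for illustration, and range is used instead of the question's xrange so that the snippet also runs on Python 3:

```python
import timeit


def concat_loop(n):
    # Repeated += on an immutable str. Fast in CPython only when its
    # in-place optimization applies; elsewhere each step may copy the string.
    val = ""
    for _ in range(n):
        val += "a"
    return val


def join_loop(n):
    # Collect the pieces first, then build the final string in one step.
    parts = []
    for _ in range(n):
        parts.append("a")
    return "".join(parts)


def compare(n, number=5):
    """Return (concat_seconds, join_seconds) for building an n-character string."""
    t_concat = timeit.timeit(lambda: concat_loop(n), number=number)
    t_join = timeit.timeit(lambda: join_loop(n), number=number)
    return t_concat, t_join
```

Calling compare(100000) under the plain interpreter and again from a Cython-compiled copy of the module should show whether the += form is the one that degrades; the actual numbers depend on the interpreter, Cython version and compiler, so none are claimed here.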

+4

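Repeated string concatenation of this form is generally discouraged; some interpreters optimize it anyway (for example, by over-allocating and mutating a technically immutable object in place when that is known to be safe), but Cython generates its own code for these operations, which makes that optimization harder to apply.

The real answer is: "don't concatenate immutable types over and over" (it is wrong everywhere, just worse in Cython). A perfectly reasonable approach, which Cython should handle fine, is to build a list of the individual str pieces and then call ''.join(listofstr) at the end to create the final str in one step.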

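In any case, you are not giving Cython any type information to work with, so the speedups will not be very impressive. Try to help it with the easy things, and the gains there may more than make up for losses elsewhere. For example, cdef your loop variable and use ''.join: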

cpdef str2():
    cdef int i
    val = []
    for i in xrange(100000):  # Maybe range; Cython docs aren't clear if xrange optimized
        val.append('a')
    val = ''.join(val)
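
The same list-and-join pattern carries over to str3 from the question. The sketch below is plain Python with no cdef declarations, so it can be timed both interpreted and compiled unchanged; the name str3_join, the n parameter and the return value are assumptions added for illustration, and range stands in for xrange so that it also runs on Python 3:

```python
def str3_join(n=100000):
    # Hypothetical variant of str3 from the question: gather the pieces in a
    # list and build the final string once, instead of using += in the loop.
    parts = []
    for i in range(n):
        parts.append(str(i))
    # Returned so the result can be checked; the original str3 discards it.
    return ''.join(parts)
```

Whether this closes the gap seen in the str3 timings would need to be measured with the same compare.py setup.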
+2
