Optimization for PyPy

(This is a continuation of "Statistical Profiler for PyPy".)

I am running some Python code under PyPy and would like to optimize it.

In CPython, I would use statprof or line_profiler to find out exactly which lines cause the slowdown and then try to work around them. Under PyPy, however, neither tool reports sensible results, because PyPy may optimize some lines away. I would also prefer not to use cProfile, since I find it very hard to work out which part of a reported function is the bottleneck.

Does anyone have any tips on how to proceed? Maybe another profiler that works great under PyPy? In general, how can I optimize Python code for PyPy?

1 answer

If you understand how the PyPy architecture works, you will realize that trying to pinpoint individual lines of code is not very productive. You start with a Python interpreter written in RPython, which is then run through a tracing JIT that generates flow graphs and transforms those graphs to optimize the RPython interpreter. This means that the layout of your Python code, as run by the JIT'ed RPython interpreter, may have a very different structure from the optimized assembler that actually executes. Also, keep in mind that the JIT always works on a loop or a function, so line-by-line statistics are not as meaningful. Therefore, I think cProfile may indeed be a good option for you, as it will give you an idea of where to concentrate your optimization. Once you know which functions are your bottlenecks, you can spend your optimization effort on those slow functions instead of trying to fix a single line of Python code.
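As a rough illustration of the function-level approach described above, here is a minimal sketch that runs a program under cProfile and prints the functions with the highest cumulative time. The functions `slow_sum_of_squares`, `cheap_setup`, `main`, and the helper `profile_report` are hypothetical stand-ins, not part of the question's code; only the standard-library `cProfile`, `pstats`, and `io` modules are assumed.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    # Hypothetical hot function standing in for a real bottleneck.
    total = 0
    for i in range(n):
        total += i * i
    return total


def cheap_setup():
    # Hypothetical cheap function, included only for contrast in the report.
    return list(range(10))


def main():
    cheap_setup()
    return slow_sum_of_squares(200000)


def profile_report(func, top=10):
    """Run func under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_report(main)
    print(report)
```

The report only says which functions dominate the run time. If a reported function is too large to reason about, one option is to split it into smaller helper functions and profile again, so that the function-level report becomes finer grained.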

Keep in mind as you do this that PyPy has very different performance characteristics from CPython. Always try to write code in as simple a way as possible (which does not mean in as few lines as possible). A few other heuristics help as well: keep lists homogeneous so that PyPy can use its specialized list representations, prefer objects over dicts when you have a small number of mostly constant keys, avoid C extensions that use the CPython C API, and so on.
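To make the "objects over dicts" heuristic concrete, here is a small sketch of the same record represented both ways. The `Point` class and both functions are hypothetical examples. Whether the object version is actually faster for a given workload is an assumption to verify by measuring under PyPy, not something this snippet demonstrates.

```python
class Point(object):
    # A small, fixed set of attributes, always assigned in the same order.
    def __init__(self, x, y):
        self.x = x
        self.y = y


def total_length_objects(points):
    # Attribute access on instances that all share the same layout.
    total = 0.0
    for p in points:
        total += (p.x * p.x + p.y * p.y) ** 0.5
    return total


def total_length_dicts(points):
    # The same computation with each record stored as a dict.
    total = 0.0
    for p in points:
        total += (p["x"] * p["x"] + p["y"] * p["y"]) ** 0.5
    return total


if __name__ == "__main__":
    # Both lists are homogeneous: every element has the same type.
    as_objects = [Point(3.0, 4.0) for _ in range(3)]
    as_dicts = [{"x": 3.0, "y": 4.0} for _ in range(3)]
    print(total_length_objects(as_objects))
    print(total_length_dicts(as_dicts))
```

Both functions return the same value; the only difference is how each record is stored, which makes the pair easy to time against each other.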

If you really, really insist on trying to optimize at the line level, there are a few options. One of them is JitViewer ( https://bitbucket.org/pypy/jitviewer ), which gives you a very low-level view of what the JIT is doing with your code. For example, you can even see the assembler instructions that correspond to a Python loop. With this tool you can get a real sense of how fast PyPy will be on particular parts of the code, since you can now do silly things such as counting the number of assembler instructions used for your loop.
