Is MATLAB faster than Python (a small simple experiment)

I read this question ( Is MATLAB faster than Python? ) and found that the answer is full of ifs.

I tried this little experiment on an old computer that still runs Windows XP.

In MATLAB R2010b, I copied and pasted the following code into the command window:

 tic
 x = 0.23;
 for i = 1:100000000
     x = 4 * x * (1 - x);
 end
 toc
 x

The result is:

 Elapsed time is 0.603583 seconds.
 x = 0.947347510922557

Then I saved a .py file with the following script:

 import time

 t = time.time()
 x = 0.23
 for i in range(100000000):
     x = 4 * x * (1 - x)
 elapsed = time.time() - t
 print(elapsed)
 print(x)

I hit F5 and the result was

 49.78125
 0.9473475109225565

MATLAB took 0.60 seconds; Python took 49.78 seconds (an eternity!).

So the question is: is there an easy way to make Python as fast as MATLAB?

In particular: how can I change my .py script so that it runs as fast as MATLAB?


UPDATE

I tried the same experiment in PyPy (copy-pasting the same code as above): it finished in 1.0470001697540283 seconds on the same computer as before.

I repeated the experiments with 1e9 loop iterations.

MATLAB Results:

 Elapsed time is 5.599789 seconds.
 1.643573442831396e-004

PyPy results:

 8.609999895095825
 0.00016435734428313955

I also tried the usual while loop, with similar results:

 t = time.time()
 x = 0.23
 i = 0
 while (i < 1000000000):
     x = 4 * x * (1 - x)
     i += 1
 elapsed = time.time() - t
 elapsed
 x

Results:

 8.218999862670898
 0.00016435734428313955

I am going to try NumPy in a while.

+5
3 answers

First of all, using time.time() is not a good way to benchmark code like this. But let's ignore that for now.
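A more robust way to time small snippets, assuming plain CPython and the standard library, is the timeit module, which handles clock selection and repetition for you. A minimal sketch (the loop is scaled down so the benchmark itself stays quick):

```python
import timeit

def logistic_loop(n=100_000):
    # Same iterated map as in the question, just fewer iterations.
    x = 0.23
    for _ in range(n):
        x = 4 * x * (1 - x)
    return x

# repeat() runs the timer several times; the minimum is usually the
# least-noisy estimate of the true cost.
best = min(timeit.repeat(logistic_loop, number=1, repeat=3))
print(f"best of 3: {best:.4f} s")
```

With number= and repeat= you can trade measurement noise against total benchmark time.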


When you have code that runs a hot loop, doing a very similar job on every pass, PyPy's JIT will do an excellent job. When the code does exactly the same thing every time, on constant values that can be hoisted out of the loop, it does even better. CPython, on the other hand, has to execute several bytecodes for every iteration of the loop, so it will be slow. From a quick test on my machine, CPython 3.4.1 takes 24.2 seconds, but PyPy 2.4.0/3.2.5 takes 0.0059 seconds.

IronPython and Jython are also JIT-compiled (although they use the more general .NET and JVM JITs), so they are usually faster than CPython for this kind of work.


You can also speed up work like this within CPython itself by using NumPy arrays and vector operations instead of Python lists and loops. For example, the following code takes 0.011 seconds:

 import numpy as np

 x = 0.23
 i = np.arange(10000000)
 i[:] = 4 * x * (1 - x)

Of course, in this case we just compute the value once and copy it 10,000,000 times. But we can force it to actually compute over and over, and it still takes only 0.12 seconds:

 i = np.zeros((10000000,))
 i = 4 * (x + i) * (1 - (x + i))

Other options include writing part of the code in Cython (which compiles to a C extension for Python) and using Numba, which JIT-compiles code inside CPython. For toy programs like this, neither may be appropriate: the time taken to generate and compile the C code can outweigh the time saved by running C code instead of Python code, if you are only trying to optimize a one-off 24-second run. But in real numerical programming, both are very useful. (And both play well with NumPy.)
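As a rough sketch of the Numba route on the original loop (the try/except fallback below is only an assumption of mine, so the snippet still runs as plain CPython when numba is not installed):

```python
try:
    from numba import njit          # Numba's no-Python-mode JIT decorator
except ImportError:
    def njit(func):                 # fallback: run undecorated in CPython
        return func

@njit
def logistic(n, x):
    # The same iterated map from the question; under Numba this
    # compiles to a tight machine-code loop on the first call.
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
    return x

result = logistic(1_000_000, 0.23)
print(result)
```

The first call pays the compilation cost; subsequent calls run at roughly C-loop speed.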

And there are always new projects on the horizon.

+11

A (somewhat educated) hunch is that Python does not perform loop unrolling on your code, while MATLAB does. This means the MATLAB code performs one big computation rather than many (!) smaller ones. This is the main reason to go with PyPy rather than CPython, since PyPy does perform loop unrolling.

If you are using Python 2.x, you should substitute xrange for range , since range (in Python 2.x) creates a list to iterate over.
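A version-agnostic way to write this, assuming you need to support both interpreters, is to pick the lazy range once at import time:

```python
# On Python 2, range() builds the full list of 100 million ints before
# looping, while xrange() yields them lazily; Python 3's range() is
# already lazy, so there xrange does not exist.
try:
    range_ = xrange          # Python 2
except NameError:
    range_ = range           # Python 3

x = 0.23
for _ in range_(1000):       # loop shortened for illustration
    x = 4 * x * (1 - x)
print(x)
```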

+4

Q: How do I change my .py script so that it runs as fast as MATLAB?

As abarnet has already given you plenty of knowledgeable directions, let me add my two cents (and some quantitative results).

(likewise, I hope you will forgive me for skipping your for: loop and assuming a more complex computational task instead)

  • review the code for any possible algorithmic improvements, value reuse, and cache-friendly memory layout ( numpy.asfortranarray() , etc.)

  • use vectorised code / loop unrolling in numpy where possible

  • use the numba LLVM-based compiler for the stable parts of your code

  • use additional (JIT) compiler tricks ( nogil = True, nopython = True ) only for the final polishing of the code, to avoid the common error of premature optimization

The gains that are possible are really huge:

Where nanoseconds matter

The sample source code is taken from the FX arena, where milliseconds, microseconds and (wasted) nanoseconds really matter: for 50% of market events you have far less than 900 milliseconds to act (round-trip, bi-directional transaction), not to speak of HFT. It computes EMA(200,CLOSE) — a non-trivial exponential moving average over the last 200 GBPUSD candles/bars in an array of about 5200+ rows:

 import numba
 #@jit                                  # 2015-06: @autojit deprecated

 @numba.jit('f8[:](i8,f8[:])')
 def numba_EMA_fromPrice( N_period, aPriceVECTOR ):
     EMA = aPriceVECTOR.copy()
     alf = 2. / ( N_period + 1 )
     for aPTR in range( 1, EMA.shape[0] ):
         EMA[aPTR] = EMA[aPTR-1] + alf * ( aPriceVECTOR[aPTR] - EMA[aPTR-1] )
     return EMA

For this "classic" code, the numba compilation step alone brought an improvement over the usual python/numpy execution of

21x, down to about half a millisecond

 # 541L

from about 11,499 [us] (yes, from about 11,500 microseconds down to just 541 [us])

 # classical numpy
 # aClk.start(); X[:,7] = EMA_fromPrice( 200, price_H4_CLOSE ); aClk.stop()
 # 11499L

But if you are more careful about the algorithm and redesign it so that it works smarter and more efficiently, the results are even more fruitful:

 @numba.jit
 def numba_EMA_fromPrice_EFF_ALGO( N_period, aPriceVECTOR ):
     alfa = 2. / ( N_period + 1 )
     coef = ( 1 - alfa )
     EMA  = aPriceVECTOR * alfa
     EMA[1:] += EMA[0:-1] * coef
     return EMA

 # aClk.start(); numba_EMA_fromPrice_EFF_ALGO( 200, price_H4_CLOSE ); aClk.stop()
 # Out[112]: 160814L                    # JIT-compile pass
 # Out[113]:    331L                    # re-use: 0.3 [ms] vs 11.5 [ms] CPython
 # Out[114]:    311L
 # Out[115]:    324L

And a final touch, polishing it to run across multiple processor cores:


46x, accelerated to about a quarter of a millisecond

 #            ___________vvvvv__________ # !!!                        !!!
 #@numba.jit( nogil = True )             # JIT w/o GIL-lock, w/ multi-CORE
 #                                       # ** WARNING: ThreadSafe / DataCoherency measures needed **
 # aClk.start(); numba_EMA_fromPrice_EFF_ALGO( 200, price_H4_CLOSE ); aClk.stop()
 # Out[126]: 149929L                     # JIT-compile pass
 # Out[127]:    284L                     # re-use: 0.3 [ms] vs 11.5 [ms] CPython
 # Out[128]:    256L

As a final bonus: faster is not always the same as better.

Surprised?

No, there is nothing strange about it. Try making MATLAB compute SQRT(2) to a precision of 500,000,000 places after the decimal point. There it is.
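To illustrate the precision point: CPython's standard decimal module can compute sqrt(2) to arbitrary precision out of the box (far fewer than 500,000,000 places in this sketch, of course):

```python
from decimal import Decimal, getcontext

getcontext().prec = 50                 # 50 significant digits
root2 = Decimal(2).sqrt()
print(root2)

# Sanity check: at a higher working precision, squaring the root
# recovers 2 to within the rounding error of that precision.
getcontext().prec = 100
assert abs(Decimal(2) - Decimal(2).sqrt() ** 2) < Decimal(10) ** -90
```

The precision is limited only by memory and patience, not by the 64-bit float format.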

Nanoseconds do matter. All the more so here, where accuracy is the goal.


Isn't it worth the time and effort? Of course it is.

0
