How to access a numpy array as fast as a pandas DataFrame

I have compared several ways to access data in a DataFrame; see the results below. The fastest access came from using the get_value method on the DataFrame, which was pointed out to me in this post.

I was surprised that access through get_value is faster than access through the underlying numpy object, df.values.

Question

My question is: is there a way to access the elements of a numpy array as fast as I can access the pandas DataFrame through get_value?

Setup

import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(16).reshape(4, 4))

Testing

%%timeit
df.iloc[2, 2]

10000 loops, best of 3: 108 µs per loop

%%timeit
df.values[2, 2]

The slowest run took 5.42 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.02 µs per loop

%%timeit
df.iat[2, 2]

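The slowest run took 4.96 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 9.85 µs per loop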

%%timeit
df.get_value(2, 2)

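The slowest run took 19.29 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.57 µs per loop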


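iloc is quite general: it accepts slices and lists as well as plain integers. In the case above, where you have simple integer indexing, pandas first checks that the key is a valid integer and then converts the request into an iat lookup, so it is bound to be slower. iat in turn resolves to a call to get_value, so a direct call to get_value is naturally going to be fast. get_value itself is cached, so micro-benchmarks like these may not reflect performance in real code.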
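To illustrate the difference in generality, here is a small sketch, assuming a pandas version where iat is available. iloc accepts integers, slices and lists, while iat only accepts a pair of scalar integers. The variable names are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4))

# iloc handles scalars, slices and lists, so it has to inspect the key first.
scalar_via_iloc = df.iloc[2, 2]
block_via_iloc = df.iloc[1:3, [0, 2]]

# iat is scalar-only, which leaves it less work to do per call.
scalar_via_iat = df.iat[2, 2]

try:
    df.iat[1:3, 2]
    iat_accepts_slice = True
except (ValueError, TypeError):
    # Non-integer keys such as slices are rejected by iat.
    iat_accepts_slice = False
```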

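df.values does return an ndarray, but only after checking that the data is a single contiguous block. That takes a few lookups and tests, so it is a bit slower than retrieving the value from the cache.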
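If many scalar reads are needed, one way to avoid paying the df.values overhead on every access is to extract the array once and index that directly. A minimal sketch, assuming the frame holds a single dtype; the name arr is hypothetical:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4))

# Pay for the .values attribute access once...
arr = df.values

# ...then every read is plain ndarray indexing with no pandas machinery.
total = 0
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        total += arr[i, j]

# arr.item(i, j) returns a Python scalar instead of a numpy scalar.
value = arr.item(2, 2)
```

Whether arr is a view or a copy of the frame's data can depend on the pandas version and the dtypes involved, so it is safest to treat it as read-only.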

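We can defeat the caching by creating a new DataFrame every time. This shows that the values accessor is the fastest, at least for data of a uniform type: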

In [111]: %timeit df = pd.DataFrame(np.arange(16).reshape(4, 4))
10000 loops, best of 3: 186 µs per loop

In [112]: %timeit df = pd.DataFrame(np.arange(16).reshape(4, 4)); df.values[2,2]
1000 loops, best of 3: 200 µs per loop

In [113]: %timeit df = pd.DataFrame(np.arange(16).reshape(4, 4)); df.get_value(2,2)
1000 loops, best of 3: 309 µs per loop

In [114]: %timeit df = pd.DataFrame(np.arange(16).reshape(4, 4)); df.iat[2,2]
1000 loops, best of 3: 308 µs per loop

In [115]: %timeit df = pd.DataFrame(np.arange(16).reshape(4, 4)); df.iloc[2,2]
1000 loops, best of 3: 420 µs per loop

In [116]: %timeit df = pd.DataFrame(np.arange(16).reshape(4, 4)); df.ix[2,2]
1000 loops, best of 3: 316 µs per loop

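It may not be obvious, but ix also comes out faster than iloc here; in this benchmark ix is roughly on par with iat and get_value, while iloc is the slowest.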
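For anyone wanting to repeat the comparison outside IPython, here is a rough sketch using the standard timeit module. Note that get_value and ix were removed in pandas 1.0, so this version uses at and iat instead, and also times an array extracted once up front. The labels and the results dictionary are illustrative, and the numbers will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4))
arr = df.values  # extracted once, outside the timed statements

# Each candidate reads the same element, [2, 2].
candidates = {
    "df.iloc[2, 2]": lambda: df.iloc[2, 2],
    "df.iat[2, 2]": lambda: df.iat[2, 2],
    "df.at[2, 2]": lambda: df.at[2, 2],
    "df.values[2, 2]": lambda: df.values[2, 2],
    "arr[2, 2]": lambda: arr[2, 2],
}

number = 2000
results = {}
for label, func in candidates.items():
    # Best of 3 repeats, converted to microseconds per call.
    best = min(timeit.repeat(func, number=number, repeat=3))
    results[label] = best / number * 1e6

# Print fastest first.
for label, usec in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{label:<18} {usec:8.3f} µs per loop")
```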
