Pandas: replace with .ix

Question

Given the pandas 0.20.0 update and the deprecation of .ix, I'm wondering what the most efficient way is to get the same result using the remaining .loc and .iloc. I just answered this question, but the second option (without using .ix) seems inefficient and verbose.

Snippet:

 print(df.iloc[df.loc[df['cap'].astype(float) > 35].index, :-1])

Is this the right way to combine conditional filtering with index-position filtering?

+9

python pandas indexing

elPastor May 08 '17 at 2:43

3 answers

Generally, you would prefer to avoid chained indexing in pandas (though, strictly speaking, you are actually using two different indexing methods). You cannot modify your DataFrame this way (details in the docs), and the docs cite performance as another reason (indexing once versus twice).

As for the latter, it is usually insignificant (or rather, unlikely to be a bottleneck in your code), and in fact it does not seem to hold here (at least in the following example):

 import numpy as np
 import pandas as pd

 df = pd.DataFrame(np.random.uniform(size=(100000,10)),columns = list('abcdefghij'))

 # Get columns number 2:5 where value in 'a' is greater than 0.5
 # (i.e. Boolean mask along axis 0, position slice of axis 1)

 # Deprecated .ix method (removed entirely in pandas 1.0)
 %timeit df.ix[df['a'] > 0.5,2:5]
 100 loops, best of 3: 2.14 ms per loop

 # Boolean, then position
 %timeit df.loc[df['a'] > 0.5,].iloc[:,2:5]
 100 loops, best of 3: 2.14 ms per loop

 # Position, then Boolean
 %timeit df.iloc[:,2:5].loc[df['a'] > 0.5,]
 1000 loops, best of 3: 1.75 ms per loop

 # .loc
 %timeit df.loc[df['a'] > 0.5, df.columns[2:5]]
 100 loops, best of 3: 2.64 ms per loop

 # .iloc
 %timeit df.iloc[np.where(df['a'] > 0.5)[0],2:5]
 100 loops, best of 3: 9.91 ms per loop

Bottom line: if you really want to avoid .ix and do not intend to modify values in your DataFrame, just go with chained indexing. On the other hand (the "proper" but arguably messier way), if you do need to modify values, use either .iloc with np.where() or .loc with integer slices of df.index or df.columns.
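
To illustrate the point about modifying values, here is a minimal sketch on a small made-up frame (the frame, `mask`, and the other variable names are assumptions for illustration, not part of the benchmark above). It shows a chained assignment leaving the frame untouched, while a single .loc or .iloc call writes through. Depending on the pandas version, the chained line may emit a warning, which is silenced here.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical 4x5 frame; values 0..19 laid out row by row.
df = pd.DataFrame(np.arange(20, dtype=float).reshape(4, 5), columns=list("abcde"))
original = df.copy()
mask = df["a"] > 4  # True for rows 1, 2 and 3

# Chained indexing: the boolean .loc returns a new object, so the
# assignment lands on that temporary and df itself is not modified.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.loc[mask].iloc[:, 2:4] = 0
chained_changed_df = not df.equals(original)

# Single .loc call: boolean mask for rows, labels sliced by position for columns.
df.loc[mask, df.columns[2:4]] = 0

# Single .iloc call: convert the mask to integer positions first.
rows = np.flatnonzero(mask.to_numpy())
df.iloc[rows, 4] = -1

print(chained_changed_df)
print(df)
```

For read-only selection the two styles return the same rows and columns, so the choice mostly matters once assignment is involved.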

+6

Ken wei May 08 '17 at 8:53

How about breaking this down into two-step indexing:

 df[df['cap'].astype(float) > 35].iloc[:,:-1]

or even:

 df[df['cap'].astype(float) > 35].drop('cap', axis=1)
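
As a rough check that the two variants agree, here is a small sketch on invented data (the column names other than `cap` and all values are assumptions). It presumes `cap` holds numeric strings and is the last column, which is what the `astype(float)` cast and the `:-1` slice in the question suggest.

```python
import pandas as pd

# Hypothetical data; 'cap' is stored as strings and sits in the last column.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "qty": [1, 2, 3, 4],
    "cap": ["10", "40", "35.5", "20"],
})

# Step 1: boolean filter on rows.
rows = df[df["cap"].astype(float) > 35]

# Step 2, variant A: drop the last column by position.
by_position = rows.iloc[:, :-1]

# Step 2, variant B: drop the column by name.
by_label = rows.drop("cap", axis=1)

same = by_position.equals(by_label)
print(by_position)
print(same)
```

The two only coincide while `cap` is the last column; dropping by name keeps working if the column order changes.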

+3

Psidom May 08 '17 at 2:52

piRSquared · Accepted Answer · 2017-05-08T04:28:44+0000

You can stay in the world of a single loc by getting the index values you need: slice the relevant index (here df.columns) by position and pass the resulting labels to loc.

 df.loc[ df['cap'].astype(float) > 35, df.columns[:-1] ]
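
One reason this form is convenient is that it does not depend on the row labels being 0..n-1. The sketch below uses an invented frame with a non-default index (all names and values other than `cap` are assumptions) to compare it with a single .iloc call driven by a boolean array, and to show what happens when row labels are handed to .iloc as if they were positions, as in the question's snippet.

```python
import pandas as pd

# Hypothetical frame whose row labels are deliberately not 0..n-1.
df = pd.DataFrame(
    {"qty": [1, 2, 3, 4], "cap": ["10", "40", "35.5", "20"]},
    index=[10, 20, 30, 40],
)
mask = df["cap"].astype(float) > 35

# Single .loc: boolean mask for rows, column labels obtained by positional slice.
via_loc = df.loc[mask, df.columns[:-1]]

# Single .iloc: a plain boolean array for rows, positional slice for columns.
via_iloc = df.iloc[mask.to_numpy(), :-1]

# Passing labels (20, 30) to .iloc treats them as positions; with only
# four rows they are out of bounds.
try:
    df.iloc[df.loc[mask].index, :-1]
    labels_as_positions_ok = True
except IndexError:
    labels_as_positions_ok = False

print(via_loc)
print(via_loc.equals(via_iloc), labels_as_positions_ok)
```

With a default RangeIndex the question's snippet happens to work because labels and positions coincide.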
