Generally, we prefer to avoid chained indexing in pandas (although, strictly speaking, you are actually using two different indexing methods here). You cannot reliably modify your DataFrame this way (details in the docs), and the docs cite performance as another reason (indexing once versus twice).
For the latter, the difference is usually insignificant (or rather, unlikely to be a bottleneck in your code), and in fact it does not seem to hold at all (at least in the following example):
    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.random.uniform(size=(100000, 10)), columns=list('abcdefghij'))

    # Get columns number 2:5 where value in 'a' is greater than 0.5
    # (i.e. Boolean mask along axis 0, position slice of axis 1)

    # Deprecated .ix method (removed in pandas 1.0)
    %timeit df.ix[df['a'] > 0.5, 2:5]
    100 loops, best of 3: 2.14 ms per loop

    # Boolean, then position
    %timeit df.loc[df['a'] > 0.5,].iloc[:, 2:5]
    100 loops, best of 3: 2.14 ms per loop

    # Position, then Boolean
    %timeit df.iloc[:, 2:5].loc[df['a'] > 0.5,]
    1000 loops, best of 3: 1.75 ms per loop

    # .loc
    %timeit df.loc[df['a'] > 0.5, df.columns[2:5]]
    100 loops, best of 3: 2.64 ms per loop

    # .iloc
    %timeit df.iloc[np.where(df['a'] > 0.5)[0], 2:5]
    100 loops, best of 3: 9.91 ms per loop
Bottom line: if you really want to avoid .ix and do not intend to modify values in your DataFrame, just use chained indexing. On the other hand (the "proper", but arguably messier way), if you do need to modify values, use either .iloc with np.where() or .loc with integer slices of df.index or df.columns.
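To make the read-versus-write distinction concrete, here is a minimal sketch, assuming a smaller random frame than the one timed above. The names `chained`, `via_loc` and `via_iloc` are only illustrative, and the warning filter is there because pandas may warn about chained assignment depending on the version.

```python
import warnings

import numpy as np
import pandas as pd

# Smaller illustrative frame (assumption: 1000 rows instead of 100000)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(size=(1000, 10)), columns=list("abcdefghij"))
mask = df["a"] > 0.5

# Chained indexing: .loc with a Boolean mask returns a copy, so the
# assignment lands on that temporary object and the frame is unchanged.
chained = df.copy()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn about chained assignment
    chained.loc[mask].iloc[:, 2:5] = 0.0

# Single .loc call: Boolean mask on rows, integer slice of df.columns
via_loc = df.copy()
via_loc.loc[mask, via_loc.columns[2:5]] = 0.0

# Single .iloc call: row positions from np.where, position slice on columns
via_iloc = df.copy()
via_iloc.iloc[np.where(mask)[0], 2:5] = 0.0

print(chained.equals(df))        # chained assignment had no effect
print(via_loc.equals(via_iloc))  # both single-call forms agree
```

Both single-call forms write into the frame itself, which is why they are the ones to reach for when values need to change.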
Ken wei