Slicing pandas DataFrame with negative index using ix() method

Question

DataFrame.ix() doesn't seem to slice the DataFrame the way I want when I use negative indexing.

I have a DataFrame object and want to slice the last 2 rows.

    In [90]: df = pd.DataFrame(np.random.randn(10, 4))

    In [91]: df
    Out[91]:
              0         1         2         3
    0  1.985922  0.664665 -2.800102  1.695480
    1  0.580509  0.782473  1.032970  1.559917
    2  0.584387  1.798743  0.095950  0.071999
    3  1.956221  0.075530 -0.391008  1.692585
    4 -0.644979 -1.959265  0.749394 -0.437995
    5 -1.204964  0.653912 -1.426602  2.409855
    6  1.178886  2.177259 -0.165106  1.145952
    7  1.410595 -0.761426 -1.280866  0.609122
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

One way to do this:

    In [92]: df[-2:]
    Out[92]:
              0         1         2         3
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

A more verbose way to do this:

    In [93]: df.ix[len(df)-2:, :]
    Out[93]:
              0         1         2         3
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

Now I want to use negative indexing instead, but it returns the whole DataFrame:

    In [94]: df.ix[-2:, :]
    Out[94]:
              0         1         2         3
    0  1.985922  0.664665 -2.800102  1.695480
    1  0.580509  0.782473  1.032970  1.559917
    2  0.584387  1.798743  0.095950  0.071999
    3  1.956221  0.075530 -0.391008  1.692585
    4 -0.644979 -1.959265  0.749394 -0.437995
    5 -1.204964  0.653912 -1.426602  2.409855
    6  1.178886  2.177259 -0.165106  1.145952
    7  1.410595 -0.761426 -1.280866  0.609122
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

How do I use negative indexing correctly with DataFrame.ix()? Thanks.

pandas slice indexing dataframe

Julia He Dec 26

2 answers

Wes McKinney · Answer 1 · 2012-12-27 00:12

This is a bug:

    In [1]: df = pd.DataFrame(np.random.randn(10, 4))

    In [2]: df
    Out[2]:
              0         1         2         3
    0 -3.100926 -0.580586 -1.216032  0.425951
    1 -0.264271 -1.091915 -0.602675  0.099971
    2 -0.846290  1.363663 -0.382874  0.065783
    3 -0.099879 -0.679027 -0.708940  0.138728
    4 -0.302597  0.753350 -0.112674 -1.253316
    5 -0.213237 -0.467802  0.037350  0.369167
    6  0.754915 -0.569134 -0.297824 -0.600527
    7  0.644742  0.038862  0.216869  0.294149
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439

    In [3]: df.ix[-2:]
    Out[3]:
              0         1         2         3
    0 -3.100926 -0.580586 -1.216032  0.425951
    1 -0.264271 -1.091915 -0.602675  0.099971
    2 -0.846290  1.363663 -0.382874  0.065783
    3 -0.099879 -0.679027 -0.708940  0.138728
    4 -0.302597  0.753350 -0.112674 -1.253316
    5 -0.213237 -0.467802  0.037350  0.369167
    6  0.754915 -0.569134 -0.297824 -0.600527
    7  0.644742  0.038862  0.216869  0.294149
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439

https://github.com/pydata/pandas/issues/2600

Note that df[-2:] will work:

    In [4]: df[-2:]
    Out[4]:
              0         1         2         3
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439
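
For readers on a current pandas release: `.ix` was deprecated in pandas 0.20 and removed in 1.0, so the `.ix` calls in this thread no longer run there. The following is a minimal sketch of the same slice using the position-based `.iloc` indexer. It assumes a recent pandas and a freshly generated frame, so the values will differ from those shown above.

```python
import numpy as np
import pandas as pd

# Same shape as the frame in the question; the values are random.
df = pd.DataFrame(np.random.randn(10, 4))

# .iloc is purely position-based, so negative positions count from the end.
last_two = df.iloc[-2:]

# The row labels of the result are the last two labels of the original index.
print(last_two.index.tolist())
```

Because `.iloc` never interprets its arguments as labels, this works regardless of what the index contains.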

Zelazny7 · Answer 2 · 2012-12-26 04:14

The main purpose of ix is to allow numpy-like indexing with support for row and column labels, so I am not sure your use case is what it was designed for. Here are a few ways I can think of, mostly trivial:

    In [142]: df.ix[:][-2:]
    Out[142]:
              0         1         2         3
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720

    In [161]: df.ix[df.index[-2:],:]
    Out[161]:
              0         1         2         3
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720

I do not think ix supports negative indexing at all. It seems to simply ignore the negative start:

    In [181]: df.ix[-100:,:]
    Out[181]:
              0         1         2         3
    0 -1.144137 -1.042034 -2.158838  0.674055
    1 -0.424184  1.237318 -1.846130  0.575357
    2 -0.844974 -0.541060  2.197364 -0.031898
    3  0.846263  1.244450 -1.570566 -0.477919
    4 -0.193445  0.171045 -0.235587 -1.185583
    5  1.361539 -1.107389 -1.321081 -0.776407
    6  0.505907 -1.364414 -2.093770  0.144016
    7 -0.888465 -0.329153  0.491264 -0.363472
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720
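
This output is consistent with the negative number being treated as a label rather than a position. `.ix` is not available in current pandas, so the sketch below uses `.loc`, the label-based indexer, as a stand-in. Whether `.ix` followed the same logic in 2012 is an assumption, not something this thread confirms.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 4))  # default integer index: labels 0..9

# .loc slices by label. On a sorted integer index, a start label that does
# not exist (-100 or -2) sorts before every existing label, so no rows are
# dropped from the front of the frame.
by_label_far = df.loc[-100:, :]
by_label_near = df.loc[-2:, :]

print(len(by_label_far), len(by_label_near))
```

Both slices should come back with all ten rows, which matches the "ignored" negative start seen above.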

Edit: from the pandas documentation we have:

Label-based indexing with integer axis labels is a thorny topic. It has been discussed heavily on mailing lists and among various members of the scientific Python community. In pandas, our general viewpoint is that labels matter more than integer locations. Therefore, with an integer axis index only label-based indexing is possible with the standard tools like .ix. The following code will generate exceptions:

    s = Series(range(5))
    s[-1]
    df = DataFrame(np.random.randn(5, 4))
    df
    df.ix[-2:]

This deliberate decision was made to prevent ambiguities and subtle bugs (many users reported finding bugs when the API change was made to stop "falling back" on position-based indexing).
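
The `s[-1]` case from the quoted snippet can be checked on a current pandas release. The sketch below assumes a recent version, where `.ix` no longer exists, so only the Series lookup is reproduced and `.iloc` is shown as the position-based alternative.

```python
import pandas as pd

s = pd.Series(range(5))  # default integer index: labels 0..4

# With an integer index, s[-1] is looked up as the label -1, which is absent.
try:
    s[-1]
    raised = False
except KeyError:
    raised = True

# Position-based access goes through .iloc instead.
last = s.iloc[-1]

print(raised, last)
```

Keeping label lookups (`[]`, `.loc`) separate from position lookups (`.iloc`) is what removes the ambiguity the quoted passage describes.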
