Slicing a pandas DataFrame with a negative index using the ix() method

DataFrame.ix() doesn't seem to slice the DataFrame the way I want when using negative indexing.

I have a DataFrame object and want to slice the last 2 rows.

    In [90]: df = pd.DataFrame(np.random.randn(10, 4))

    In [91]: df
    Out[91]:
              0         1         2         3
    0  1.985922  0.664665 -2.800102  1.695480
    1  0.580509  0.782473  1.032970  1.559917
    2  0.584387  1.798743  0.095950  0.071999
    3  1.956221  0.075530 -0.391008  1.692585
    4 -0.644979 -1.959265  0.749394 -0.437995
    5 -1.204964  0.653912 -1.426602  2.409855
    6  1.178886  2.177259 -0.165106  1.145952
    7  1.410595 -0.761426 -1.280866  0.609122
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

One way to do this:

    In [92]: df[-2:]
    Out[92]:
              0         1         2         3
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

A more roundabout way to do this:

    In [93]: df.ix[len(df)-2:, :]
    Out[93]:
              0         1         2         3
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

Now I want to use negative indexing with ix, but I run into a problem: it returns every row instead of the last two.

    In [94]: df.ix[-2:, :]
    Out[94]:
              0         1         2         3
    0  1.985922  0.664665 -2.800102  1.695480
    1  0.580509  0.782473  1.032970  1.559917
    2  0.584387  1.798743  0.095950  0.071999
    3  1.956221  0.075530 -0.391008  1.692585
    4 -0.644979 -1.959265  0.749394 -0.437995
    5 -1.204964  0.653912 -1.426602  2.409855
    6  1.178886  2.177259 -0.165106  1.145952
    7  1.410595 -0.761426 -1.280866  0.609122
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

How do I use negative indexing correctly with DataFrame.ix()? Thanks.

pandas slice indexing dataframe
Dec 26
2 answers

This is a bug:

    In [1]: df = pd.DataFrame(np.random.randn(10, 4))

    In [2]: df
    Out[2]:
              0         1         2         3
    0 -3.100926 -0.580586 -1.216032  0.425951
    1 -0.264271 -1.091915 -0.602675  0.099971
    2 -0.846290  1.363663 -0.382874  0.065783
    3 -0.099879 -0.679027 -0.708940  0.138728
    4 -0.302597  0.753350 -0.112674 -1.253316
    5 -0.213237 -0.467802  0.037350  0.369167
    6  0.754915 -0.569134 -0.297824 -0.600527
    7  0.644742  0.038862  0.216869  0.294149
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439

    In [3]: df.ix[-2:]
    Out[3]:
              0         1         2         3
    0 -3.100926 -0.580586 -1.216032  0.425951
    1 -0.264271 -1.091915 -0.602675  0.099971
    2 -0.846290  1.363663 -0.382874  0.065783
    3 -0.099879 -0.679027 -0.708940  0.138728
    4 -0.302597  0.753350 -0.112674 -1.253316
    5 -0.213237 -0.467802  0.037350  0.369167
    6  0.754915 -0.569134 -0.297824 -0.600527
    7  0.644742  0.038862  0.216869  0.294149
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439

https://github.com/pydata/pandas/issues/2600

Note that df[-2:] will work:

    In [4]: df[-2:]
    Out[4]:
              0         1         2         3
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439
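
For readers on a pandas release where .ix no longer exists (it was deprecated in 0.20 and removed in 1.0), the following is a minimal sketch of the same "last two rows" selection. It assumes the default integer index from the question and uses .iloc, the purely positional indexer, as a stand-in for what the question wanted from ix; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Same shape as the frame in the question: 10 rows, default integer index 0..9.
df = pd.DataFrame(np.random.randn(10, 4))

# Plain [] with a slice selects rows by position, so -2 counts from the end.
last_two_plain = df[-2:]

# .iloc is always positional, whatever the index labels happen to be.
last_two_iloc = df.iloc[-2:, :]

print(last_two_iloc.index.tolist())  # [8, 9]
```

Both expressions should return the rows labelled 8 and 9 here, because positions and labels coincide for a default index.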
Dec 27

The main purpose of ix is to allow numpy-like indexing with support for row and column labels, so I'm not sure your use case is one it was intended for. Here are a few ways I can think of, mostly trivial:

    In [142]: df.ix[:][-2:]
    Out[142]:
              0         1         2         3
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720

    In [161]: df.ix[df.index[-2:],:]
    Out[161]:
              0         1         2         3
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720

I don't think ix supports negative indexing at all. It seems to simply ignore it:

    In [181]: df.ix[-100:,:]
    Out[181]:
              0         1         2         3
    0 -1.144137 -1.042034 -2.158838  0.674055
    1 -0.424184  1.237318 -1.846130  0.575357
    2 -0.844974 -0.541060  2.197364 -0.031898
    3  0.846263  1.244450 -1.570566 -0.477919
    4 -0.193445  0.171045 -0.235587 -1.185583
    5  1.361539 -1.107389 -1.321081 -0.776407
    6  0.505907 -1.364414 -2.093770  0.144016
    7 -0.888465 -0.329153  0.491264 -0.363472
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720
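
One plausible reading of this output is that the negative number is being treated as a label rather than a position. The sketch below illustrates that reading; it assumes .loc (the label-based indexer in current pandas) behaves the way ix did on an integer index, which is an assumption and not something the output above proves.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 4))

# Label-based slice on a sorted integer index: "all rows whose label is >= -2".
# Every label from 0 to 9 satisfies that, so nothing is filtered out.
by_label = df.loc[-2:, :]

# The same holds for any start label below the smallest label.
by_label_far = df.loc[-100:, :]

# Position-based slice: the last two rows.
by_position = df.iloc[-2:, :]

print(len(by_label), len(by_label_far), len(by_position))  # 10 10 2
```

Under this interpretation the slice is not ignored; it just happens to match every row because no label is smaller than the requested start.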

Edit: from the pandas documentation we have:

Label-based indexing with integer axis labels is a thorny topic. It has been discussed heavily on mailing lists and among various members of the scientific Python community. In pandas, our general viewpoint is that labels matter more than integer locations. Therefore, with an integer axis index only label-based indexing is possible with the standard tools like .ix. The following code will generate exceptions:

    s = Series(range(5))
    s[-1]
    df = DataFrame(np.random.randn(5, 4))
    df
    df.ix[-2:]

This deliberate decision was made to prevent ambiguities and subtle bugs (many users reported finding bugs when the API change was made to stop "falling back" on position-based indexing).
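
The label-versus-position distinction in the quoted passage can be sketched on the Series case. This is only an illustration on a recent pandas release, where looking up -1 on an integer-indexed Series is expected to raise KeyError; the names raised and last are hypothetical, and .iloc is used as the explicit positional alternative.

```python
import pandas as pd

s = pd.Series(range(5))  # integer index with labels 0..4

try:
    s[-1]            # label lookup: there is no label -1
    raised = False
except KeyError:
    raised = True

# Explicit positional access: the last element, regardless of labels.
last = s.iloc[-1]

print(raised, last)  # True 4
```

Spelling out .iloc for positions (and .loc for labels) avoids relying on how [] or ix resolve integers for a given index type.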

Dec 26 '12 at 4:14


