Re-indexing error does not make sense

I have DataFrames ranging in size from 100 to 2 m rows. The one I am dealing with in this question is this large, but note that I will have to do the same for the other frames:

 >>> len(data)
 357451

This frame was created by compiling many files, so its index is really strange. All I wanted to do was reindex it with range(len(data)), but I get this error:

 >>> data.reindex(index=range(len(data)))
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 2542, in reindex
     fill_value, limit)
   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 2618, in _reindex_index
     limit=limit)
   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/index.py", line 893, in reindex
     limit=limit)
   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/index.py", line 812, in get_indexer
     raise Exception('Reindexing only valid with uniquely valued Index '
 Exception: Reindexing only valid with uniquely valued Index objects

This doesn't actually make sense. Since I'm reindexing with an array containing the numbers from 0 to 357450, all index objects are unique! Why is it returning this error?

Additional info: I am using Python 2.7 and pandas 0.11.0.

1 answer

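When it complains that "Reindexing only valid with uniquely valued Index objects", it isn't objecting that your new index is not unique; it's objecting that your old one isn't. reindex looks up each new label in the existing index, and that lookup is ambiguous when the existing index contains repeated labels.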

For instance:

 >>> df = pd.DataFrame(range(5), index = [1,2,3,1,2])
 >>> df
    0
 1  0
 2  1
 3  2
 1  3
 2  4
 >>> df.reindex(index=range(len(df)))
 Traceback (most recent call last):
 [...]
   File "/usr/local/lib/python2.7/dist-packages/pandas-0.12.0.dev_0bd5e77-py2.7-linux-i686.egg/pandas/core/index.py", line 849, in get_indexer
     raise Exception('Reindexing only valid with uniquely valued Index '
 Exception: Reindexing only valid with uniquely valued Index objects

but

 >>> df.index = range(len(df))
 >>> df
    0
 0  0
 1  1
 2  2
 3  3
 4  4

Although I think I'd write

 df.reset_index(drop=True) 

instead.
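As a rough sketch of both points (the old index being the problem, and reset_index being the fix), the snippet below rebuilds the toy frame from above. It assumes a reasonably recent pandas rather than the 0.11/0.12 versions in the tracebacks; the column name "value" and the variable name "clean" are made up for illustration.

```python
import pandas as pd

# Same toy data as in the answer: the existing index repeats labels 1 and 2.
df = pd.DataFrame({"value": range(5)}, index=[1, 2, 3, 1, 2])

# It is the *old* index that reindex() objects to, so check that one.
print(df.index.is_unique)

# Labels that occur more than once (each later repeat is flagged).
print(df.index[df.index.duplicated()].tolist())

# reset_index(drop=True) discards the old labels and returns a NEW frame
# with a fresh 0..n-1 index; df itself is left unchanged, so keep the result.
clean = df.reset_index(drop=True)
print(clean.index.is_unique)
```

Note that reset_index does not modify the frame in place by default, so the result has to be assigned (here to "clean") for the new index to stick.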
