pandas DataFrame reset_index that can handle duplicate column names?

Question

Is there an equivalent of pandas.DataFrame.reset_index() that works on columns and can handle duplicate column names? I want it to discard the column names and return the default integer labels 0, 1, 2, ... for the columns. (Methods like df.rename or df.reindex_axis do not work when column names are duplicated.)

Input Example:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame(np.random.rand(5, 3), columns=['A', 'A', 'B'])

         A    A    B
    0  0.5  0.3  0.9
    1  0.7  0.9  0.3
    2  0.9  0.4  0.8
    3  0.6  0.2  0.9
    4  0.7  0.4  0.6

Expected Result:

         0    1    2
    0  0.8  0.1  0.2
    1  0.4  0.2  0.4
    2  0.3  0.3  0.4
    3  0.4  0.1  0.8
    4  1.0  0.9  0.9
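To illustrate why df.rename does not help here, the following is a minimal sketch (the sample frame and the mapping are assumptions made up for the demonstration). rename maps labels rather than positions, so both 'A' columns receive the same new name and the duplication remains:

```python
import numpy as np
import pandas as pd

# Sample frame with a duplicated column label, as in the question.
df = pd.DataFrame(np.random.rand(5, 3), columns=["A", "A", "B"])

# rename works label by label, so every "A" column is mapped to the same new name.
renamed = df.rename(columns={"A": 0, "B": 1})

print(list(renamed.columns))        # [0, 0, 1]
print(renamed.columns.is_unique)    # False: the duplicate is still there
```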

python pandas duplicates dataframe

Flab Jul 21 '16 at 10:43

2 answers

Assign a range to df.columns, taking the number of columns from df.shape:

    df.columns = range(df.shape[1])
    print(df)

              0         1         2
    0  0.228080  0.884450  0.753401
    1  0.176790  0.741979  0.525305
    2  0.680255  0.730258  0.449681
    3  0.169420  0.660825  0.986554
    4  0.302204  0.040413  0.902899
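For anyone who wants to try this end to end, here is a small self-contained sketch of the range assignment. The sample frame is random data built for illustration. The assignment replaces only the labels and works in place on df; the values stay the same:

```python
import numpy as np
import pandas as pd

# Sample frame with duplicate column names (random values, for illustration only).
df = pd.DataFrame(np.random.rand(5, 3), columns=["A", "A", "B"])
values_before = df.to_numpy().copy()

# Replace the column labels with 0..n-1; df.shape[1] is the number of columns.
df.columns = range(df.shape[1])

print(df.columns.tolist())                            # [0, 1, 2]
print(np.array_equal(df.to_numpy(), values_before))   # True: data untouched
```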

Another solution is a double transpose (T) around reset_index with the drop=True parameter:

    df = df.T.reset_index(drop=True).T
    print(df)

              0         1         2
    0  0.024846  0.688193  0.887926
    1  0.284681  0.895319  0.142876
    2  0.440834  0.299527  0.762815
    3  0.936967  0.928907  0.642960
    4  0.801077  0.085773  0.866651
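One thing to keep in mind with the double transpose: it is fine for an all-float frame like the one in the question, but on a frame with mixed column dtypes the transpose can change them. A rough sketch with a made-up int/float frame (the names mixed, roundtrip and relabeled are just for this example), compared against plain relabeling:

```python
import pandas as pd

# Made-up frame: one integer column and one float column, then duplicate labels.
mixed = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})
mixed.columns = ["A", "A"]

# Double transpose: the intermediate transposed frame needs one common dtype.
roundtrip = mixed.T.reset_index(drop=True).T

# Plain relabeling on a copy: only the labels change.
relabeled = mixed.copy()
relabeled.columns = range(relabeled.shape[1])

print(roundtrip.dtypes.tolist())   # both columns are float after the round trip
print(relabeled.dtypes.tolist())   # integer column stays integer
```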

jezrael Jul 21 '16 at 10:44

Maxu · Accepted Answer · 2016-07-21T11:14:31+0000

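You can use the set_axis() method. Current pandas versions take the labels first and return a new DataFrame, so the result is assigned back here. (The call as originally posted in 2016, df.set_axis(1, range(len(df.columns))), used the older axis-first signature, which modified df in place.)

    In [54]: df
    Out[54]:
              A         A         B
    0  0.934900  0.817182  0.166270
    1  0.064543  0.139431  0.249576
    2  0.709349  0.731913  0.965048
    3  0.284955  0.479898  0.496652
    4  0.520749  0.464256  0.999993

    In [55]: df = df.set_axis(range(len(df.columns)), axis=1)

    In [56]: df
    Out[56]:
              0         1         2
    0  0.934900  0.817182  0.166270
    1  0.064543  0.139431  0.249576
    2  0.709349  0.731913  0.965048
    3  0.284955  0.479898  0.496652
    4  0.520749  0.464256  0.999993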
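If you need this in several places, the set_axis call can be wrapped in a small helper. This is only a sketch: reset_columns is a hypothetical name, not a pandas function, and it assumes pandas 1.0 or later, where set_axis takes the labels first and returns a new object instead of modifying the frame in place:

```python
import numpy as np
import pandas as pd


def reset_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a new frame with columns labelled 0..n-1."""
    # Assumes the labels-first signature set_axis(labels, axis=...).
    return frame.set_axis(range(frame.shape[1]), axis=1)


# Sample frame with duplicate column names (random values, for illustration only).
df = pd.DataFrame(np.random.rand(5, 3), columns=["A", "A", "B"])
out = reset_columns(df)

print(list(out.columns))   # [0, 1, 2]
print(list(df.columns))    # ['A', 'A', 'B'] (the original keeps its labels)
```

Returning a new frame makes the helper usable in method chains, for example through df.pipe(reset_columns).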
