I have a dataframe in which all values are of the same kind (for example, a correlation matrix, but one where we expect a unique maximum). I would like to return the row and column of the maximum of this matrix.
I can get the max by row or by column by changing the first argument (axis) of
df.idxmax()
However, I have not found a suitable way to return the row/column index of the max of the entire data frame.
For example, I can do this in numpy:
>>> import numpy as np
>>> npa = np.array([[1, 2, 3], [4, 9, 5], [6, 7, 8]])
>>> np.where(npa == np.amax(npa))
(array([1]), array([1]))
But when I try something similar in pandas, I get back a masked frame rather than the row and column labels:
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2, 3], [4, 9, 5], [6, 7, 8]], columns=list('abc'), index=list('def'))
>>> df.where(df == df.max().max())
     a    b    c
d  NaN  NaN  NaN
e  NaN    9  NaN
f  NaN  NaN  NaN
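For reference, the closest I can get to the numpy behaviour is to drop down to the underlying array and map the positions back to labels. This is only a minimal sketch; the names `values`, `row_label` and `col_label` are just illustrative, and it assumes the maximum is unique:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 9, 5], [6, 7, 8]],
                  columns=list('abc'), index=list('def'))

values = df.to_numpy()
# argmax() works on the flattened array; unravel_index converts the
# flat position back into (row position, column position).
i, j = np.unravel_index(values.argmax(), values.shape)
# Translate integer positions into the frame's own labels.
row_label, col_label = df.index[i], df.columns[j]
print(row_label, col_label)
```

This gives the labels of a single maximum, but it feels like a workaround rather than a pandas-native answer.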
As a second step, what I would like to do is return the rows and columns of the top n values, for example as a Series.
E.g. for the frame above, I need a function that does:
>>> topn(df, 3)
b    e
c    f
b    f
dtype: object
>>> type(topn(df, 3))
pandas.core.series.Series
or even just
>>> topn(df, 3)
(['b', 'c', 'b'], ['e', 'f', 'f'])
à la numpy.where()
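To make the desired behaviour concrete, here is a rough sketch of the kind of `topn` function I have in mind. It is an assumption about how this could be done (sorting the flattened numpy array and mapping positions back to labels), not an established pandas idiom, and the order among tied values is whatever the sort happens to produce:

```python
import numpy as np
import pandas as pd

def topn(df, n):
    """Return a Series of row labels, indexed by column labels,
    for the n largest values in df (largest first)."""
    values = df.to_numpy()
    # Flat positions sorted ascending, then reversed and truncated to n.
    flat = np.argsort(values, axis=None)[::-1][:n]
    # Convert flat positions into (row, column) integer positions.
    rows, cols = np.unravel_index(flat, values.shape)
    return pd.Series(df.index[rows].to_numpy(), index=df.columns[cols])

df = pd.DataFrame([[1, 2, 3], [4, 9, 5], [6, 7, 8]],
                  columns=list('abc'), index=list('def'))
result = topn(df, 3)
print(result)
```

The tuple form would then just be `(list(result.index), list(result))`.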