Doing arithmetic with a multi-indexed pandas DataFrame that needs broadcasting at several levels

Question

I have a dataframe that looks like this:

         one                                 two                               three
           1                 2                 1                 2                 1                 2
           X        Y        X        Y        X        Y        X        Y        X        Y        X        Y
 a      0.3     -0.6     -0.3     -0.2  1.5e+00      0.3 -1.0e+00      1.2      0.6 -9.8e-02     -0.4      0.4
 b     -0.6     -0.4     -1.1      2.3 -7.4e-02      0.7 -7.4e-02     -0.5     -0.3 -6.8e-01      1.1     -0.1

How can I divide all elements of df by df["three"]?

I tried df.div(df["three"],level=[1,2]) with no luck.

+4

python pandas

avs Aug 6 '15 at 3:58

1 answer

John · Accepted Answer · 2015-08-06T15:02:17+0000

Here is a one-liner.

 df / pd.concat( [ df.three ] * 3, axis=1 ).values

And here's another way, which is slightly less concise, but may be more readable.

 df2 = df.copy()
 for c in df.columns.levels[0]:
     df2[c] = df[c] / df['three']
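Both short versions can be tried on a frame shaped like the one in the question. The following is a minimal sketch, not the asker's actual data: the three-level columns are rebuilt with `MultiIndex.from_product`, the values are random, and the names `n_groups`, `one_liner` and `looped` are made up for the illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame: three column levels,
# random values (the question's actual numbers are not reproduced here).
rng = np.random.default_rng(0)
columns = pd.MultiIndex.from_product([["one", "two", "three"], [1, 2], ["X", "Y"]])
df = pd.DataFrame(rng.standard_normal((2, 12)), index=["a", "b"], columns=columns)

# One-liner: repeat the "three" block once per top-level group and
# drop its labels with .values so the division is purely positional.
n_groups = 3  # "one", "two", "three"
one_liner = df / pd.concat([df["three"]] * n_groups, axis=1).values

# Loop: divide each top-level block by the "three" block.
looped = df.copy()
for c in df.columns.levels[0]:
    looped[c] = df[c] / df["three"]

# The "three" block divided by itself should be all ones.
print(one_liner["three"])
```

In this sketch the one-liner depends on the columns being grouped by top level with the same sub-column order inside each group, because the repeated block is matched by position rather than by label.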

And finally, here is a longer solution with more explanation. I did it this way originally, before realizing there are better ways. But I will keep it here, as it is more informative about what happens behind the scenes in an operation like this (although it is possibly overkill).

First, a multi-index does not copy and paste well, so I will create sample data that is very similar.

 import numpy as np
 import pandas as pd

 np.random.seed(123)
 tuples = list(zip(*[['one', 'one', 'two', 'two', 'three', 'three'],
                     ['foo', 'bar', 'foo', 'bar', 'foo', 'bar']]))
 index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
 df = pd.DataFrame(np.random.randn(3, 6), index=['A', 'B', 'C'], columns=index)

 first        one                 two               three
 second       foo       bar       foo       bar       foo       bar
 A      -1.085631  0.997345  0.282978 -1.506295 -0.578600  1.651437
 B      -2.426679 -0.428913  1.265936 -0.866740 -0.678886 -0.094709
 C       1.491390 -0.638902 -0.443982 -0.434351  2.205930  2.186786

The simplest approach is probably to expand the denominator threefold so that it matches the dimensions of the full dataframe. Alternatively, you could loop over the columns, but then you have to recombine them, which may not be as easy as you would think with a multi-index. So broadcast the "three" column like this.

 denom = pd.concat( [df['three']]*3, axis=1 )
 denom = pd.DataFrame( denom.values, columns=df.columns, index=df.index )

 first        one                 two               three
 second       foo       bar       foo       bar       foo       bar
 A      -0.578600  1.651437 -0.578600  1.651437 -0.578600  1.651437
 B      -0.678886 -0.094709 -0.678886 -0.094709 -0.678886 -0.094709
 C       2.205930  2.186786  2.205930  2.186786  2.205930  2.186786

The first "denom" line simply expands the "three" column so that it has the same shape as the existing dataframe. The second "denom" line is needed so that the row and column indexes match those of df. Now you can write a normal division operation.

 df / denom

 first        one                 two               three
 second       foo       bar       foo       bar       foo       bar
 A       1.876305  0.603926 -0.489074 -0.912112         1         1
 B       3.574501  4.528744 -1.864725  9.151619         1         1
 C       0.676082 -0.292165 -0.201267 -0.198625         1         1
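For comparison, the column-by-column route mentioned earlier can be sketched as well. This is one possible way to do the recombination, not something taken from the answer: it assumes `pd.concat` with `keys=` is an acceptable way to rebuild the top column level, and the names `tops`, `pieces` and `recombined` are made up for the illustration.

```python
import numpy as np
import pandas as pd

# Same sample frame as above.
np.random.seed(123)
tuples = list(zip(["one", "one", "two", "two", "three", "three"],
                  ["foo", "bar", "foo", "bar", "foo", "bar"]))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
df = pd.DataFrame(np.random.randn(3, 6), index=["A", "B", "C"], columns=index)

# Top-level labels in their order of appearance: one, two, three.
tops = list(df.columns.get_level_values("first").unique())

# Divide each block by "three"; df[c] drops the top column level,
# so both operands share the plain foo/bar columns and align by label.
pieces = [df[c] / df["three"] for c in tops]

# Glue the blocks back together; keys= restores the dropped top level.
recombined = pd.concat(pieces, axis=1, keys=tops)
print(recombined)
```

The extra bookkeeping with `keys` is the recombination step that the expanded-denominator approach avoids.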

A quick note on the one-liner compared with this longer solution. The values in the one-liner converts the dataframe to an array, which has the convenient side effect of erasing the row and column indexes. In this longer solution, by contrast, I explicitly make the indexes conform. Depending on your situation, either approach could be the better one.
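The difference between the two routes can be made concrete with a short check. This is a minimal sketch on the same sample data; the names `raw`, `via_values` and `via_labels` are made up for the illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
tuples = list(zip(["one", "one", "two", "two", "three", "three"],
                  ["foo", "bar", "foo", "bar", "foo", "bar"]))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
df = pd.DataFrame(np.random.randn(3, 6), index=["A", "B", "C"], columns=index)

# .values returns a bare NumPy array: no row or column labels survive.
raw = pd.concat([df["three"]] * 3, axis=1).values
print(type(raw).__name__, raw.shape)

# Route 1 (one-liner): divide by the bare array, matched purely by position.
via_values = df / raw

# Route 2 (longer solution): re-attach df's own labels, then divide with alignment.
denom = pd.DataFrame(raw, columns=df.columns, index=df.index)
via_labels = df / denom

# Raises an AssertionError if the two results differ.
pd.testing.assert_frame_equal(via_values, via_labels)
```

Since the array route matches by position only, it relies on the repeated block lining up with the column order of df; the labelled route makes that correspondence explicit.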
