How to get a random (bootstrap) sample from a pandas MultiIndex

Question

I am trying to create a bootstrapped sample from multi-index data in pandas. Below is the code to generate the data I need.

    from itertools import product
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'group1': [1, 1, 1, 2, 2, 3],
                       'group2': [13, 18, 20, 77, 109, 123],
                       'value1': [1.1, 2, 3, 4, 5, 6],
                       'value2': [7.1, 8, 9, 10, 11, 12]})
    df = df.set_index(['group1', 'group2'])
    print(df)

The df data frame looks like this:

                   value1  value2
    group1 group2
    1      13         1.1     7.1
           18         2.0     8.0
           20         3.0     9.0
    2      77         4.0    10.0
           109        5.0    11.0
    3      123        6.0    12.0

I want to get a random sample from the first index level, drawn with replacement. For example, let's say that the random values from np.random.randint(1, 4, size=3) come out as [3, 2, 2]. I would like the resulting data structure to look like this:

                   value1  value2
    group1 group2
    3      123        6.0    12.0
    2      77         4.0    10.0
           109        5.0    11.0
    2      77         4.0    10.0
           109        5.0    11.0

I spent a lot of time researching this, and I could not find a similar example where the multi-index values are integers, the secondary index has a variable length, and the primary index values are repeated. This is how I think a proper bootstrap implementation should work.

python pandas sampling multi-index

Chris Aug 2 '16 at 23:08

1 answer

piRSquared · Accepted Answer · 2016-08-02T23:48:41+0000

Try:

 df.unstack().sample(3, replace=True).stack()
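As an alternative to the one-liner above, the same idea can be written out explicitly: draw first-level labels with replacement, then concatenate the full block of rows for each drawn label. This is only a sketch, not part of the original answer. The helper name `bootstrap_groups` and its `level` and `seed` parameters are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'group1': [1, 1, 1, 2, 2, 3],
    'group2': [13, 18, 20, 77, 109, 123],
    'value1': [1.1, 2, 3, 4, 5, 6],
    'value2': [7.1, 8, 9, 10, 11, 12],
}).set_index(['group1', 'group2'])


def bootstrap_groups(frame, n, level='group1', seed=None):
    """Hypothetical helper: resample whole groups of `level` with replacement."""
    rng = np.random.default_rng(seed)
    # Distinct labels of the chosen index level, e.g. [1, 2, 3].
    labels = frame.index.get_level_values(level).unique().to_numpy()
    # Draw n labels with replacement; the same label may appear several times.
    drawn = rng.choice(labels, size=n, replace=True)
    # xs(..., drop_level=False) returns every row of a group and keeps
    # both index levels, so the pieces concatenate into the original layout.
    pieces = [frame.xs(label, level=level, drop_level=False) for label in drawn]
    return pd.concat(pieces)


sample = bootstrap_groups(df, 3, seed=0)
print(sample)
```

Because each group is copied as a block, repeated labels show up as repeated blocks, as in the desired output in the question. This version also does not depend on how `stack` treats the missing values that `unstack` introduces for groups of unequal length.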
