Pandas. Group by field and combine values in one line

Question

I want to know how to group a DataFrame by a field and then combine each group into a single row, giving priority to non-empty values. This is an example in which it is grouped by id:

+4

python pandas

E.Aarón Apr 21 '16 at 11:45

2 answers

Setup

import pandas as pd

data = [
    [1, 'a', None, None],
    [1, None, 'r', None],
    [1, None, None, 's'],
    [2, None, 'd', None],
    [2, 'q', None, None],
    [3, None, 'b', None],
    [3, 'w', None, None]
]

df = pd.DataFrame(data, columns=['id', 'A', 'B', 'C'])

df looks like

   id     A     B     C
0   1     a  None  None
1   1  None     r  None
2   1  None  None     s
3   2  None     d  None
4   2     q  None  None
5   3  None     b  None
6   3     w  None  None

Solution

df.set_index('id').stack().unstack()

The result looks like

    A  B     C
id            
1   a  r     s
2   q  d  None
3   w  b  None
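A variation on the same idea, as a minimal sketch rather than part of the original answer: GroupBy.first() returns the first non-null value of each column within each group, so it does not rely on stack() dropping missing values (behavior that, as far as I know, is changing in newer pandas releases). The data is the same sample as in the setup above; the name combined is only illustrative.

```python
import pandas as pd

data = [
    [1, 'a', None, None],
    [1, None, 'r', None],
    [1, None, None, 's'],
    [2, None, 'd', None],
    [2, 'q', None, None],
    [3, None, 'b', None],
    [3, 'w', None, None],
]
df = pd.DataFrame(data, columns=['id', 'A', 'B', 'C'])

# first() skips missing values, so each column keeps
# its first non-null entry for every id.
combined = df.groupby('id').first()
print(combined)
```

Groups that have no value at all in a column (C for ids 2 and 3 here) stay missing in the result.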

+1

piRSquared Apr 21 '16 at 13:13

jezrael · Accepted Answer · 2016-04-21T12:42:41+0000

I think you can use replace with groupby and sum:

print(df.replace('null', '').groupby('id').sum().replace('', 'null'))
    A  B     C
id            
1   a  r     s
2   q  d  null
3   w  b  null

If the missing values are real nulls (None/NaN) rather than the string 'null', use fillna:

print(df.fillna('').groupby('id').sum().replace('', 'null'))
    A  B     C
id            
1   a  r     s
2   q  d  null
3   w  b  null
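For reference, a self-contained sketch of the fillna variant, assuming the sample frame from piRSquared's answer (real None values rather than the string 'null'). The name result is only illustrative.

```python
import pandas as pd

data = [
    [1, 'a', None, None],
    [1, None, 'r', None],
    [1, None, None, 's'],
    [2, None, 'd', None],
    [2, 'q', None, None],
    [3, None, 'b', None],
    [3, 'w', None, None],
]
df = pd.DataFrame(data, columns=['id', 'A', 'B', 'C'])

# Replace missing values with empty strings so that sum() can
# concatenate the strings within each group, then mark cells that
# are still empty as 'null'.
result = df.fillna('').groupby('id').sum().replace('', 'null')
print(result)
```

Because sum concatenates strings, a group with two non-empty values in the same column would produce them joined together (for example 'ab') rather than picking one, so this approach fits best when each group has at most one value per column.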
