Pandas: group by a field and combine values into one row

I would like to know how to group a DataFrame by a field and then collapse each group into a single row, giving priority to non-empty values. In my example the rows are grouped by id.

2 answers

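I think you can use replace with groupby and sum. Summing strings concatenates them, so once the 'null' placeholders are replaced with empty strings, each group collapses to its non-empty values: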

print(df.replace('null', '').groupby('id').sum().replace('', 'null'))
    A  B     C
id            
1   a  r     s
2   q  d  null
3   w  b  null

If the missing values are real nulls (None or NaN) rather than the string 'null', use fillna:

print(df.fillna('').groupby('id').sum().replace('', 'null'))
    A  B     C
id            
1   a  r     s
2   q  d  null
3   w  b  null
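
One caveat with the sum approach: because it concatenates strings, a group holding more than one non-empty value in the same column gets those values joined together rather than one of them picked. A minimal sketch of that behaviour follows. The sample data is made up for illustration and is not from the question.

```python
import pandas as pd

# Hypothetical sample data: id 2 has two non-empty values in column A.
df = pd.DataFrame({
    'id': [1, 1, 2, 2],
    'A': ['a', None, 'q', 'z'],
    'B': [None, 'r', 'd', None],
})

# Replace missing values with empty strings, then sum (concatenate) per group.
summed = df.fillna('').groupby('id').sum()

print(summed)
```

For id 1 each column has a single non-empty value, so the result is what the question asks for. For id 2, column A ends up holding both values joined, which may or may not be what you want.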

Setup

import pandas as pd

data = [
    [1, 'a', None, None],
    [1, None, 'r', None],
    [1, None, None, 's'],
    [2, None, 'd', None],
    [2, 'q', None, None],
    [3, None, 'b', None],
    [3, 'w', None, None]
]

df = pd.DataFrame(data, columns=['id', 'A', 'B', 'C'])

df looks like

   id     A     B     C
0   1     a  None  None
1   1  None     r  None
2   1  None  None     s
3   2  None     d  None
4   2     q  None  None
5   3  None     b  None
6   3     w  None  None

Solution

df.set_index('id').stack().unstack()

The result looks like

    A  B     C
id            
1   a  r     s
2   q  d  None
3   w  b  None
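
The stack/unstack trick works because stack drops the missing values, leaving at most one value per (id, column) pair for unstack to spread back out. It relies on that dropping behaviour, which may differ between pandas versions. As an alternative that is not part of the original answer, groupby followed by first picks the first non-null value of each column within each group. A minimal sketch, reusing the setup data above:

```python
import pandas as pd

data = [
    [1, 'a', None, None],
    [1, None, 'r', None],
    [1, None, None, 's'],
    [2, None, 'd', None],
    [2, 'q', None, None],
    [3, None, 'b', None],
    [3, 'w', None, None],
]
df = pd.DataFrame(data, columns=['id', 'A', 'B', 'C'])

# first() skips missing values, so each cell holds the first
# non-null entry of that column within the group (if any).
result = df.groupby('id').first()

print(result)
```

Unlike the sum approach, this keeps only one value when a group has several non-empty entries in the same column, and it leaves a missing value where a group has none.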