Merge multiple pandas columns into a new column

I have a data frame in which some columns indicate whether a set of survey questions has been considered. For instance:

    Q1_Seen  Q2_Seen  Q3_Seen  Q4_Seen
    Q1a      nan      nan      nan
    nan      Q2a      nan      nan
    nan      nan      Q3d      nan
    nan      Q2c      nan      nan

I would like to collapse these columns into a single column, say Q_Seen, which would look like this:

    Q_Seen
    Q1a
    Q2a
    Q3d
    Q2c

Note that the columns are mutually exclusive within each row: if one of the columns has a value, all the rest will be NaN.

I tried to do this with pd.concat, but it did not seem to give the correct results.

3 answers

Try the following:

    >>> df['Q_Seen'] = df.stack().values
    >>> df
    Q1_Seen  Q2_Seen  Q3_Seen  Q4_Seen  Q_Seen
    Q1a      nan      nan      nan      Q1a
    nan      Q2a      nan      nan      Q2a
    nan      nan      Q3d      nan      Q3d
    nan      Q2c      nan      nan      Q2c
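The one-liner above assigns by position, so it only lines up when every row has exactly one non-null value. As a minimal sketch (the sample frame is rebuilt by hand from the question, and the explicit check and the `droplevel` step are my additions, not part of the answer), the same idea can be written so that the result is aligned by row label and a row that is entirely NaN simply stays NaN:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "Q1_Seen": ["Q1a", np.nan, np.nan, np.nan],
    "Q2_Seen": [np.nan, "Q2a", np.nan, "Q2c"],
    "Q3_Seen": [np.nan, np.nan, "Q3d", np.nan],
    "Q4_Seen": [np.nan, np.nan, np.nan, np.nan],
})

# Verify the mutual-exclusivity assumption: at most one value per row.
counts = df.notna().sum(axis=1)
if (counts > 1).any():
    raise ValueError("some rows have more than one non-null value")

# Stack to long form, drop NaN explicitly (the default NaN handling of
# stack() differs between pandas versions), then drop the column level
# so the Series is indexed by the original row labels.
stacked = df.stack().dropna().droplevel(1)

# Assignment aligns on the row index rather than on position.
df["Q_Seen"] = stacked
```

Because the assignment aligns on the index, it does not depend on the order in which `stack()` returns its rows.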

Taking the row-wise maximum, i.e. max(axis=1), will collapse all values into a single column:

    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({"Q1_Seen": ['Q1a', None, None, None],
       ...:                    "Q2_Seen": [None, "Q2a", None, "Q2c"],
       ...:                    "Q3_Seen": [None, None, "Q3d", None],
       ...:                    "Q4_Seen": [None, None, None, None]})

    In [3]: df
    Out[3]:
      Q1_Seen Q2_Seen Q3_Seen Q4_Seen
    0     Q1a    None    None    None
    1    None     Q2a    None    None
    2    None    None     Q3d    None
    3    None     Q2c    None    None

    In [4]: df['Q_Seen'] = df.max(axis=1)

    In [5]: df
    Out[5]:
      Q1_Seen Q2_Seen Q3_Seen Q4_Seen Q_Seen
    0     Q1a    None    None    None    Q1a
    1    None     Q2a    None    None    Q2a
    2    None    None     Q3d    None    Q3d
    3    None     Q2c    None    None    Q2c
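Since max() has to compare the values with each other, a variant that only looks for the first non-null entry in each row may be worth considering. The following is a sketch of that alternative using back-fill along the columns, not the method from the answer above; the sample frame is again rebuilt from the question:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "Q1_Seen": ["Q1a", np.nan, np.nan, np.nan],
    "Q2_Seen": [np.nan, "Q2a", np.nan, "Q2c"],
    "Q3_Seen": [np.nan, np.nan, "Q3d", np.nan],
    "Q4_Seen": [np.nan, np.nan, np.nan, np.nan],
})

# Back-fill across columns: each cell takes the next non-null value to its
# right, so the first column ends up holding the first non-null value of
# each row (or NaN if the whole row is empty).
first_non_null = df.bfill(axis=1).iloc[:, 0]

df["Q_Seen"] = first_non_null
```

If a row ever held more than one value, this would keep the leftmost one, so it is only equivalent to the other answers under the mutual-exclusivity condition stated in the question.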

The following worked for me:

    import pandas as pd

    df = pd.DataFrame({'Q1': [1, None, None], 'Q2': [None, 2, None], 'Q3': [None, None, 3]})
    # concat is a top-level pandas function (pd.concat), not a DataFrame method
    df['Q'] = pd.concat([df['Q1'], df['Q2'], df['Q3']]).dropna()

There may be a more elegant solution, but this is the first thing that came to mind.
