Dictionary column in pandas dataframe

Question

I have a CSV file that I read into a pandas DataFrame. However, one of the columns holds a dictionary. Here is an example:

ColA, ColB, ColC, ColdD
20, 30, {"ab":"1", "we":"2", "as":"3"},"String"

How can I turn this into a DataFrame that looks like this:

ColA, ColB, AB, WE, AS, ColdD
20, 30, "1", "2", "3", "String"

Edit: I have clarified the question. The column looks like a dictionary, but it is a string that needs to be parsed, not a dict object.

python dictionary pandas

user1274037 Mar 29 '15 at 3:56

3 answers

Bob haffner · Answer 1 · 2015-03-29T15:15:27+0000

Starting from your one-row df:

    Col A   Col B   Col C                           Col D
0   20      30      {u'we': 2, u'ab': 1, u'as': 3}  String1

EDIT: based on OP comment, I assume we need to convert the string first

import ast
# Parse each string in Col C into a real dict
df["Col C"] = df["Col C"].map(ast.literal_eval)

Then we convert Col C to a dict, build a DataFrame from it, transpose it, and join the result to the original df:

dfNew = df.join(pd.DataFrame(df["Col C"].to_dict()).T)
dfNew

which gives you this

    Col A   Col B   Col C                           Col D   ab  as  we
0   20      30      {u'we': 2, u'ab': 1, u'as': 3}  String1 1   3   2

Then we just select the columns we want from dfNew:

dfNew[["Col A", "Col B", "ab", "we", "as", "Col D"]]

    Col A   Col B   ab  we  as  Col D
0   20      30      1   2   3   String1
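
Putting the steps above together, an end-to-end sketch might look like the following. It assumes Col C arrives as a string holding a Python-style dict literal, which is what ast.literal_eval can parse. The sample row is made up to mirror the question, and the names expanded and result are only illustrative.

```python
import ast

import pandas as pd

# Mock frame: Col C is a *string* that merely looks like a dict (assumed input)
df = pd.DataFrame(
    [[20, 30, '{"ab": "1", "we": "2", "as": "3"}', "String1"]],
    columns=["Col A", "Col B", "Col C", "Col D"],
)

# Step 1: parse each string into a real dict
df["Col C"] = df["Col C"].map(ast.literal_eval)

# Step 2: {row_label: dict} -> DataFrame, transposed so each key becomes a column
expanded = pd.DataFrame(df["Col C"].to_dict()).T

# Step 3: drop the original dict column and join on the shared index
result = df.drop(columns="Col C").join(expanded)

# Step 4: pick the column order asked for in the question
result = result[["Col A", "Col B", "ab", "we", "as", "Col D"]]
print(result)
```

Note that ast.literal_eval only evaluates literals, so it is safer than eval for data read from a file, but it raises an error on anything that is not a valid literal.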

psychemedia · Answer 2 · 2017-02-01T01:41:18+0000

As suggested in fooobar.com/questions/279482/..., you can call .apply(pd.Series) on the dict column to expand each dict into separate columns, one per key:

dw=pd.DataFrame( [[20, 30, {"ab":"1", "we":"2", "as":"3"},"String"]],
                columns=['ColA', 'ColB', 'ColC', 'ColdD'])
pd.concat([dw.drop(['ColC'], axis=1), dw['ColC'].apply(pd.Series)], axis=1)

This returns:

ColA    ColB    ColdD   ab  as  we
20      30      String  1   3   2
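
As a possible variation on the above, the dicts can be handed straight to the DataFrame constructor instead of going through .apply(pd.Series), which builds one Series per row and can be slow on large frames. This is a sketch that assumes every cell in ColC is already a dict; the names expanded and out are only illustrative.

```python
import pandas as pd

dw = pd.DataFrame(
    [[20, 30, {"ab": "1", "we": "2", "as": "3"}, "String"]],
    columns=["ColA", "ColB", "ColC", "ColdD"],
)

# A list of dicts becomes one row per dict and one column per key;
# reusing dw.index keeps the rows aligned for the concat below
expanded = pd.DataFrame(dw["ColC"].tolist(), index=dw.index)

# Drop the dict column and glue the expanded columns on
out = pd.concat([dw.drop(columns="ColC"), expanded], axis=1)
print(out)
```

If some rows lack a key, the corresponding cells come out as NaN rather than raising an error.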

jedwards · Answer 3 · 2015-03-29T05:30:30+0000

Sort of:

import pandas as pd

# Create mock dataframe
df = pd.DataFrame([
    [20, 30, {'ab':1, 'we':2, 'as':3}, 'String1'],
    [21, 31, {'ab':4, 'we':5, 'as':6}, 'String2'],
    [22, 32, {'ab':7, 'we':8, 'as':9}, 'String2'],
], columns=['Col A', 'Col B', 'Col C', 'Col D'])

# Create dataframe where you'll store the dictionary values
ddf = pd.DataFrame(columns=['AB','WE','AS'])

# Populate ddf dataframe
for (i,r) in df.iterrows():
    e = r['Col C']
    ddf.loc[i] = [e['ab'], e['we'], e['as']]

# Replace df with the output of concat(df, ddf)
df = pd.concat([df, ddf], axis=1)

# New column order, also drops old Col C column
df = df[['Col A', 'Col B', 'AB', 'WE', 'AS', 'Col D']]

print(df)

Output:

   Col A  Col B  AB  WE  AS    Col D
0     20     30   1   2   3  String1
1     21     31   4   5   6  String2
2     22     32   7   8   9  String2
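
The iterrows loop above can probably be avoided. Here is a loop-free sketch using pd.json_normalize (available in pandas 1.0 and later) that produces the same upper-case column names. It assumes the dicts are flat and that every row holds a dict; the name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([
    [20, 30, {'ab': 1, 'we': 2, 'as': 3}, 'String1'],
    [21, 31, {'ab': 4, 'we': 5, 'as': 6}, 'String2'],
    [22, 32, {'ab': 7, 'we': 8, 'as': 9}, 'String2'],
], columns=['Col A', 'Col B', 'Col C', 'Col D'])

# Flatten the list of dicts into columns, then upper-case the key names
ddf = pd.json_normalize(df['Col C'].tolist()).rename(columns=str.upper)

# json_normalize returns a fresh 0..n-1 index, so realign it with df
ddf.index = df.index

# Concatenate, then select the final column order (this also drops Col C)
result = pd.concat([df, ddf], axis=1)
result = result[['Col A', 'Col B', 'AB', 'WE', 'AS', 'Col D']]
print(result)
```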
