Pandas remove null values when to_json

Question

I have a pandas DataFrame and I want to save it in JSON format. The pandas docs say:

Note: NaN, NaT and None will be converted to null, and datetime objects will be converted based on the date_format and date_unit parameters.

Then, using the orient='records' option, I get something like this:

 [{"A":1,"B":4,"C":7},{"A":null,"B":5,"C":null},{"A":3,"B":null,"C":null}]

Is it possible to get this instead?

 [{"A":1,"B":4,"C":7},{"B":5},{"A":3}]

thanks

+7

json python pandas

mva Jun 18 '15 at 10:22

3 answers

I had the same problem, and my solution is to use the json module instead of pd.DataFrame.to_json().

My solution: drop the NaN values while converting the DataFrame to a dict, and then convert the dict to JSON using json.dumps().

Here is the code:

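 import json

 import pandas as pd


 def to_dict_dropna(df):
     # df.items() yields (column label, Series) pairs; it replaces
     # pandas.compat.iteritems, which newer pandas releases no longer provide.
     # int(k) and astype(int) assume integer-like column labels and values.
     return {int(k): v.dropna().astype(int).to_dict() for k, v in df.items()}


 json.dumps(to_dict_dropna(df))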

0

cssmlulu Aug 20 '15 at 8:43

The above solution does not actually produce results in the "records" format. This solution also uses the json package, but gives exactly the result asked for in the original question.

 import json

 import pandas as pd

 json.dumps([row.dropna().to_dict() for index, row in df.iterrows()])

Alternatively, if you want to include the index (and you are on Python 3.5+), you can do:

 json.dumps([{'index': index, **row.dropna().to_dict()} for index, row in df.iterrows()])
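
As a minimal end-to-end sketch of the list-comprehension approach in this answer: the construction of df below is an assumption, reconstructed from the JSON shown in the question, since the question does not show the DataFrame itself.

```python
import json

import numpy as np
import pandas as pd

# Assumed sample data, reconstructed from the JSON in the question.
df = pd.DataFrame(
    {"A": [1, np.nan, 3], "B": [4, 5, np.nan], "C": [7, np.nan, np.nan]}
)

# Drop the NaN entries row by row, then serialize the list of dicts.
records = [row.dropna().to_dict() for _, row in df.iterrows()]
result = json.dumps(records)
print(result)
```

Note that the values come out as floats (1.0 rather than 1), because a column that contains NaN is stored as float64.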

0

Dave decaprio Oct 26 '17 at 13:02

Edchum · Accepted Answer · 2015-06-18T10:42:11+0000

The following comes close to what you want. Essentially, we create a list of the non-NaN values for each row and then call to_json on the result:

 In [136]: df.apply(lambda x: [x.dropna()], axis=1).to_json()
 Out[136]: '{"0":[{"a":1.0,"b":4.0,"c":7.0}],"1":[{"b":5.0}],"2":[{"a":3.0}]}'

You need to wrap the result in a list; otherwise pandas will try to align it with the shape of your original df, which brings back the NaN values that you want to avoid:

 In [138]: df.apply(lambda x: pd.Series(x.dropna()), axis=1).to_json()
 Out[138]: '{"a":{"0":1.0,"1":null,"2":3.0},"b":{"0":4.0,"1":5.0,"2":null},"c":{"0":7.0,"1":null,"2":null}}'

Also, calling list on the result of dropna will broadcast the result across the original shape, like filling:

 In [137]: df.apply(lambda x: list(x.dropna()), axis=1).to_json()
 Out[137]: '{"a":{"0":1.0,"1":5.0,"2":3.0},"b":{"0":4.0,"1":5.0,"2":3.0},"c":{"0":7.0,"1":5.0,"2":3.0}}'
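
If the flat "records" shape from the question is needed, the first result in this answer (Out[136]) can be post-processed with the json module. This is only a sketch, not part of the original answer; it works on the output string copied from Out[136], and the variable names are made up for illustration.

```python
import json

# Output string copied from Out[136] in this answer.
nested = '{"0":[{"a":1.0,"b":4.0,"c":7.0}],"1":[{"b":5.0}],"2":[{"a":3.0}]}'

# Each value is a one-element list wrapping the row dict, so unwrap it.
# Row order follows the JSON text, since dicts preserve insertion order
# in Python 3.7+.
records = [wrapped[0] for wrapped in json.loads(nested).values()]

# Compact separators match the spacing that to_json produces.
flat = json.dumps(records, separators=(",", ":"))
print(flat)
```

This prints [{"a":1.0,"b":4.0,"c":7.0},{"b":5.0},{"a":3.0}], which is the shape asked for, at the cost of a JSON round trip.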
