Parsing JSON with Python: empty fields

I'm having trouble parsing JSON with Python, and now I'm stuck.
The problem is that the entries in my JSON do not always have the same fields. The JSON looks something like this:

    "entries": [
        {
            "summary": "here is the summary",
            "extensions": {
                "coordinates": "coords",
                "address": "address",
                "name": "name",
                "telephone": "123123",
                "url": "www.blablablah"
            }
        }
    ]

I can navigate through the JSON, for example:

    for entrie in entries:
        name = entrie['extensions']['name']
        tel = entrie['extensions']['telephone']

The problem arises because the JSON sometimes does not have all the fields. For example, the telephone field is sometimes absent, so the script ends with a KeyError because the telephone key is not in that entry.
So my question is: how can I run this script and leave an empty value where the phone is missing? I tried:

    if entrie['extensions']['telephone']:
        tel = entrie['extensions']['telephone']

but I don't think this is right.

+8
json python parsing
4 answers

Use dict.get instead of []:

    entrie['extensions'].get('telephone', '')

Or simply:

    entrie['extensions'].get('telephone')

get returns its second argument (which defaults to None) instead of raising a KeyError when the key is not found.
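As a minimal sketch, here is dict.get applied to a loop like the one in the question. The sample data is made up for illustration; the second entry deliberately has no telephone key.

```python
import json

# Hypothetical sample data: the second entry has no "telephone" key.
json_txt = '''
{"entries": [
    {"extensions": {"name": "first", "telephone": "123123"}},
    {"extensions": {"name": "second"}}
]}
'''

entries = json.loads(json_txt)["entries"]

results = []
for entrie in entries:
    ext = entrie["extensions"]
    name = ext.get("name", "")
    tel = ext.get("telephone", "")      # '' when the key is absent
    tel_or_none = ext.get("telephone")  # None when the key is absent
    results.append((name, tel, tel_or_none))

print(results)
```

Whether '' or None is the better placeholder depends on what the rest of the script expects; None makes "missing" distinguishable from an empty string that was actually present in the data.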

+11

If data is missing in only one place, dict.get can be used to fill in the missing value:

    tel = d['entries'][0]['extensions'].get('telephone', '')

If the problem is more widespread, you can have the JSON parser build a defaultdict or a custom dictionary instead of a regular dict. For example, given a JSON string:

    json_txt = '''{
        "entries": [
            {
                "extensions": {
                    "telephone": "123123",
                    "url": "www.blablablah",
                    "name": "name",
                    "coordinates": "coords",
                    "address": "address"
                },
                "summary": "here is the summary"
            }
        ]
    }'''

Parse it with:

    >>> import json
    >>> class BlankDict(dict):
    ...     def __missing__(self, key):
    ...         return ''
    ...
    >>> d = json.loads(json_txt, object_hook=BlankDict)
    >>> d['entries'][0]['summary']
    u'here is the summary'
    >>> d['entries'][0]['extensions']['color']
    ''

As a side note, if you want to clean up your datasets and ensure consistency, there is a great tool called Kwalify that does schema validation for JSON (and YAML).
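Returning to the defaultdict option mentioned earlier in this answer, a rough sketch could look like the following. The helper name blank_defaultdict and the shortened sample string are assumptions for illustration only.

```python
import json
from collections import defaultdict

# Shortened, hypothetical sample: "extensions" has no "telephone" key.
json_txt = '{"entries": [{"summary": "here is the summary", "extensions": {"name": "name"}}]}'

def blank_defaultdict(obj):
    # object_hook receives each decoded JSON object as a plain dict;
    # wrap it so that missing keys yield str() == ''.
    return defaultdict(str, obj)

d = json.loads(json_txt, object_hook=blank_defaultdict)
ext = d["entries"][0]["extensions"]

name = ext["name"]
tel = ext["telephone"]  # missing key: returns '' and stores it in ext
```

One difference worth keeping in mind: a defaultdict inserts the missing key when it is looked up, whereas a dict subclass whose __missing__ only returns a value leaves the dictionary unchanged.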

+8

There are several useful dictionary features you can use to deal with this.

First, you can use in to check whether a key is present in the dictionary:

    if 'telephone' in entrie['extensions']:
        tel = entrie['extensions']['telephone']
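To make the difference from the attempt in the question concrete, here is a small sketch with a made-up entry that has no telephone key. The membership test never raises, while indexing the key inside the if condition still does.

```python
# Hypothetical entry without a phone number.
entrie = {"extensions": {"name": "name"}}
ext = entrie["extensions"]

# Membership test: safe whether or not the key exists.
if "telephone" in ext:
    tel = ext["telephone"]
else:
    tel = ""  # leave an empty value when the phone is missing

# Indexing first, as in the question's attempt, raises KeyError here.
try:
    if ext["telephone"]:
        tel_from_index = ext["telephone"]
    raised = False
except KeyError:
    raised = True

print(repr(tel), raised)
```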

get can also be useful; it lets you specify a default value to use if the key is missing:

    tel = entrie['extensions'].get('telephone', '')

Alternatively, you could look at collections.defaultdict from the standard library, but that might be overkill.

0

There are two ways.

One is to normalize your dictionaries so that, once they are read, they have all the fields. The other is to be careful when accessing the dictionaries.

Here is an example of normalizing a dictionary:

    import json

    __reference_extensions = {
        # fill in with all standard keys
        # use some default value to go with each key
        "coordinates": '',
        "address": '',
        "name": '',
        "telephone": '',
        "url": ''
    }

    entrie = json.loads(input_string)
    d = entrie["extensions"]
    # .items() is needed to iterate over (key, value) pairs
    for key, value in __reference_extensions.items():
        if key not in d:
            d[key] = value

Here is an example of being careful when accessing dictionaries:

    for entrie in entries:
        name = entrie['extensions'].get('name', '')
        tel = entrie['extensions'].get('telephone', '')
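The same careful access can be pushed one level up. The sketch below assumes, beyond what the question states, that the whole extensions object might also be absent from an entry; the sample entries are made up.

```python
# Hypothetical entries: the second one has no "extensions" object at all.
entries = [
    {"extensions": {"name": "a", "telephone": "123123"}},
    {"summary": "no extensions here"},
]

rows = []
for entrie in entries:
    ext = entrie.get("extensions", {})  # fall back to an empty dict
    name = ext.get("name", "")
    tel = ext.get("telephone", "")
    rows.append((name, tel))

print(rows)
```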
0
