Python JSON parser: allowing duplicate keys

I need to parse a JSON file which, unfortunately for me, does not follow the usual format. I have two issues with the data. I have already found a workaround for the second one, so I'll just mention it at the end in case someone can help there as well.

So, I need to parse such entries:

    "Test":{
        "entry":{
            "Type":"Something"
                },
        "entry":{
            "Type":"Something_Else"
                }
           }, ...

By default the json parser updates the dictionary as it goes and therefore keeps only the last entry. I need to somehow keep the other one as well, and I have no idea how to do this. I also have to store the keys of the nested dictionaries in the same order in which they appear in the file, which is why I use OrderedDict. That works fine, so if there is any way to extend it to handle duplicate entries, I would be grateful.

The second problem is that the same JSON file also contains entries like this:

         "Test":{
                   {
                       "Type":"Something"
                   }
                }

The json.load() function throws an exception when it reaches this line in the JSON file. The only workaround I found was to remove the inner braces manually.

Thanks in advance

+7
2 answers

You can use JSONDecoder.object_pairs_hook to customize how JSONDecoder decodes objects. This hook function is passed a list of (key, value) pairs, which you would usually process in some way and then turn into a dict.

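However, since Python dictionaries don't allow duplicate keys (and you simply can't change that), you can return the pairs unchanged in the hook and get a nested list of (key, value) pairs when you decode your JSON: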

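    from json import JSONDecoder


    def parse_object_pairs(pairs):
        # Return the (key, value) pairs untouched instead of building a dict,
        # so duplicate keys are not collapsed.
        return pairs


    data = """
    {"foo": {"baz": 42}, "foo": 7}
    """

    decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)
    obj = decoder.decode(data)
    print(obj)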

Output (Python 2; Python 3 omits the u prefixes):

    [(u'foo', [(u'baz', 42)]), (u'foo', 7)]

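How you use this data structure is up to you. As stated above, Python dictionaries won't allow duplicate keys, and there is no way around that. How would you even do a lookup based on a key? dct[key] would be ambiguous.

So you can either implement your own logic to handle lookups the way you expect them to work, or implement some sort of collision avoidance that makes the keys unique where they are not, and then create a dictionary from your nested list.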
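
Since a plain dict cannot hold the same key twice, one alternative to renaming keys is to keep every value under its original key in a list. The following is only a minimal sketch of that idea: the hook name collect_duplicates is made up for illustration, and the sample data is taken from the question. It assumes you are fine with every value being wrapped in a list, even when a key occurs only once.

```python
import json
from collections import OrderedDict


def collect_duplicates(pairs):
    # Hypothetical hook: map each key to the list of all values seen for it,
    # in the order they appear in the document.
    result = OrderedDict()
    for key, value in pairs:
        result.setdefault(key, []).append(value)
    return result


data = '{"Test": {"entry": {"Type": "Something"}, "entry": {"Type": "Something_Else"}}}'
obj = json.loads(data, object_pairs_hook=collect_duplicates)

# Every value is a list, so each level needs an index or a loop.
types = [entry["Type"][0] for entry in obj["Test"][0]["entry"]]
print(types)  # ['Something', 'Something_Else']
```

The trade-off is that lookups get noisier (obj["Test"][0] instead of obj["Test"]), but no key names are invented and nothing is lost.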


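Edit: Since you said you would like to modify the duplicate key to make it unique, here is how you would do that: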

    from collections import OrderedDict
    from json import JSONDecoder


    def make_unique(key, dct):
        # Append _1, _2, ... to the key until it no longer collides
        # with a key already present in dct.
        counter = 0
        unique_key = key

        while unique_key in dct:
            counter += 1
            unique_key = '{}_{}'.format(key, counter)
        return unique_key


    def parse_object_pairs(pairs):
        # Build an ordered dict, renaming any key that was already seen.
        dct = OrderedDict()
        for key, value in pairs:
            if key in dct:
                key = make_unique(key, dct)
            dct[key] = value

        return dct


    data = """
    {"foo": {"baz": 42, "baz": 77}, "foo": 7, "foo": 23}
    """

    decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)
    obj = decoder.decode(data)
    print(obj)

Output (Python 2):

    OrderedDict([(u'foo', OrderedDict([(u'baz', 42), ('baz_1', 77)])), ('foo_1', 7), ('foo_2', 23)])

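The make_unique function is responsible for returning a collision-free key. In this example it just suffixes the key with _n, where n is an incremental counter; adapt it to your needs.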

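Because object_pairs_hook receives the pairs in exactly the order they appear in the JSON document, it is also possible to preserve that order by using an OrderedDict, so I included that as well.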
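
The question uses json.load on a file, and the hook can be passed to json.load or json.loads directly as the object_pairs_hook keyword argument, without building a JSONDecoder by hand. Below is a small sketch under a few assumptions: rename_duplicates is a hypothetical condensed version of the renaming hook, and io.StringIO stands in for the real file, whose path is not given in the question.

```python
import io
import json
from collections import OrderedDict


def rename_duplicates(pairs):
    # Hypothetical condensed variant of the renaming idea:
    # a repeated key gets the suffix _1, _2, ... until it is unique.
    result = OrderedDict()
    for key, value in pairs:
        candidate, n = key, 0
        while candidate in result:
            n += 1
            candidate = '{}_{}'.format(key, n)
        result[candidate] = value
    return result


# io.StringIO is a stand-in for an opened JSON file.
fp = io.StringIO(
    '{"Test": {"entry": {"Type": "Something"}, "entry": {"Type": "Something_Else"}}}'
)
obj = json.load(fp, object_pairs_hook=rename_duplicates)

keys = list(obj["Test"].keys())
print(keys)  # ['entry', 'entry_1']
```

Document order is preserved because the pairs arrive in the order they were parsed and OrderedDict keeps insertion order.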

+14

Thanks a lot @Lukas Graf, I got it working as well by implementing my own version of the hook function:

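    import collections


    def dict_raise_on_duplicates(ordered_pairs):
        # Despite its name, this hook does not raise: a duplicate key is
        # stored under key + '_dupl_' + a running counter instead.
        count = 0
        d = collections.OrderedDict()
        for k, v in ordered_pairs:
            if k in d:
                d[k + '_dupl_' + str(count)] = v
                count += 1
            else:
                d[k] = v
        return d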

The only thing left is the brackets problem, and I am sure a regex for that will just kill all elegance :D
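
For the second problem in the question (an extra pair of braces directly inside an object), a regex is not strictly required. Here is a rough sketch of a pre-processing pass, assuming the only defect is a '{' that directly follows another '{'. Valid JSON never contains that sequence outside of strings, so well-formed input should pass through unchanged. The function name strip_anonymous_braces is made up for illustration, and it has only been checked against the sample from the question.

```python
import json


def strip_anonymous_braces(text):
    # Drop every '{' that directly follows another '{' (ignoring whitespace),
    # together with its matching '}'. Braces inside strings are left alone.
    out = []
    stack = []        # one flag per open brace: True if that brace was dropped
    in_string = False
    escaped = False
    last_sig = None   # last non-whitespace character seen outside strings
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == '\\':
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
            out.append(ch)
            last_sig = ch
        elif ch == '{':
            drop = last_sig == '{'
            stack.append(drop)
            if not drop:
                out.append(ch)
            last_sig = ch
        elif ch == '}':
            drop = stack.pop() if stack else False
            if not drop:
                out.append(ch)
            last_sig = ch
        else:
            out.append(ch)
            if not ch.isspace():
                last_sig = ch
    return ''.join(out)


broken = '{"Test": { {"Type": "Something"} }}'
cleaned = strip_anonymous_braces(broken)
print(json.loads(cleaned))  # {'Test': {'Type': 'Something'}}
```

The cleaned text can then be decoded with whichever object_pairs_hook is used for the duplicate keys.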

0
