Custom Python JSON encoder with pre-computed JSON literal

Question


I have a special object that holds a literal JSON string, which I want to embed as a field in a larger JSON object as the literal value itself (not as a string containing JSON).

I want to write my own encoder that can do this, i.e.

    > encoder.encode({
    >     'a': LiteralJson('{}')
    > })
    {"a": {}}

I don't think that subclassing JSONEncoder and overriding default will work, because at best I can return a string there, which produces the result {"a": "{}"}.

Overriding encode also does not work when the LiteralJson is nested somewhere inside another dictionary.

The background for this, if you're interested, is that I store JSON-encoded values in a cache, and it seems to me a waste to deserialize and then re-serialize all the time. It works that way, but some of these values are fairly long, and it just seems like a huge waste.
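To illustrate the problem, here is a minimal sketch (written for Python 3; the `LiteralJson` class body and the `NaiveEncoder` name are made up for illustration). Whatever `default` returns is serialized again, so a returned string ends up quoted:

```python
import json


class LiteralJson:
    """Hypothetical wrapper holding an already-serialized JSON string."""

    def __init__(self, content):
        self.content = content


class NaiveEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, LiteralJson):
            # The returned str is itself encoded, i.e. quoted as a JSON string.
            return obj.content
        return super().default(obj)


naive_result = json.dumps({'a': LiteralJson('{}')}, cls=NaiveEncoder)
print(naive_result)  # {"a": "{}"}
```

The value comes out as the JSON string "{}" rather than as an empty object.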

The following encoder does what I want (but it seems unnecessarily slow):

    class MagicEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj, LiteralJson):
                return json.loads(obj.content)
            else:
                return json.JSONEncoder.default(self, obj)


json python

Kevin dolan 12 sept '12 at 23:09


2 answers

Bruno · Answer 1 · 2012-11-03T03:59:14+0000

I just realized that I recently had a similar question. The answer suggested using a replacement token.

You can integrate this logic more or less transparently with a custom JSONEncoder that generates these tokens internally using a random UUID. (What I called "RawJavaScriptText" is equivalent to your "LiteralJson".)

You can then use json.dumps(testvar, cls=RawJsJSONEncoder) directly.

    import json
    import uuid

    class RawJavaScriptText:
        def __init__(self, jstext):
            self._jstext = jstext

        def get_jstext(self):
            return self._jstext

    class RawJsJSONEncoder(json.JSONEncoder):
        def __init__(self, *args, **kwargs):
            json.JSONEncoder.__init__(self, *args, **kwargs)
            self._replacement_map = {}

        def default(self, o):
            if isinstance(o, RawJavaScriptText):
                key = uuid.uuid4().hex
                self._replacement_map[key] = o.get_jstext()
                return key
            else:
                return json.JSONEncoder.default(self, o)

        def encode(self, o):
            result = json.JSONEncoder.encode(self, o)
            for k, v in self._replacement_map.iteritems():
                result = result.replace('"%s"' % (k,), v)
            return result

    testvar = {
        'a': 1,
        'b': 'abc',
        'c': RawJavaScriptText('{ "x": [ 1, 2, 3 ] }')
    }

    print json.dumps(testvar, cls=RawJsJSONEncoder)

Result (using Python 2.6 and 2.7):

 {"a": 1, "c": { "x": [ 1, 2, 3 ] }, "b": "abc"}
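For readers on Python 3, the same token-replacement idea might look roughly like the sketch below. It is a port of the code above, not a tested library: it assumes Python 3.7+ (so dict order is preserved in the output) and swaps `iteritems()` and the `print` statement for their Python 3 equivalents.

```python
import json
import uuid


class RawJavaScriptText:
    """Wraps text that should be emitted verbatim into the JSON output."""

    def __init__(self, jstext):
        self._jstext = jstext

    def get_jstext(self):
        return self._jstext


class RawJsJSONEncoder(json.JSONEncoder):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Maps each random token to the raw text it stands in for.
        self._replacement_map = {}

    def default(self, o):
        if isinstance(o, RawJavaScriptText):
            key = uuid.uuid4().hex
            self._replacement_map[key] = o.get_jstext()
            return key  # encoded as a quoted string for now
        return super().default(o)

    def encode(self, o):
        result = super().encode(o)
        # Swap every quoted token for its raw text.
        for key, raw in self._replacement_map.items():
            result = result.replace('"%s"' % key, raw)
        return result


testvar = {
    'a': 1,
    'b': 'abc',
    'c': RawJavaScriptText('{ "x": [ 1, 2, 3 ] }'),
}

encoded = json.dumps(testvar, cls=RawJsJSONEncoder)
print(encoded)  # {"a": 1, "b": "abc", "c": { "x": [ 1, 2, 3 ] }}
```

Because `default` is called for unsupported objects at any depth, this also handles a RawJavaScriptText nested inside other dictionaries or lists. Note that the raw text is inserted without validation, so the output is only valid JSON if the wrapped text is.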

kibibu · Answer 2 · 2014-02-12T05:04:20+0000

"It seems to me a waste to deserialize and then re-serialize all the time."

It is wasteful, but in case someone is just looking for a quick fix, this approach works fine.

Adapting Bruno's example:

    testvar = {
        'a': 1,
        'b': 'abc',
        'c': json.loads('{ "x": [ 1, 2, 3 ] }')
    }

    print json.dumps(testvar)

Result:

 {"a": 1, "c": {"x": [1, 2, 3]}, "b": "abc"}
