Why is json.loads an order of magnitude faster than ast.literal_eval?

After answering the question about how to parse a text file containing arrays of floats, I ran the following test:

    import timeit
    import random

    # One line of input: the repr of a list of 1000 random floats.
    line = [random.random() for x in range(1000)]
    n = 10000

    json_setup = 'line = "{}"; import json'.format(line)
    json_work = 'json.loads(line)'
    json_time = timeit.timeit(json_work, json_setup, number=n)
    print("json: {}".format(json_time))

    ast_setup = 'line = "{}"; import ast'.format(line)
    ast_work = 'ast.literal_eval(line)'
    ast_time = timeit.timeit(ast_work, ast_setup, number=n)
    print("ast: {}".format(ast_time))

    print("time ratio ast/json: {}".format(ast_time / json_time))

I ran this code several times and consistently received the following results:

    $ python json-ast-bench.py
    json: 4.3199338913
    ast: 28.4827561378
    time ratio ast/json: 6.59333148483

So it turns out that json is almost an order of magnitude faster than ast for this use case.

I got the same results with both Python 2.7.5+ and Python 3.3.2+.

Questions:

  • Why is json.loads so much faster? This question seems to imply that ast is more flexible about its input (double or single quotes).
  • Are there any cases where I would prefer to use ast.literal_eval over json.loads, even though it is slower?

Edit: In any case, if performance matters, I would recommend using UltraJSON (it is what I use at work, and it is about 4 times faster than json on the same mini-test).

1 answer

The two functions parse entirely different languages: JSON and Python literal syntax.* As the literal_eval documentation says:

The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.

JSON, by contrast, handles only double-quoted JavaScript string literals (not quite identical to Python's**), JavaScript numbers (only int and float***), objects (roughly equivalent to dicts), arrays (roughly equivalent to lists), JavaScript booleans (which are spelled differently from Python's), and null.

The fact that these two languages happen to have some overlap does not mean that they are the same language.
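To make the overlap concrete, here is a small sketch (the sample strings are made up for illustration): one input that both parsers accept, one that is only valid JSON, and one that is only a valid Python literal.

```python
import ast
import json

# An input in the overlap of both languages parses to the same value.
shared = "[1.5, 2.0, 3.25]"
same_result = json.loads(shared) == ast.literal_eval(shared)

# Valid JSON, but not a Python literal: to Python, true and null are bare names.
json_only = '{"ok": true, "value": null}'
parsed = json.loads(json_only)
try:
    ast.literal_eval(json_only)
    literal_eval_accepted = True
except ValueError:
    literal_eval_accepted = False

# Valid Python literal, but not JSON: single quotes and a tuple.
python_only = "{'point': (1, 2)}"
value = ast.literal_eval(python_only)
try:
    json.loads(python_only)
    json_accepted = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    json_accepted = False
```

Each parser rejects the other language's input with a ValueError rather than guessing at it.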


Why is json.loads so much faster?

Since Python literal syntax is a more complex and powerful language than JSON, it is likely to be slower to parse. And, perhaps more importantly, since Python literal syntax is not intended to be used as a data interchange format (in fact, it is specifically not supposed to be used for that), nobody is likely to put much effort into making it fast for data interchange.****

This question seems to imply that ast is more flexible about its input (double or single quotes)

Yes, and also raw string literals, Unicode and bytes string literals, complex numbers, sets, and all kinds of other things that JSON does not handle.
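As a rough illustration of that extra flexibility, the sketch below feeds a handful of made-up sample strings to both parsers and records what ast.literal_eval returns and whether json.loads accepts the same text.

```python
import ast
import json

samples = [
    r"r'C:\temp'",  # raw string literal
    "b'bytes'",     # bytes literal
    "1+2j",         # complex number
    "{1, 2, 3}",    # set
    "(1, 2)",       # tuple
]

results = {}
for text in samples:
    value = ast.literal_eval(text)  # every sample is a valid Python literal
    try:
        json.loads(text)
        json_ok = True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        json_ok = False
    results[text] = (value, json_ok)
```

Every sample evaluates under ast.literal_eval, and json.loads rejects all of them.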

Are there any cases where I would prefer to use ast.literal_eval over json.loads, even though it is slower?

Yes. When you want to parse Python literals, you should use ast.literal_eval. (Or, better yet, rethink your design so you don't need to parse Python literals...)


* This is a slightly vague term. For example, -2 is not a literal in Python but an operator expression, yet literal_eval can handle it. And, of course, tuple/list/dict/set displays are not literals either, but literal_eval can handle them, except that comprehensions are also displays, and literal_eval cannot handle those. Other functions in the ast module can help you find out what really is and is not a literal, for example ast.dump(ast.parse("expr")).
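A minimal sketch of that inspection, assuming a recent Python 3 (the exact text of the dump varies between versions, and older interpreters may fold the minus sign into the number):

```python
import ast

# -2 parses as a unary operator applied to the constant 2, not as a literal.
tree = ast.parse("-2", mode="eval")
dump = ast.dump(tree)

# literal_eval accepts it anyway.
minus_two = ast.literal_eval("-2")

# A list display is accepted...
display_value = ast.literal_eval("[0, 1, 4]")

# ...but a list comprehension is rejected.
try:
    ast.literal_eval("[x * x for x in range(3)]")
    comprehension_ok = True
except ValueError:
    comprehension_ok = False
```

Printing dump shows a UnaryOp node wrapping the number, which is why -2 is strictly an expression rather than a literal.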

** For example, "\q" is an error in JSON.
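A quick check of that difference, as a sketch: current Python versions accept the unrecognized escape (with a warning, which is silenced here), although the language documentation says it is expected to become a syntax error in a future version.

```python
import ast
import json
import warnings

text = r'"\q"'  # the four characters: quote, backslash, q, quote

# JSON only allows a fixed set of escape sequences, so this is rejected.
try:
    json.loads(text)
    json_ok = True
except ValueError:
    json_ok = False

# Python keeps the backslash for an unrecognized escape and warns about it.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    py_value = ast.literal_eval(text)
```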

*** Technically, JSON has only one "number" type, which is floating point. But Python's json module parses numbers without a decimal point or exponent as integers, and the same is true for many JSON modules in other languages.

**** In case you missed Tim Peters' comment on the question: "ast.literal_eval is so lightly used that nobody felt it was worth the time to work (and work, and work) at speeding it. In contrast, the JSON libraries are routinely used to parse gigabytes of data."
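The integer/float split in Python's json module can be seen directly; the variable names below are only for illustration.

```python
import json

# No decimal point or exponent: parsed as int.
whole = json.loads("1")

# A decimal point or an exponent: parsed as float.
fractional = json.loads("1.0")
exponent = json.loads("1e3")

types = (type(whole), type(fractional), type(exponent))
```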
