Transpose dictionary (extract all values for one key from a list of dictionaries)

I have a list of dictionaries like this:

data = [{'x': 1, 'y': 10}, {'x': 3, 'y': 15}, {'x': 2, 'y': 1}, ... ] 

I have a function (e.g. matplotlib.axis.plot) that needs separate lists of x and y values, so I have to "transpose" the list of dictionaries.

First question: what do you call this operation? Is "transpose" the correct term?
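In other words, the goal is something like the following sketch (Python 3 syntax; `xs` and `ys` are illustrative names, and only the three dictionaries shown above are used, with the elided part left out):

```python
# Sample input: the three entries shown above, without the elided "..." part.
data = [{'x': 1, 'y': 10}, {'x': 3, 'y': 15}, {'x': 2, 'y': 1}]

# One list per key, in the same order as the dictionaries appear in `data`.
xs = [d['x'] for d in data]
ys = [d['y'] for d in data]

print(xs)  # [1, 3, 2]
print(ys)  # [10, 15, 1]
```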

I tried the following, but I'm looking for a more efficient way (maybe there is a special numpy function):

    # Python 2
    import timeit
    from operator import itemgetter

    x = range(100)
    y = reversed(range(100))
    d = [dict((('x', xx), ('y', yy))) for (xx, yy) in zip(x, y)]
    # d is [{'y': 99, 'x': 0}, {'y': 98, 'x': 1}, ... ]

    timeit.Timer("[dd['x'] for dd in d]", "from __main__ import d").timeit()
    # 6.803985118865967

    timeit.Timer("map(itemgetter('x'), d)", "from __main__ import d, itemgetter").timeit()
    # 7.322326898574829

    timeit.Timer("map(f, d)", "from __main__ import d, itemgetter; f=itemgetter('x')").timeit()
    # 7.098556041717529

    # quite dangerous: relies on the order in which the dict returns its values
    timeit.Timer("[dd.values()[1] for dd in d]", "from __main__ import d").timeit()
    # 19.358459949493408

Is there a better solution? My concern is that in these cases the hash of the string 'x' is recomputed every time.

+7
3 answers

Borrowing the benchmark layout from this answer:

    import timeit
    from operator import itemgetter
    from itertools import imap

    x = range(100)
    y = reversed(range(100))
    d = [dict((('x', xx), ('y', yy))) for (xx, yy) in zip(x, y)]
    # d is [{'y': 99, 'x': 0}, {'y': 98, 'x': 1}, ... ]

    D = {x: y for x, y in zip(range(10), reversed(range(10)))}

    def test_list_comp(d):
        return [dd['x'] for dd in d]

    def test_list_comp_v2(d):
        return [(x["x"], x["y"]) for x in d]

    def testD_keys_values(d):
        return d.keys()

    def test_map(d):
        return map(itemgetter('x'), d)

    def test_positional(d):
        return [dd.values()[1] for dd in d]

    def test_lambda(d):
        return list(imap(lambda x: x['x'], d))

    def test_imap_iter(d):
        return list(imap(itemgetter('x'), d))

    for test in sorted(globals()):
        if test.startswith("test_"):
            print "%30s : %s" % (test, timeit.Timer("f(d)", "from __main__ import %s as f, d" % test).timeit())

    for test in sorted(globals()):
        if test.startswith("testD_"):
            print "%30s : %s" % (test, timeit.Timer("f(D)", "from __main__ import %s as f, D" % test).timeit())

It gives the following results:

       test_imap_iter : 8.98246016151
          test_lambda : 15.028239837
       test_list_comp : 5.53205787458
    test_list_comp_v2 : 12.1928668102
             test_map : 6.38402269826
      test_positional : 20.2046790578
    testD_keys_values : 0.305969839705

Obviously, the biggest win comes from getting your data in a format closer to what you need in the first place, but you may not have control over that.
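If the data does arrive as a list of dictionaries, one option is to convert it once and reuse the result. A minimal sketch in Python 3, assuming every dictionary has the same keys; `to_columns` is a hypothetical helper name, not a library function, and the sample data is made up:

```python
def to_columns(records):
    """Turn a list of dicts into a dict of lists (one list per key)."""
    if not records:
        return {}
    # Assumes all records share the keys of the first one;
    # a record that lacks one of those keys raises KeyError.
    return {key: [rec[key] for rec in records] for key in records[0]}

d = [{'x': 0, 'y': 99}, {'x': 1, 'y': 98}, {'x': 2, 'y': 97}]
columns = to_columns(d)

print(columns['x'])  # [0, 1, 2]
print(columns['y'])  # [99, 98, 97]
```

After the one-time conversion, each per-key lookup is a single dictionary access instead of a pass over the whole list.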

As for the name, I would call it a transformation.

+1

If you just need to iterate over the values, you can consider this method:

    from itertools import imap  # Python 2 only

    imap(lambda x: x['x'], d)
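`itertools.imap` exists only in Python 2. In Python 3 the built-in `map` is already lazy, and a generator expression does the same job without a lambda. A small sketch with made-up sample data:

```python
from operator import itemgetter

d = [{'x': 0, 'y': 99}, {'x': 1, 'y': 98}, {'x': 2, 'y': 97}]

# Both objects are lazy: values are produced one at a time during iteration.
xs_map = map(itemgetter('x'), d)
xs_gen = (dd['x'] for dd in d)

total = sum(xs_map)     # consumes the map iterator
as_list = list(xs_gen)  # materialize only if a list is really needed

print(total)    # 3
print(as_list)  # [0, 1, 2]
```

Either iterator can be consumed only once; build a list if the values are needed more than once.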
0

Why not something like this?

 [(x["x"], x["y"]) for x in d] 

This returns a list of (x, y) tuples. I'm not sure about its speed, but it avoids the overhead of the lambda.
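A plotting call that wants separate x and y sequences can get them by unzipping the pairs with `zip(*...)`. A sketch in Python 3 with illustrative sample data; note that the results are tuples rather than lists, and that unpacking an empty list this way raises `ValueError`:

```python
d = [{'x': 1, 'y': 10}, {'x': 3, 'y': 15}, {'x': 2, 'y': 1}]

# Build the (x, y) pairs, then split them back into two sequences.
pairs = [(dd['x'], dd['y']) for dd in d]
xs, ys = zip(*pairs)

print(xs)  # (1, 3, 2)
print(ys)  # (10, 15, 1)
```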

0