In Python, use dict() with keywords or anonymous dictionaries?

Say that you want to pass a dictionary of values to a function, or you want to work with a short-lived dictionary that will not be reused. There are two easy ways to do this:

Use the dict() function to create a dictionary:

 foo.update(dict(bar=42, baz='qux')) 

Use an anonymous dictionary:

 foo.update({'bar': 42, 'baz': 'qux'}) 

Which do you prefer? Are there reasons besides personal style for choosing one over the other?

+6
python
9 answers

I prefer the anonymous dict option.

I don't like the dict() option for the same reason that I don't like:

  i = int("1") 

With the dict() option, you call a function, which adds overhead that you don't need:

 >>> from timeit import Timer
 >>> Timer("mydict = {'a' : 1, 'b' : 2, 'c' : 'three'}").timeit()
 0.91826782454194589
 >>> Timer("mydict = dict(a=1, b=2, c='three')").timeit()
 1.9494664824719337
+15

I think that in this particular case, I would prefer this:

 foo.update(bar=42, baz='qux') 

In the more general case, I usually prefer the literal syntax (what you call an anonymous dictionary, although using {} is no more anonymous than using dict()). I think it speaks more clearly to the maintenance programmer (often me), in part because it stands out so nicely in text editors with syntax highlighting. It also guarantees that when I have to add a key that cannot be written as a Python name, for example something with spaces, I do not need to rewrite the entire line.
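As a quick check of the point above, the three spellings build the same mapping. This is only a sketch: plain dicts stand in for `foo`, and the `'display name'` key is a made-up example of a key that is not a valid Python name.

```python
# Three ways to pass the same short-lived mapping to dict.update().
via_call = {}
via_call.update(dict(bar=42, baz='qux'))

via_literal = {}
via_literal.update({'bar': 42, 'baz': 'qux'})

via_keywords = {}
via_keywords.update(bar=42, baz='qux')

# All three targets now hold the same contents.
same = via_call == via_literal == via_keywords

# A key with a space only fits the literal form without further tricks;
# the existing line does not have to be rewritten, just extended.
via_literal.update({'bar': 42, 'baz': 'qux', 'display name': 'Qux'})
```

With the keyword forms, adding `'display name'` would mean switching the whole call over to a literal.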

+6

My answer will mainly talk about API design for using dicts vs. keyword args. But it also applies to individual uses of {...} vs. dict(...).

The bottom line: be consistent. If most of your code refers to 'bar' as a string, keep it a string with {...}; if you usually refer to it as the identifier bar, use dict(bar=...).

Limitations

Before talking about style, note that the bar=42 keyword syntax works only for strings, and only if they are valid identifiers. If you need arbitrary punctuation, spaces, unicode - or even non-string keys - the question is settled: only the {'bar': 42} syntax will work.

It also means that when developing an API, you must allow full dicts, not just keyword arguments - unless you are sure that only strings and only valid identifiers are allowed. (Technically, update(**{'spaces & punctuation': 42}) works, but it's ugly, and numbers / tuples / unicode won't work.)
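A small sketch of these limitations, using only built-ins. The keys are made-up examples, and the `**` behaviour shown is what CPython's `dict()` does.

```python
# Keys that are not identifiers, or not strings at all, are fine in a literal.
literal = {'spaces & punctuation': 42, 7: 'seven', ('x', 'y'): 'tuple key'}

# The keyword form cannot even be parsed when the name is not an identifier.
try:
    compile("dict(spaces & punctuation=42)", "<demo>", "eval")
    keyword_form_parses = True
except SyntaxError:
    keyword_form_parses = False

# Unpacking with ** gets arbitrary string keys through...
unpacked = dict(**{'spaces & punctuation': 42})

# ...but non-string keys are rejected when the call is made.
try:
    dict(**{7: 'seven'})
    non_string_keys_accepted = True
except TypeError:
    non_string_keys_accepted = False
```

So an API that takes only keyword arguments silently rules out every key in `literal` except those that happen to be identifier-like strings.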

Note that dict() and dict.update() combine both APIs: you can pass a single dict, you can pass keyword args, and you can even pass both (the latter, I think, is undocumented). Therefore, if you want to be nice, allow both:

 def update(self, *args, **kwargs):
     """Callable as dict() - with either a mapping or keyword args:

     .update(mapping)
     .update(**kwargs)
     """
     mapping = dict(*args, **kwargs)
     # do something with `mapping`...
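To make the pattern concrete, here is a minimal runnable sketch. The `Settings` class, its `_values` attribute and its `as_dict()` helper are hypothetical names invented for the illustration; only the `dict(*args, **kwargs)` trick comes from the answer.

```python
class Settings:
    """Hypothetical container whose update() mirrors dict.update()."""

    def __init__(self):
        self._values = {}

    def update(self, *args, **kwargs):
        # dict() does the argument checking: at most one positional
        # mapping (or iterable of pairs), plus optional keyword args.
        mapping = dict(*args, **kwargs)
        self._values.update(mapping)

    def as_dict(self):
        # Return a copy so callers cannot mutate internal state.
        return dict(self._values)


settings = Settings()
settings.update({'bar': 42})              # a mapping
settings.update(baz='qux')                # keyword args
settings.update({'spaces ok': 1}, bar=0)  # both in one call
```

Delegating to `dict()` also means a call with two positional arguments fails with the same `TypeError` a caller would get from `dict.update()`.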

This is especially recommended for a method named .update(), to follow the principle of least surprise.

Style

I find it useful to distinguish between internal and external strings. By internal, I mean arbitrary identifiers that mean something only inside the program (variable names, object attributes) or possibly between several programs (DB columns, XML attribute names). They are usually visible only to developers. External strings are meant for human consumption.

[Some Python coders (including me) follow the convention of using 'single_quotes' for internal strings vs. "Double quotes" for external strings. This is definitely not universal.]

Your question is about the proper use of barewords (the Perl term) - syntactic sugar that lets you omit the quotation marks on internal strings altogether. Some languages (especially LISP) allow them widely; Python's opportunities to use barewords are attribute access - foo.bar - and keyword arguments - update(bar=...).

The stylistic dilemma here is "Are your strings internal enough to look like identifiers?"

If the keys are external strings, the answer is definitely NO:

 foo.update({"The answer to the big question": 42})
 # which you later might access as:
 foo["The answer to the big question"]

If the keys refer to Python identifiers (e.g. object attributes), I would say YES:

 foo.update(dict(bar=42))
 # As others mentioned, in that case the cleaner API (if possible)
 # would be to receive them as **kwargs directly:
 foo.update(bar=42)
 # which you later might access as:
 foo.bar

If the keys refer to identifiers outside of your Python program, such as XML attribute names or database column names, using barewords may be a good or a bad choice - but you had better choose one style and be consistent.

Consistency is good because there is a psychological barrier between identifiers and strings. It exists because strings rarely cross it - only when using introspection for metaprogramming. And syntax highlighting only reinforces it. So if you read the code and see a green 'bar' in one place and a black foo.bar in another, you won't immediately make the connection.

Another important rule: barewords are good if they are (mostly) fixed. For example, if you refer to fixed DB columns mostly in your code, then using identifiers to refer to them might be nice; but if half the time the column is a parameter, it is better to use strings.

This is because parameter vs. constant is the most important distinction people associate with the identifier/string barrier. The difference between column (a variable) and "person" (a constant) is the most readable way to convey it. Making both identifiers would blur the distinction and would also backfire syntactically: you would need **{column: value}, getattr(obj, column) and so on, a lot.
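The contrast can be sketched as follows. The `describe()` function, the `row` object and the column names are all hypothetical, chosen only to show what happens once the key lives in a variable.

```python
from types import SimpleNamespace


def describe(**fields):
    """Hypothetical keyword-only API: render name=value pairs."""
    return ", ".join(f"{name}={value!r}" for name, value in sorted(fields.items()))


row = SimpleNamespace(person="Ada", city="London")

# Fixed key: the bareword reads naturally in both places.
fixed_call = describe(person="Ada")
fixed_value = row.person

# Key held in a variable: both spots need a workaround.
column = "person"
dynamic_call = describe(**{column: "Ada"})
dynamic_value = getattr(row, column)
```

If `column` is a parameter in half of the call sites, an API that takes a plain dict (and a row accessed as `row[column]`) avoids sprinkling `**{...}` and `getattr()` through the code.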

+4

I prefer your "anonymous dictionary" method, and I think this is a purely personal matter. I just find the latter version more readable, but it's also what I'm used to seeing.

+2

The dict() approach has the additional overhead of a function call.

 >>> import timeit, dis
 >>> timeit.Timer("{'bar': 42, 'baz': 'qux'}").repeat()
 [0.59602910425766709, 0.60173793037941437, 0.59139834811408321]
 >>> timeit.Timer("dict(bar=42, baz='qux')").repeat()
 [0.98166498814792646, 0.97745355904172015, 0.99231773870701545]
 >>> dis.dis(compile("{'bar': 42, 'baz': 'qux'}","","exec"))
   1           0 BUILD_MAP                0
               3 DUP_TOP
               4 LOAD_CONST               0 (42)
               7 ROT_TWO
               8 LOAD_CONST               1 ('bar')
              11 STORE_SUBSCR
              12 DUP_TOP
              13 LOAD_CONST               2 ('qux')
              16 ROT_TWO
              17 LOAD_CONST               3 ('baz')
              20 STORE_SUBSCR
              21 POP_TOP
              22 LOAD_CONST               4 (None)
              25 RETURN_VALUE
 >>> dis.dis(compile("dict(bar=42, baz='qux')","","exec"))
   1           0 LOAD_NAME                0 (dict)
               3 LOAD_CONST               0 ('bar')
               6 LOAD_CONST               1 (42)
               9 LOAD_CONST               2 ('baz')
              12 LOAD_CONST               3 ('qux')
              15 CALL_FUNCTION          512
              18 POP_TOP
              19 LOAD_CONST               4 (None)
              22 RETURN_VALUE
+2

I also prefer the anonymous dictionary, purely as a matter of personal style.

+1

If I have many arguments, sometimes it’s nice to omit the quotation marks on the keys:

 DoSomething(dict(
     Name = 'Joe',
     Age = 20,
     Gender = 'Male',
 ))

This is a very subjective question, BTW. :)

+1

I think the dict() function is really there for when you are creating a dict from something else, perhaps something that easily produces the necessary keyword args. The anonymous method is best for "dict literals", in the same way that you write string literals in quotes rather than calling str().
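A brief sketch of that division of labour, with made-up sample data: dict() converts data that already exists in another shape, while the literal spells out keys written by hand.

```python
# Building a dict from data that already exists in another shape.
names = ['bar', 'baz']
values = [42, 'qux']
from_zip = dict(zip(names, values))

pairs = [('bar', 42), ('baz', 'qux')]
from_pairs = dict(pairs)

# A literal is the natural spelling when the keys are typed out directly.
literal = {'bar': 42, 'baz': 'qux'}
```

All three produce the same mapping; the difference is only in where the keys and values come from.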

+1

Actually, if the receiving function only ever receives a dictionary whose keys are not predefined keywords, I normally use the ** passing convention.

In this example, that would be:

 class Foo(object):
     def update(self, **param_dict):
         for key in param_dict:
             ....

 foo = Foo()
 ....
 foo.update(bar=42, baz='qux')
0
