Can a regular expression be used as a key in a dictionary?

I want to create a dictionary in which keys are regular expressions:

d = {'a.*': some_value1, 'b.*': some_value2} 

Then, when I look in the dictionary:

 d['apple'] 

I want 'apple' to be matched against the keys, which are regular expressions. If the string is a complete match for a key / regular expression, the corresponding value should be returned.

For example, 'apple' matches the regular expression 'a.*', so some_value1 is returned.

Of course, all this assumes that the regex keys do not conflict (i.e. no two keys should match the same string). Say I can take care of this manually when creating my keys.

Is this possible in Python? If so, it will be a pretty elegant and powerful design!

+7
python dictionary regex
3 answers

Python dictionaries are implemented as hash tables, which means that any lookup mydict[myvalue] is very fast because it works by hashing myvalue internally. Using regular expressions as keys defeats this advantage, since a hash lookup cannot tell which pattern a string matches. Instead of a dictionary, use a simple list or tuple in which each element is a tuple of the form (pattern or compiled regular expression, value), and scan it until a regular expression matches. This also lets you control the order in which the regular expressions are tried (from specific to general, for example):

    import re

    LOOKUPS = [
        ('a.*', 'a'),
        ('b.*', 'b'),
    ]

    def lookup(s, lookups):
        for pattern, value in lookups:
            if re.search(pattern, s):
                return value
        return None

    print(lookup("apple", LOOKUPS))
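As a variation on the same idea, here is a minimal sketch that compiles each pattern once and requires the whole string to match, which is what the question asks for. The names COMPILED_LOOKUPS and lookup_full and the 'apple.*' entry are made up for illustration; re.fullmatch is a substitution for the re.search used above, not part of the original answer.

```python
import re

# Hypothetical lookup table, ordered from most specific to most general.
# Each pattern is compiled once, up front.
COMPILED_LOOKUPS = [
    (re.compile(r'apple.*'), 'specific'),
    (re.compile(r'a.*'), 'general a'),
    (re.compile(r'b.*'), 'general b'),
]


def lookup_full(s, lookups, default=None):
    """Return the value of the first pattern that matches all of s."""
    for pattern, value in lookups:
        if pattern.fullmatch(s):
            return value
    return default


print(lookup_full('applesauce', COMPILED_LOOKUPS))  # specific
print(lookup_full('avocado', COMPILED_LOOKUPS))     # general a
print(lookup_full('cherry', COMPILED_LOOKUPS))      # None
```

Because the first match wins, putting 'apple.*' ahead of 'a.*' is what makes the more specific entry take priority.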

See also Django's URL resolver for a (very) advanced implementation of your idea.

+5

You can use a pattern object returned by re.compile as a dictionary key:

    >>> import re
    >>> regex = re.compile('a.*')
    >>> d = {regex: 'foo'}
    >>> d[re.compile('a.*')]
    'foo'

Note that recompiling the same regular expression gives you an equal key (in CPython it is in fact the same object, because the re module caches compiled patterns: re.compile('a.*') is list(d.keys())[0]), so you can retrieve whatever you stored against it.

But:

  • As stated in the comments, multiple regular expressions can match the same string;
  • Dictionaries are not ordered before Python 3.7, so a different regular expression may match first each time you run the program; and
  • There is no O(1) way to query the dictionary {regex: result, ...} for the result value given a string that may match one or more regex keys.

So it is hard to see what use you would find for this.


If you can come up with a way to make sure that no two keys can match the same string, you could create a subclass of MutableMapping that applies this check when new keys are added and implements __getitem__ to scan through the key-value pairs and return the first value whose key regular expression matches the argument. Again, this will be O(n).
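A minimal sketch of such a mapping might look like the following. The class name RegexDict is hypothetical. Since checking at insertion time that two arbitrary patterns can never overlap is the hard part, this sketch does not attempt it; as a stand-in, it raises KeyError at lookup time when more than one pattern matches. It also uses re.fullmatch, on the assumption that a complete match is wanted.

```python
import re
from collections.abc import MutableMapping


class RegexDict(MutableMapping):
    """Hypothetical mapping whose keys are regex pattern strings."""

    def __init__(self):
        # pattern string -> (compiled pattern, value)
        self._items = {}

    def __setitem__(self, pattern, value):
        self._items[pattern] = (re.compile(pattern), value)

    def __delitem__(self, pattern):
        del self._items[pattern]

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, s):
        # O(n) scan: collect the value of every pattern matching all of s.
        hits = [value for regex, value in self._items.values()
                if regex.fullmatch(s)]
        if not hits:
            raise KeyError(s)
        if len(hits) > 1:
            # Stand-in for the insertion-time conflict check.
            raise KeyError(f'{s!r} matches more than one pattern')
        return hits[0]


d = RegexDict()
d['a.*'] = 'some_value1'
d['b.*'] = 'some_value2'
print(d['apple'])     # some_value1
print('cherry' in d)  # False
```

One caveat with this design: the items() and values() methods inherited from MutableMapping go through __getitem__, so they would treat the stored pattern strings as strings to look up. A fuller version would need to override them.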

+3

Of course. Just iterate over the keys as usual and check each one for a match.

    import re

    def find_matches(d, item):
        for k in d:
            if re.match(k, item):
                return d[k]

    d = {'a.*': 'a match', 'b.*': 'b match'}

    for item in ['apple', 'beer']:
        print(find_matches(d, item))

Result:

    a match
    b match

Note that re.match only produces a match if the expression is found at the beginning of the string. Use re.search if it is acceptable for the expression to match anywhere in the string.
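To make the difference concrete, here is a short comparison. It also shows re.fullmatch (available since Python 3.4), which this answer does not mention but which may be closer to the "complete match" the question asks for. The sample strings are arbitrary.

```python
import re

# re.match anchors at the start of the string only.
print(bool(re.match('a.*', 'apple')))    # True
print(bool(re.match('a.*', 'banana')))   # False: 'banana' does not start with 'a'

# re.search accepts a match anywhere in the string.
print(bool(re.search('a.*', 'banana')))  # True: matches from the first 'a'

# re.match allows trailing unmatched text; re.fullmatch does not.
print(bool(re.match('ap', 'apple')))      # True
print(bool(re.fullmatch('ap', 'apple')))  # False
```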

+1