How to replace a regular expression match with its lowercase form in Python

I want to find keywords (the keys will be dynamic) and replace them with a specific format. For example, this data:

 keys = ["cat", "dog", "mouse"]
 text = "Cat dog cat cloud miracle DOG MouSE"

has to be converted to:

 converted_text = "[Cat](cat) [dog](dog) [cat](cat) cloud miracle [DOG](dog) [MouSE](mouse)" 

Here is my code:

 keys = "cat|dog|mouse"
 p = re.compile(u'\\b(?iu)(?P<name>(%s))\\b' % keys)
 converted_text = re.sub(p, '[\g<name>](\g<name>)', text)

This works, except that I cannot convert the second occurrence of the match (the part in parentheses) to lowercase. The code above produces:

 converted_text = "[Cat](Cat) [dog](dog) [cat](cat) cloud miracle [DOG](DOG) [MouSE](MouSE)"

How can I convert the part in parentheses to lowercase? It seems that Python does not support the \L escape in the replacement string.

3 answers

You can pass a function as the replacement:

 import re

 # Escape each key so it is matched literally, then join them into one alternation.
 pattern = re.compile('|'.join(map(re.escape, keys)), re.IGNORECASE)

 def format_term(term):
     return '[%s](%s)' % (term, term.lower())

 converted_text = pattern.sub(lambda m: format_term(m.group(0)), text)
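A sketch of the same idea with word boundaries added, so that a key such as "cat" is not matched inside a longer word like "concatenate". The helper name `link_keywords` is made up for illustration and is not part of the answer above. It assumes the keys are non-empty and begin and end with word characters, since `\b` only behaves as a whole-word anchor in that case.

```python
import re


def link_keywords(text, keys):
    """Wrap each whole-word key found in text as [match](lowercased match)."""
    # Escape each key so regex metacharacters in dynamic keys are literal,
    # and wrap the alternation in \b so only whole words match.
    alternation = "|".join(re.escape(key) for key in keys)
    pattern = re.compile(r"\b(?:%s)\b" % alternation, re.IGNORECASE)
    # re.sub accepts a callable: it receives each match object and returns
    # the replacement string, which is where .lower() can be applied.
    return pattern.sub(
        lambda m: "[%s](%s)" % (m.group(0), m.group(0).lower()), text
    )


keys = ["cat", "dog", "mouse"]
text = "Cat dog cat cloud miracle DOG MouSE concatenate"
converted_text = link_keywords(text, keys)
print(converted_text)
```

The function replacement is what stands in for the missing `\L` escape: the replacement string is built in Python, so any string method can be applied to the match.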

There is no need to use a regular expression:

 >>> keys = ["cat", "dog", "mouse"]
 >>> text = "Cat dog cat cloud miracle DOG MouSE"
 >>> for w in text.split():
 ...     if w.lower() in keys:
 ...         print("[%s](%s)" % (w, w.lower()), end=" ")
 ...     else:
 ...         print(w, end=" ")
 ...
 [Cat](cat) [dog](dog) [cat](cat) cloud miracle [DOG](dog) [MouSE](mouse)

Based on your proposed solution, I assume the keys do not need to stay in a list (I will use a set to speed up lookups). This answer also assumes that all words in the text are separated by whitespace (I will use a single space to join them back). Given that, you can use:

 >>> keys = set(["cat", "dog", "mouse"])
 >>> text = "Cat dog cat cloud miracle DOG MouSE"
 >>> converted = " ".join(("[%s](%s)" % (word, word.lower()) if word.lower() in keys else word) for word in text.split())
 >>> converted
 '[Cat](cat) [dog](dog) [cat](cat) cloud miracle [DOG](dog) [MouSE](mouse)'

Of course, this calls word.lower() twice. You can avoid this (and still use a similar approach) by nesting two list comprehensions (or, in this case, generator expressions):

 >>> converted = " ".join(("[%s](%s)" % (word, lower) if lower in keys else word) for word, lower in ((w, w.lower()) for w in text.split()))
 >>> converted
 '[Cat](cat) [dog](dog) [cat](cat) cloud miracle [DOG](dog) [MouSE](mouse)'
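If the text contains punctuation or irregular spacing, the split-and-join approach no longer applies cleanly, because "Cat," is not in the set and the original spacing is lost. A possible variant, shown only as a sketch, swaps `str.split` for `re.sub` over runs of word characters while keeping the set lookup and the single `lower()` call. The function name `link_word` and the sample text are assumptions made up for this example.

```python
import re

keys = {"cat", "dog", "mouse"}  # a set gives fast membership tests


def link_word(match):
    """Return [word](lowercased word) for keys, the word unchanged otherwise."""
    word = match.group(0)
    lower = word.lower()  # computed once per word
    return "[%s](%s)" % (word, lower) if lower in keys else word


text = "Cat, dog;  cat cloud miracle DOG. MouSE!"
# \w+ matches each run of word characters; everything between the runs
# (spaces, punctuation) is left untouched by re.sub.
converted = re.sub(r"\w+", link_word, text)
print(converted)
```

Because only the matched words are rewritten, the punctuation and the double space in the sample text survive unchanged.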