Replace all occurrences of certain words

Question


Suppose I have the following sentence:

bean likes to sell his beans

and I want to replace all occurrences of certain words with other words. For example, bean should become robert and beans should become cars.

I can't just use str.replace, because in this case it would change beans to roberts:

    >>> "bean likes to sell his beans".replace("bean", "robert")
    'robert likes to sell his roberts'

I need to change only whole words, not occurrences of a word inside another word. I think I can achieve this with regular expressions, but I don't know how to do it correctly.

+8

python python-2.7 regex

Frozenheart Sep 2 '14 at 20:19


4 answers

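If you replace the words one at a time, a word that has already been replaced can be replaced again by a later substitution, and you won't get what you want. To avoid this, do all the replacements in a single pass, using a function or lambda as the replacement: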

    import re

    d = {'bean': 'robert', 'beans': 'cars'}
    str_in = 'bean likes to sell his beans'
    # Look up every whole word in d; words without an entry are left unchanged
    str_out = re.sub(r'\b(\w+)\b', lambda m: d.get(m.group(1), m.group(1)), str_in)

This way, once bean has been replaced by robert, it will not be changed again (even if robert is also a key in your dictionary).

As suggested by georg, I edited this answer to use dict.get(key, default_value). An alternative solution (also suggested by georg):

    str_out = re.sub(r'\b(%s)\b' % '|'.join(d.keys()), lambda m: d.get(m.group(1), m.group(1)), str_in)
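The difference between a single pass and sequential replacement is easiest to see with a mapping whose replacements are themselves keys. The following is a minimal sketch, not part of the original answer: the helper name replace_words and the swap mapping are made up for illustration, and re.escape is added on the assumption that a key might contain regex metacharacters.

```python
import re


def replace_words(text, mapping):
    """Replace whole words in text according to mapping, in a single pass."""
    if not mapping:
        return text
    # Escape each key so that characters such as '.' or '+' are matched literally.
    pattern = r'\b(%s)\b' % '|'.join(re.escape(word) for word in mapping)
    # Each match is looked up exactly once, so a replacement is never replaced again.
    return re.sub(pattern, lambda m: mapping[m.group(1)], text)


sentence = 'bean likes to sell his beans'

# The mapping from the question.
print(replace_words(sentence, {'bean': 'robert', 'beans': 'cars'}))
# robert likes to sell his cars

# A hypothetical mapping that swaps two words: one pass handles it correctly.
swap = {'bean': 'beans', 'beans': 'bean'}
print(replace_words(sentence, swap))
# beans likes to sell his bean

# Applying the same substitutions one after another undoes the first one.
sequential = sentence
for old, new in [('bean', 'beans'), ('beans', 'bean')]:
    sequential = re.sub(r'\b%s\b' % re.escape(old), new, sequential)
print(sequential)
# bean likes to sell his bean
```

The word boundaries force the alternation to match the whole word, so the order of the keys in the pattern does not matter here.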

+3

seb Sep 2 '14 at 20:38


    "bean likes to sell his beans".replace("beans", "cars").replace("bean", "robert")

This replaces all instances of "beans" with "cars" and then all instances of "bean" with "robert". The order matters: "beans" has to be replaced first, otherwise it would become "roberts". Chaining works because .replace() returns a new string rather than modifying the original, so you can think of it in stages. It essentially works as follows:

    >>> first_string = "bean likes to sell his beans"
    >>> second_string = first_string.replace("beans", "cars")
    >>> third_string = second_string.replace("bean", "robert")
    >>> print(first_string, second_string, third_string)
    ('bean likes to sell his beans', 'bean likes to sell his cars', 'robert likes to sell his cars')
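It may help to check how this chain behaves on other input. The sketch below uses a made-up sentence containing beanbag (not from the question) to show that str.replace works on substrings, so replacing the longer word first handles beans but not every word that contains bean.

```python
# Hypothetical input: "beanbag" contains "bean" but is a different word.
sentence = "bean keeps his beans in a beanbag"

# Longest word first, as in the chained call above.
chained = sentence.replace("beans", "cars").replace("bean", "robert")
print(chained)
# robert keeps his cars in a robertbag
```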

-1

Kevin london Sep 2 '14 at 20:22


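I know this question is old, but doesn't this look much more elegant?

    import re
    from functools import reduce  # a builtin on Python 2, needs this import on Python 3

    # Apply each (word, replacement) pair in turn, matching whole words only
    reduce(lambda x, y: re.sub(r'\b(' + y[0] + r')\b', y[1], x),
           [("bean", "robert"), ("beans", "cars")],
           "bean likes to sell his beans")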

-1

Akshay Hazari Nov 03 '15 at 4:08


Alex Riley · Accepted Answer · 2014-09-02T20:24:11+0000

If you use a regular expression, you can specify word boundaries with \b:

    import re

    sentence = 'bean likes to sell his beans'
    sentence = re.sub(r'\bbean\b', 'robert', sentence)
    # 'robert likes to sell his beans'

Here, 'beans' is not changed to 'roberts', because there is no word boundary between the 'n' and the 's' at its end: \b matches the empty string, but only at the beginning or end of a word.

The second replacement, for completeness:

    sentence = re.sub(r'\bbeans\b', 'cars', sentence)
    # 'robert likes to sell his cars'
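To see where \b does and does not match, here is a small sketch that runs the same pattern through re.findall on a few made-up strings (the sample strings are illustrative and not from the question).

```python
import re

pattern = re.compile(r'\bbean\b')

# A word boundary lies between a word character (letter, digit or underscore)
# and a non-word character, or at the start or end of the string.
print(pattern.findall('bean'))             # ['bean']
print(pattern.findall('bean, beans'))      # ['bean']
print(pattern.findall('beans'))            # []
print(pattern.findall('green-bean soup'))  # ['bean']
print(pattern.findall('bean_bag'))         # []
```

Note that the underscore counts as a word character, so bean_bag is treated as a single word.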
