Python.replace () regex

Question

Python.replace () regex

I am trying to grab everything after the </html> tag and remove it, but my code does not seem to do anything. Does.replace () not support regex?

Python

z.write(article.replace('</html>.+', '</html>'))

+73

python regex

user1442957 Jul 13 2018-12-18T00:

source share

4 answers

You can use the re module for regular expressions, but regular expressions are probably overflowing with what you want. I can try something like

 z.write(article[:article.index("</html>") + 7]

This is much cleaner and should be much faster than a regex based solution.

+6

Julian Jul 13 '12 at 19:01

source share

@ Ignacio is right, +1, I’ll just give more examples.

To replace text using a regular expression, use the re.sub function:

sub (pattern, repl, string [, count, flags])

It will replace irrevocable instances of pattern text passed as string . If you need to analyze compliance to extract information about specific group captures, for isntance you can pass a function to the string argument. more details here .

<strong> Examples

 >>> import re >>> re.sub(r'a', 'b', 'banana') 'bbnbnb' >>> re.sub(r'/\d+', '/{id}', '/andre/23/abobora/43435') '/andre/{id}/abobora/{id}'

+4

André Pena Jan 03 '17 at 16:02

source share

In this particular case, if the use of the re module is full, how about using the split (or rsplit ) method like

 se='</html>' z.write(article.split(se)[0]+se)

For example,

 #!/usr/bin/python article='''<html>Larala Ponta Monta </html>Kurimon Waff Moff ''' z=open('out.txt','w') se='</html>' z.write(article.split(se)[0]+se)

prints out.txt as

 <html>Larala Ponta Monta </html>

0

norio Jun 24 '17 at 20:08

source share

Ignacio Vazquez-Abrams · Accepted Answer · 2012-07-13 18:05

No. Regular expressions in Python are handled by the re module.

 article = re.sub(r'(?is)</html>.+', '</html>', article)

Python.replace () regex

More articles: