Using decode() vs. regex to unescape this string

I have the following string, and I'm trying to find the best practice for unescaping it.

The solution needs to be somewhat flexible: I receive this input from an API, and I can't be absolutely sure that the current character structure (\n as opposed to \r) will always be the same.

'"If it ain\'t broke, don\'t fix it." \nWent in for a detailed car wash.\nThe attendants raved-up my engine when taking the car into the tunnel. NOTE: my car is...'

This regular expression looks like it should work:

 text_excerpt = re.sub(r'[\s"\\]', ' ', raw_text_excerpt).strip() 

I've read that decode() might work (and might be the better solution overall).

 raw_text_excerpt.decode('string_unescape') 

I tried something along these lines, and it didn't work. Any suggestions? Is regex better here?

1 answer

The codec you are looking for is string-escape:

 >>> print "\\'".decode("string-escape")
 '

I'm not sure which version it was added in, though... it may be that you're using an older version that doesn't have it. I'm running:

 Python 2.6.6 (r266:84292, Mar 25 2011, 19:36:32) [GCC 4.5.2] on linux2 
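
The snippet above is Python 2 only: in Python 3, str has no decode() method and the string-escape codec no longer exists. As a rough sketch of the same idea on Python 3, the built-in unicode_escape codec can be applied to an encoded copy of the string. The helper names unescape and collapse_whitespace are made up for illustration, and this assumes the API really sends literal backslash sequences (a backslash followed by n, or by a quote) rather than real newlines.

```python
def unescape(raw: str) -> str:
    # Hypothetical helper: interpret backslash escapes in `raw`.
    # Encoding with "backslashreplace" turns any non-ASCII character into an
    # escape sequence, which unicode_escape then decodes back unchanged.
    return raw.encode("ascii", "backslashreplace").decode("unicode_escape")


def collapse_whitespace(text: str) -> str:
    # Hypothetical helper: str.split() with no arguments splits on any run of
    # whitespace, so \n, \r, \t and repeated spaces are all handled alike.
    return " ".join(text.split())


# A raw string literal, so the backslashes are really part of the data
# (assumed to match what the API returns).
raw_text_excerpt = r'"If it ain\'t broke, don\'t fix it." \nWent in for a detailed car wash.'

text_excerpt = collapse_whitespace(unescape(raw_text_excerpt))
print(text_excerpt)
```

Unlike the regex in the question, this keeps the double quotes and only removes backslashes that are part of an escape sequence. Collapsing whitespace afterwards means it should not matter whether the API sends \n or \r.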