Python regex search and replace

I have a log file full of tweets. Each tweet is on a separate line, so I can iterate the file easily.

An example tweet would be like this:

@ sample This is a sample string $ 1.00 # sample

I want to clean this up a bit by removing the space between each special character and the alphanumeric character that follows it: "@ s", "$ 1", "# s".

To make it look like this:

@sample This is a sample string $1.00 #sample

I am trying to use regular expressions to match these instances because the text around them can vary, but I'm not sure how to do this.

I have been using re.sub() and re.search() to find the instances, but I can't figure out how to remove only the whitespace while leaving the rest of the string intact.

Here is the code I have so far:

#!/usr/bin/python

import csv
import re
import sys
import pdb
import urllib

f=open('output.csv', 'w')

with open('retweet.csv', 'rb') as inputfile:
    read=csv.reader(inputfile, delimiter=',')
    for row in read:
        a = row[0]
        matchObj = re.search(r"\W\s\w", a)
        if matchObj:  # re.search returns None when nothing matches
            print(matchObj.group())

f.close()

Thanks for any help!


Use re.sub:

>>> import re
>>> strs = "@ sample This is a sample string $ 1.00 # sample"
>>> re.sub(r'([@#$])(\s+)([a-z0-9])', r'\1\3', strs, flags=re.I)
'@sample This is a sample string $1.00 #sample'
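
To make the pieces of that pattern easier to see, here is a small sketch of the same idea wrapped in a function. The names `SPECIAL_THEN_SPACE` and `tighten` are just illustrative, and the lookahead is a variation on the answer above rather than the original pattern: it checks for the alphanumeric character without consuming it, so only one group has to be put back.

```python
import re

# Group 1 keeps the special character, \s+ consumes the whitespace to drop,
# and the lookahead requires an alphanumeric next without consuming it.
SPECIAL_THEN_SPACE = re.compile(r'([@#$])\s+(?=[A-Za-z0-9])')

def tighten(text):
    """Remove whitespace between @, # or $ and the alphanumeric that follows."""
    return SPECIAL_THEN_SPACE.sub(r'\1', text)

print(tighten("@ sample This is a sample string $ 1.00 # sample"))
# prints: @sample This is a sample string $1.00 #sample
```

A special character followed by whitespace and then punctuation (for example "$ !") is left alone, since the lookahead does not match.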
>>> re.sub("([@$#]) ", r"\1", "@ sample This is a sample string $ 1.00 # sample")
'@sample This is a sample string $1.00 #sample'
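
One thing to keep in mind with the literal-space version: it removes exactly one space character per match. If the tweets might contain double spaces or tabs after the symbol, `\s+` is probably the safer choice. A quick comparison on a made-up string (the sample text is only an assumption for illustration):

```python
import re

sample = "@  sample costs $\t1.00"  # two spaces after @, a tab after $

# A literal space removes a single space and ignores tabs.
single_space = re.sub(r"([@$#]) ", r"\1", sample)

# \s+ removes any run of whitespace after the symbol.
any_whitespace = re.sub(r"([@$#])\s+", r"\1", sample)

print(repr(single_space))
print(repr(any_whitespace))
```

The first result still has one space after "@" and the tab after "$"; the second has neither.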

print(re.sub(r'([@$])\s+', r'\1', '@ blah $ 1'))
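
Putting this together with the loop from the question, a rough Python 3 sketch could look like the following. It assumes, as in the question's code, that the tweet text is in the first column (`row[0]`). The names `clean_rows` and `clean_csv` are hypothetical helpers, not anything from the csv or re modules.

```python
import csv
import re

# Special character, then whitespace, then an alphanumeric (not consumed).
PATTERN = re.compile(r'([@#$])\s+(?=[A-Za-z0-9])')

def clean_rows(rows):
    """Yield each non-empty row with its first column tightened."""
    for row in rows:
        if row:  # csv.reader yields [] for blank lines
            yield [PATTERN.sub(r'\1', row[0])] + list(row[1:])

def clean_csv(in_path, out_path):
    """Copy a CSV file, cleaning the first column of every row."""
    with open(in_path, newline='') as src, open(out_path, 'w', newline='') as dst:
        csv.writer(dst).writerows(clean_rows(csv.reader(src)))

# Example call, using the file names from the question:
# clean_csv('retweet.csv', 'output.csv')
```

Opening both files in `with` blocks also takes care of closing them, so the separate `f.close()` is not needed.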