I have a log file full of tweets. Each tweet is on a separate line, so I can iterate over the file easily.
An example tweet looks like this:
@ sample This is a sample string $ 1.00 # sample
I want to clean this up a bit by removing the space between each special character and the alphanumeric character that follows it, i.e. "@ s", "$ 1" and "# s".
To make it look like this:
@sample This is a sample string $1.00 #sample
I am trying to use regular expressions to match these instances because the text around them varies, but I'm not sure how to do this.
I have been using re.sub() and re.search() to find the instances, but I can't figure out how to remove only the whitespace while leaving the rest of the string intact.
Here is the code I have so far:
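To illustrate the kind of substitution I mean, here is a minimal sketch. It assumes the special characters are limited to @, $ and # (that set, and the names SYMBOL_GAP and tighten_symbols, are assumptions made for illustration). The symbol is captured in a group and written back, while the whitespace after it is dropped only when a word character follows.

```python
import re

# Assumed set of special characters: @, $ and #.
# Group 1 keeps the symbol; the lookahead (?=\w) requires a word character
# after the whitespace without consuming it.
SYMBOL_GAP = re.compile(r"([@$#])\s+(?=\w)")


def tighten_symbols(text):
    """Remove whitespace between a special character and the next word character."""
    return SYMBOL_GAP.sub(r"\1", text)


print(tighten_symbols("@ sample This is a sample string $ 1.00 # sample"))
```

Using an explicit character class instead of \W means ordinary punctuation such as a comma followed by a space should be left alone.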
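import csv
import re
import sys
import pdb
import urllib

f = open('output.csv', 'w')
with open('retweet.csv', 'rb') as inputfile:
    read = csv.reader(inputfile, delimiter=',')
    for row in read:
        a = row[0]
        # Non-word character, one whitespace character, then a word character
        matchObj = re.search(r"\W\s\w", a)
        # re.search() returns None when nothing matches
        if matchObj:
            print matchObj.group()
f.close()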
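For the file handling, a rough sketch of what the finished loop might look like is below. It assumes Python 3 (unlike the snippet above), that the tweet text is in the first column, and that the special characters are limited to @, $ and #. The names SYMBOL_GAP, clean_rows and clean_file are hypothetical.

```python
import csv
import re

# Assumed set of special characters: @, $ and #.
SYMBOL_GAP = re.compile(r"([@$#])\s+(?=\w)")


def clean_rows(rows):
    """Yield each row with its first column cleaned; empty rows pass through."""
    for row in rows:
        if row:
            row = [SYMBOL_GAP.sub(r"\1", row[0])] + list(row[1:])
        yield row


def clean_file(in_path, out_path):
    """Read in_path as CSV and write the cleaned rows to out_path."""
    # newline="" is what the csv module expects for files in Python 3.
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        csv.writer(dst).writerows(clean_rows(csv.reader(src)))
```

With the file names from my code, the call would be clean_file('retweet.csv', 'output.csv').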
Thanks for any help!