OK, I figured it out:
Put the separator pattern in a capturing group and the matched text will be included in the output. You can use either \w+ or \W+:
>>> re.compile(r'(\w+)').split('hello, foo')
['', 'hello', ', ', 'foo', '']
To get rid of the empty strings, pass the result through filter() with None as the filter function, which drops every item that does not evaluate to true (in Python 3, wrap the call in list(), since filter() returns an iterator there):
>>> filter(None, re.compile(r'(\w+)').split('hello, foo'))
['hello', ', ', 'foo']
Edit: CMS points out that if you use \W+ you do not need filter().
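For reference, here is a minimal sketch of the \W+ variant, written for Python 3. The helper name split_keep_separators is made up for illustration and is not part of the re module. Note that \W+ only avoids the empty strings when the input starts and ends with a word character, as 'hello, foo' does; otherwise you still need to filter:

```python
import re

def split_keep_separators(text):
    # Hypothetical helper: split on runs of non-word characters and,
    # because the pattern is a capturing group, keep those runs in the result.
    return re.split(r'(\W+)', text)

# The input starts and ends with a word character, so no empty strings appear.
print(split_keep_separators('hello, foo'))    # ['hello', ', ', 'foo']

# Leading or trailing separators still produce empty strings at the edges,
# so filter them out if you do not want them.
parts = split_keep_separators(' hello, foo!')
print(parts)                                  # ['', ' ', 'hello', ', ', 'foo', '!', '']
print([p for p in parts if p])                # [' ', 'hello', ', ', 'foo', '!']
```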