Python: How can I include the delimiter(s) in a string split?

I would like to split a string on multiple delimiters, but keep the delimiters in the resulting list. I think this is a useful first step in parsing any kind of formula, and I suspect there is a nice Python solution.

Someone asked a similar question about Java here.

For example, a typical split is as follows:

    >>> s = '(twoplusthree)plusfour'
    >>> s.split('plus')
    ['(two', 'three)', 'four']

But I'm looking for a nice way to add the plus back in (or retain it):

 ['(two', 'plus', 'three)', 'plus', 'four'] 

Ultimately, I would like to do this for each operator and parenthesis, so if there is a way to get

 ['(', 'two', 'plus', 'three', ')', 'plus', 'four'] 

all in one go, then all the better.

+6
4 answers

You can do this with the Python re module.

    import re
    s = '(twoplusthree)plusfour'
    list(filter(None, re.split(r"(plus|[()])", s)))

You can leave out the list() call if you only need an iterator.
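The same idea can be generalized to an arbitrary set of delimiters. The following is only a sketch: the helper name `split_keep_delimiters` and its parameters are made up for illustration and are not part of the answer above. It assumes the delimiters are plain strings, so each one is passed through `re.escape` before being joined into a single capturing group.

```python
import re

def split_keep_delimiters(text, delimiters):
    """Split text on any of the delimiters, keeping them as separate tokens.

    Hypothetical helper for illustration; assumes a non-empty list of
    plain-string delimiters.
    """
    # Try longer delimiters first so one is not shadowed by its own prefix.
    ordered = sorted(delimiters, key=len, reverse=True)
    # re.escape makes characters such as '(' match literally.
    pattern = "(" + "|".join(re.escape(d) for d in ordered) + ")"
    # The capturing group makes re.split return the delimiters too;
    # drop the empty strings that appear at the edges or between
    # two adjacent delimiters.
    return [token for token in re.split(pattern, text) if token]

tokens = split_keep_delimiters("(twoplusthree)plusfour", ["plus", "(", ")"])
print(tokens)  # ['(', 'two', 'plus', 'three', ')', 'plus', 'four']
```

Escaping matters here because an unescaped `(` or `)` would be read as regex grouping syntax rather than as a literal parenthesis.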

+11
    import re
    s = '(twoplusthree)plusfour'
    l = re.split(r"(plus|\(|\))", s)
    a = [x for x in l if x != '']  # drop the empty strings re.split leaves behind
    print(a)

output:

 ['(', 'two', 'plus', 'three', ')', 'plus', 'four'] 
+4

Here is an easy way, using re.split:

    import re
    s = '(twoplusthree)plusfour'
    re.split('(plus)', s)

Output:

 ['(two', 'plus', 'three)', 'plus', 'four'] 

re.split is very similar to str.split, except that instead of a literal delimiter, you pass in a regular expression pattern. The trick here is to put () around the pattern so that it is captured as a group; re.split then includes the text of each captured group in the result.
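To make the effect of the parentheses concrete, here is a small comparison using the string from the question. It is only an illustration; the variable names are arbitrary. A non-capturing group `(?:...)` is included to show that it is specifically a capturing group that keeps the separator.

```python
import re

s = '(twoplusthree)plusfour'

# No group: the separator is discarded, just like str.split.
without_group = re.split(r'plus', s)

# Capturing group: the matched separator is kept in the result.
with_group = re.split(r'(plus)', s)

# Non-capturing group: behaves like no group at all.
non_capturing = re.split(r'(?:plus)', s)

print(without_group)  # ['(two', 'three)', 'four']
print(with_group)     # ['(two', 'plus', 'three)', 'plus', 'four']
print(non_capturing)  # ['(two', 'three)', 'four']
```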

Keep in mind that you will get empty strings in the result if there are two consecutive occurrences of the separator pattern.
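A short demonstration of that caveat, using a made-up input string. A separator at the very start or end of the string produces an empty string in the same way, and a list comprehension is one simple way to drop them.

```python
import re

# Made-up input: starts with the separator and contains two in a row.
s = 'plusoneplusplustwo'

parts = re.split(r'(plus)', s)
print(parts)    # ['', 'plus', 'one', 'plus', '', 'plus', 'two']

# Filter out the empty strings while keeping every separator.
cleaned = [p for p in parts if p]
print(cleaned)  # ['plus', 'one', 'plus', 'plus', 'two']
```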

+3

This thread is old, but since it's a top Google result, I thought I'd add this:

If you do not want to use regex, there is a simple way to do this for a single separator. Basically, just call split, then append the separator back onto every token except the last.

    def split_keep_deli(string_to_split, deli):
        # Split on the delimiter, then re-attach it to every token except the last.
        result_list = []
        tokens = string_to_split.split(deli)
        for i in range(len(tokens) - 1):  # use xrange on Python 2
            result_list.append(tokens[i] + deli)
        result_list.append(tokens[-1])
        return result_list
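Note that this function glues the separator onto the preceding token, so for the string in the question it returns ['(twoplus', 'three)plus', 'four'] rather than the list the question asks for. If the separator should be its own list element, a regex-free variant might look like the sketch below. The name `split_keep_separate` is hypothetical, and like the function above it handles only one literal separator.

```python
def split_keep_separate(text, delimiter):
    """Split text on one literal delimiter, returning it as its own token.

    Hypothetical variant for illustration; empty pieces are skipped.
    """
    result = []
    pieces = text.split(delimiter)
    for index, piece in enumerate(pieces):
        if piece:
            result.append(piece)
        # A delimiter sat between every pair of neighbouring pieces.
        if index < len(pieces) - 1:
            result.append(delimiter)
    return result

print(split_keep_separate('(twoplusthree)plusfour', 'plus'))
# ['(two', 'plus', 'three)', 'plus', 'four']
```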
0