Split string based on regex

I have the output of a command in tabular form. I parse this output from a result file and store each row in a string. The elements within a row are separated by one or more whitespace characters, so I use a regular expression to match one or more spaces and split on them. However, a space is being inserted between every element:

    >>> import re
    >>> str1 = "a    b     c      d"  # spaces are irregular
    >>> str1
    'a    b     c      d'
    >>> str2 = re.split("( )+", str1)
    >>> str2
    ['a', ' ', 'b', ' ', 'c', ' ', 'd']  # 1 space element between!!!

Is there a better way to do this?

After each split, str2 is appended to a list.

+62
python regex
Jun 11 '12 at 5:40
4 answers

By using ( and ) you are capturing a group. If you simply remove the parentheses, you will not have this problem.

    >>> str1 = "a    b     c      d"
    >>> re.split(" +", str1)
    ['a', 'b', 'c', 'd']

However, there is no need for a regular expression here: str.split called without a delimiter splits on whitespace for you. That would be the best way in this case.

    >>> str1.split()
    ['a', 'b', 'c', 'd']

If you really want a regular expression, you can use this one ('\s' represents any whitespace character, and it is clearer):

    >>> re.split(r"\s+", str1)  # raw string, so \s is not treated as a string escape
    ['a', 'b', 'c', 'd']

or you can find all runs of non-whitespace characters:

    >>> re.findall(r'\S+', str1)
    ['a', 'b', 'c', 'd']
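The three approaches above agree on the sample string, but they can differ when a row has leading or trailing whitespace, which is common in tabular command output. The sketch below assumes a hypothetical padded row (`line` is not from the original question) to show the difference:

```python
import re

# Hypothetical row with leading and trailing padding (an assumption for illustration).
line = "  a    b     c      d  "

via_re_split = re.split(r"\s+", line)   # keeps empty strings at both ends
via_str_split = line.split()            # drops leading/trailing whitespace
via_findall = re.findall(r"\S+", line)  # collects only the non-whitespace runs

print(via_re_split)   # ['', 'a', 'b', 'c', 'd', '']
print(via_str_split)  # ['a', 'b', 'c', 'd']
print(via_findall)    # ['a', 'b', 'c', 'd']
```

If the rows may be padded, str.split() or re.findall avoids having to strip the line first.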
+86
Jun 11 '12

The str.split method automatically removes all whitespace between items:

    >>> str1 = "a    b     c      d"
    >>> str1.split()
    ['a', 'b', 'c', 'd']

The docs are here: http://docs.python.org/library/stdtypes.html#str.split

+13
Jun 11 '12

When you use re.split and the split pattern contains capturing groups, the text matched by those groups is kept in the output. If you do not want this, use a non-capturing group instead.
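As a minimal sketch of that difference, assuming a sample string with irregular spacing like the one in the question, the `(?:...)` syntax groups the pattern without capturing it:

```python
import re

# Sample input with irregular spacing (assumed to resemble the question's string).
str1 = "a    b     c      d"

# Capturing group: the last space matched by the group appears in the result.
with_capture = re.split("( )+", str1)

# Non-capturing group: same matching behaviour, but nothing extra is kept.
without_capture = re.split("(?: )+", str1)

print(with_capture)     # ['a', ' ', 'b', ' ', 'c', ' ', 'd']
print(without_capture)  # ['a', 'b', 'c', 'd']
```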

+5
Jun 11 '12

It is very simple. Try the following:

    str1 = "a    b     c      d"
    splitStr1 = str1.split()  # no argument: split on any run of whitespace
    print(splitStr1)
+1
Jun 11 '12


