How can I do strtok()-style parsing in Python?

The title of "How do I do what strtok() does in C, in Python?" suggests it should answer my question, but the specific strtok() behavior I'm looking for is splitting on any of the characters in the delimiter string. That is, given:

    const char* delim = ", ";
    str1 = "123,456";
    str2 = "234 567";
    str3 = "345, 678";

strtok() finds the digit substrings regardless of how many characters from delim separate them. Python's split expects the entire delimiter string to be present, so I cannot do this:

    delim = ', '
    "123,456".split(delim)

since it does not find delim as a substring and returns a list of one element.

2 answers

If you know that tokens will be numbers, you should use the split function from the Python re module:

    import re
    re.split(r"\D+", "123,456")

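More generally, you can match any run of the delimiter characters (the + makes a sequence such as ", " count as a single separator instead of producing an empty token):

    re.split("[ ,]+", "123,456")

or

    re.split("[" + delim + "]+", "123,456")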
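
To get closer to what strtok() does, the pattern built from delim can be wrapped in a small helper. This is only a sketch under a few assumptions: the name strtok_split is made up for illustration, delims is assumed to be non-empty, and empty tokens are dropped because strtok() never returns them. re.escape is used so that delimiter characters such as ']' or '^' are not read as regex syntax.

```python
import re

def strtok_split(text, delims):
    """Hypothetical helper: split text on runs of any character in delims."""
    # Escape the delimiters so characters like ']' or '^' stay literal
    # inside the character class. Assumes delims is not empty.
    pattern = "[" + re.escape(delims) + "]+"
    # re.split can yield '' at the start or end when text begins or ends
    # with a delimiter; strtok() skips those, so filter them out.
    return [tok for tok in re.split(pattern, text) if tok]

delim = ", "
for s in ("123,456", "234 567", "345, 678"):
    print(strtok_split(s, delim))
```

Each of the three strings from the question should come back as a list of its two digit tokens.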

Using replace() to normalize your delimiters to a single character and then calling split() on that character is one way to solve simpler cases. For your examples, replace(',', ' ').split() should work: it converts commas to spaces, and then the special no-argument form of split() splits on runs of whitespace.
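
As a quick illustration of the replace-then-split approach on the three strings from the question (a minimal sketch; the function name split_on_comma_or_space is made up for this example):

```python
def split_on_comma_or_space(text):
    """Hypothetical helper: treat commas and whitespace as separators."""
    # Turn every comma into a space, then let the no-argument split()
    # break on runs of whitespace and discard empty pieces.
    return text.replace(",", " ").split()

for s in ("123,456", "234 567", "345, 678"):
    print(split_on_comma_or_space(s))
```

Note that the no-argument split() also splits on tabs and newlines, so this is slightly broader than a delimiter set of just comma and space.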

In Python, when things start to get too complicated for split and replace, you usually turn to the re module; see Sam Mussmann's answer for a more general solution.
