I am trying to split a string such as aaa:bbb(123) into tokens using Pyparsing.
I can do this with regex, but I need to do this through Pyparsing.
With re, the solution would look like this:
>>> import re
>>> string = 'aaa:bbb(123)'
>>> regex = r'(\S+):(\S+)\((\d+)\)'
>>> re.match(regex, string).groups()
('aaa', 'bbb', '123')
It is clear and simple enough. The key point here is \S+, which means "everything but whitespace."
Now I will try to do this with Pyparsing:
>>> from pyparsing import Word, Suppress, nums, printables
>>> expr = (
... Word(printables, excludeChars=':')
... + Suppress(':')
... + Word(printables, excludeChars='(')
... + Suppress('(')
... + Word(nums)
... + Suppress(')')
... )
>>> expr.parseString(string).asList()
['aaa', 'bbb', '123']
OK, we got the same result, but it doesn't look very good. We set excludeChars to make the Pyparsing expressions stop where we need them, but that doesn't seem reliable. If the original string contains the "excluded" characters, the same regular expression still works fine:
>>> string = 'a:aa:b(bb(123)'
>>> re.match(regex, string).groups()
('a:aa', 'b(bb', '123')
while the Pyparsing expression obviously fails:
>>> expr.parseString(string).asList()
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/long/path/to/pyparsing.py", line 1111, in parseString
raise exc
ParseException: Expected W:(0123...) (at char 7), (line:1, col:8)
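To make the difference concrete, here is a minimal sketch that uses only re. It assumes that Word(printables, excludeChars=':') behaves roughly like the negated character class [^:\s]+, which is an approximation and not how Pyparsing is implemented. The names greedy and exclusive are made up for this illustration.

```python
import re

# Greedy \S+ with backtracking, same pattern as the regex in the question.
greedy = re.compile(r'(\S+):(\S+)\((\d+)\)')

# Rough regex analogue (an assumption) of the excludeChars expression:
# each field may never contain its delimiter character.
exclusive = re.compile(r'([^:\s]+):([^(\s]+)\((\d+)\)')

for s in ('aaa:bbb(123)', 'a:aa:b(bb(123)'):
    g = greedy.match(s)
    e = exclusive.match(s)
    # Print the captured groups, or None when the pattern does not match.
    print(s, g.groups() if g else None, e.groups() if e else None)
```

As far as I understand, the greedy pattern succeeds on the second string only because the regex engine backs up from the end of the string until the rest of the pattern fits, while the exclusive pattern stops at the first ':' and the first '(' and then finds no digits.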
So, the question is: how can we get the same result with Pyparsing?