I have a naive "parser" that just does something like:
[x.split('=') for x in mystring.split(',')]
However, the string may be something like 'foo=bar,breakfast=spam,eggs'.
Obviously, the naive splitter just won't do it. I am limited to the Python 2.6 standard library for this,
so, for example, pyparsing cannot be used.
Expected Result - [('foo', 'bar'), ('breakfast', 'spam,eggs')]
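To make the failure concrete, here is a small sketch of what the naive splitter does with that string. `mystring` is the name used above; `naive` is just an illustrative name.

```python
mystring = 'foo=bar,breakfast=spam,eggs'

# Split on every comma first, then on '=' inside each piece.
naive = [x.split('=') for x in mystring.split(',')]

# 'eggs' contains no '=', so it ends up as a one-element list of its own
# instead of staying attached to the 'breakfast' value.
print(naive)
```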
I am trying to do this with a regex, but I am facing the following problems:
My first attempt,
r'([a-z_]+)=(.+),?'
gave me
[('foo', 'bar,breakfast=spam,eggs')]
Obviously, making .+ non-greedy does not solve the problem.
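A quick way to see both behaviours side by side, as a sketch using the same sample string (`greedy` and `lazy` are illustrative names):

```python
import re

mystring = 'foo=bar,breakfast=spam,eggs'

# Greedy: .+ runs to the end of the string, so there is only one match.
greedy = re.findall(r'([a-z_]+)=(.+),?', mystring)

# Non-greedy: nothing after .+? is required (,? may match empty),
# so each value stops after a single character.
lazy = re.findall(r'([a-z_]+)=(.+?),?', mystring)

print(greedy)
print(lazy)
```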
So, I guess I have to somehow make the last comma (or $) mandatory.
Doing just that doesn't really work either,
r'([a-z_]+)=(.+?)(?:,|$)'
because the part after the comma in a value that contains one is dropped,
e.g. [('foo', 'bar'), ('breakfast', 'spam')]
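Looking at what each match actually consumes shows where 'eggs' goes missing. This is only a diagnostic sketch; `pattern` and `consumed` are illustrative names.

```python
import re

mystring = 'foo=bar,breakfast=spam,eggs'
pattern = re.compile(r'([a-z_]+)=(.+?)(?:,|$)')

# m.group(0) is the full text consumed by each match, terminator included.
consumed = [m.group(0) for m in pattern.finditer(mystring)]
print(consumed)

# The second match ends at the first comma after 'spam'. The leftover
# 'eggs' has no '=', so it cannot start a new match and is silently skipped.
```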
I think I must use some kind of look-behind(?) operation.
Question(s)
1. Which one do I use? or
2. How do I do that/this?
Edit
Based on daramarak's answer below,
I ended up doing pretty much the same thing that abarnert later suggested, in a slightly more verbose form:
vals = [x.rsplit(',', 1) for x in (data.split('='))]
ret = list()
while vals:
    value = vals.pop()[0]
    key = vals[-1].pop()
    ret.append((key, value))
    if len(vals[-1]) == 0:
        break
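For comparison, a sketch of the same split-on-'=' idea written as a forward pass. It leaves the final piece unsplit (rsplit on 'spam,eggs' would otherwise cut off 'eggs') and returns the pairs in input order. `parse_pairs` is a hypothetical name, and the sketch assumes values never contain '='.

```python
def parse_pairs(data):
    """Hypothetical helper: parse 'k=v,k=v' where values may contain commas."""
    pieces = data.split('=')
    if len(pieces) < 2:
        # No '=' at all, so there is no key/value pair to return.
        return []
    # The first piece is the first key; the last piece is the last value.
    key = pieces[0]
    result = []
    for piece in pieces[1:-1]:
        # Each middle piece looks like '<value>,<next key>'; the key is
        # whatever follows the last comma. Assumes values contain no '='.
        value, next_key = piece.rsplit(',', 1)
        result.append((key, value))
        key = next_key
    result.append((key, pieces[-1]))
    return result


pairs = parse_pairs('foo=bar,breakfast=spam,eggs')
print(pairs)
```

Only str.split and str.rsplit are used, so this should run on Python 2.6 as well.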
EDIT 2:
To satisfy my curiosity, is this actually possible with a pure regular expression? I.e., so that re.findall() would return a list of 2-tuples?
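One possible direction, offered as a sketch rather than a definitive answer: a look-ahead (rather than a look-behind) can require that a value ends either right before ",key=" or at the end of the string, without consuming that text. This assumes keys always match [a-z_]+ and that no value contains something that looks like ",key=".

```python
import re

mystring = 'foo=bar,breakfast=spam,eggs'

# (.+?) stays lazy, but the look-ahead (?=...) only lets it stop where
# the next thing is ',<key>=' or the end of the string.
pattern = r'([a-z_]+)=(.+?)(?=,[a-z_]+=|$)'

pairs = re.findall(pattern, mystring)
print(pairs)
```

Look-ahead assertions are part of the standard re module, so no third-party library is needed.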