I have a file structured as follows:
A: some text
B: more text
even more text
on several lines
A: and we start again
B: more text
more
multiline text
I am trying to find a regex that will split my file as follows:
>>> re.findall(regex, f.read())
[('some text', 'more text', 'even more text\non several lines'),
 ('and we start again', 'more text', 'more\nmultiline text')]
So far I have come up with the following:
>>> re.findall('A:(.*?)\nB:(.*?)\n(.*?)', f.read(), re.DOTALL)
[(' some text', ' more text', ''), (' and we start again', ' more text', '')]
The multi-line text is not captured. I think this is because the lazy quantifier really is lazy and doesn't match anything, but if I take it out, the regular expression becomes really greedy:
>>> re.findall('A:(.*?)\nB:(.*?)\n(.*)', f.read(), re.DOTALL)
[(' some text', ' more text', 'even more text\non several lines\nA: and we start again\nB: more text\nmore\nmultiline text')]
Does anyone have any ideas? Thanks!
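For reference, one possible direction, offered as a sketch rather than a definitive answer: a trailing lazy group with nothing after it is satisfied by the empty string, so it needs something to stop at. A lookahead for the next "A:" line (or the end of the text) gives it that boundary. The sample `text`, the names `pattern` and `records`, and the optional space after each colon are assumptions made for illustration. The sketch also assumes that "A:" never appears at the start of a line inside the free text.

```python
import re

# Sample text mirroring the structure described in the question (an assumption;
# in practice this would come from f.read()).
text = (
    "A: some text\n"
    "B: more text\n"
    "even more text\n"
    "on several lines\n"
    "A: and we start again\n"
    "B: more text\n"
    "more\n"
    "multiline text\n"
)

# The third group stays lazy, but the lookahead forces it to keep expanding
# until the next line starting with "A:" or until only whitespace remains
# before the end of the text.
pattern = re.compile(r"A: ?(.*?)\nB: ?(.*?)\n(.*?)(?=\nA:|\s*\Z)", re.DOTALL)

records = pattern.findall(text)
for record in records:
    print(record)
```

Because the lookahead does not consume anything, the next search can still start at the following "A:" line. If the free text may itself contain a line beginning with "A:", this boundary would need to be made stricter.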