.* does not match newlines unless the re.S flag is specified:
re.findall(r'\\begin{abstract}(.*?)\\end{abstract}', data, re.S)
Example
Consider this test file:
\documentclass{report}
\usepackage[margin=1in]{geometry}
\usepackage{longtable}
\begin{document}
Title maybe
\begin{abstract}
Good stuff
\end{abstract}
Other stuff
\end{document}
This gets the abstract:
>>> import re
>>> data = open('a.tex').read()
>>> re.findall(r'\\begin{abstract}(.*?)\\end{abstract}', data, re.S)
['\nGood stuff\n']
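To see what the flag changes, here is a minimal sketch that runs the same pattern with and without re.S. It assumes an inline string standing in for a.tex (shortened, and the variable names are only illustrative), so nothing has to exist on disk:

```python
import re

# Inline stand-in for the a.tex test file (an assumption for this sketch).
data = "\n".join([
    r"\begin{document}",
    "Title maybe",
    r"\begin{abstract}",
    "Good stuff",
    r"\end{abstract}",
    "Other stuff",
    r"\end{document}",
])

pattern = r'\\begin{abstract}(.*?)\\end{abstract}'

# Without re.S, '.' stops at the newline after \begin{abstract}, so nothing matches.
without_flag = re.findall(pattern, data)

# With re.S, '.' also matches newlines, so the multi-line abstract is captured.
with_flag = re.findall(pattern, data, re.S)

print(without_flag)  # []
print(with_flag)     # ['\nGood stuff\n']
```

If the surrounding newlines are unwanted, calling .strip() on each result removes them.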
Documentation
From the documentation for the re module:
re.S
re.DOTALL
Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.
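As a rough illustration of the documentation excerpt, the sketch below checks that re.S and re.DOTALL are the same flag and shows the inline (?s) form, which turns on the same behaviour from inside the pattern. The sample string and variable names are assumptions made for the example:

```python
import re

text = "first\nsecond"  # hypothetical two-line input

# re.S is just the short name for re.DOTALL.
same_flag = (re.S == re.DOTALL)

# By default '.' will not match the newline between the two words.
default_match = re.search(r'first.second', text)

# With re.DOTALL, '.' matches the newline as well.
dotall_match = re.search(r'first.second', text, re.DOTALL)

# The inline flag (?s) at the start of the pattern has the same effect.
inline_match = re.search(r'(?s)first.second', text)

print(same_flag)                  # True
print(default_match)              # None
print(dotall_match is not None)   # True
print(inline_match is not None)   # True
```

The inline form can be handy when the pattern is passed to code that does not expose a flags argument.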