How to process a LaTeX file

I just finished writing a calculus summary in LaTeX.

The main problem now is that the files contain a lot of things that I don’t need right now.

The .tex files contain many definitions and theorems that I need to learn by heart.

Definitions have their own environment in a .tex file, so every definition in a file starts with:

\begin{definition} 

and ends with

 \end{definition} 

And the same for theorems.

I need to write something that extracts each of these \begin{}...\end{} blocks, delimiters included.

For example, given this input:

 \begin{document}
 \begin{center}
 \begin{definition} Hello WOrld! \end{definition}
 \begin{example}A+B \end{example}
 \begin{theorem} Tre Capre \end{theorem}
 \begin{definition} Hello WOrld2! \end{definition}
 \end{center}
 \end{document}

the list A should contain:

 [[\begin{definition} Hello WOrld! \end{definition}],[\begin{theorem} Tre Capre \end{theorem}],[\begin{definition} Hello WOrld2! \end{definition}]]

Looking at this site, I found that I can use regular expressions:

 for i in range(5):
     x = i+1
     raw = open('tex/chapter' + str(x) + '.tex')
     A = []
     for line in raw:
         A.append(re.match(r'(\begin{definition})://.*\.(\end{definition})$', line))
     print(A)

but the result is just None, and I really don't know why.
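A likely reason for the None results, sketched under assumptions: in a regular expression, \b is a word-boundary assertion rather than a backslash followed by "b", so the pattern never sees the literal text \begin. In addition, re.match only matches at the very start of the string, and the "://" and "\." parts look like leftovers from a URL pattern that would require those literal characters. The sample line below is hypothetical and not taken from the chapter files.

```python
import re

# Hypothetical sample line, not taken from the actual chapter files.
line = r"\begin{definition} Hello WOrld! \end{definition}"

# In a regex, \b is a word-boundary assertion, not a backslash followed by "b",
# so this pattern never matches the literal text \begin{definition}.
unescaped = re.search(r"\begin{definition}", line)

# Doubling the backslash matches a literal backslash character.
escaped = re.search(r"\\begin\{definition\}", line)

print(unescaped)        # None
print(escaped.group())  # \begin{definition}
```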

Edit:

 import re
 for i in range(5):
     x = i+1
     raw = open('tex/chapter' + str(x) + '.tex')
     A = re.findall(r'\\begin{definition}(.*?)\\end{definition}', raw.read())
     print(A)

The output is as follows:

 [] [] [] [] [] 
1 answer

From what I understand of the question, you just need the definitions from the LaTeX file. You can use re.findall to get them directly:

 A = re.findall(r'{definition}(.*?)\\end{definition}', raw.read()) 

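Note the non-greedy .*? here: it makes each match stop at the first \end{definition} instead of running on to the last one in the file, which is what a greedy .* would do.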
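If findall still returns empty lists, one possible cause (an assumption, since the real chapter files are not shown) is that each definition spans several lines: by default . does not match a newline, so .*? cannot cross a line break unless re.DOTALL is passed. Below is a minimal sketch that keeps the \begin and \end delimiters and collects both definitions and theorems. The names ENV_PATTERN and extract_environments are hypothetical, and the sample text is adapted from the example in the question.

```python
import re

# Matches a whole definition or theorem environment, delimiters included.
# \1 is a backreference, so \begin{theorem} is only closed by \end{theorem}.
# re.DOTALL lets "." match newlines, so multi-line environments are found.
ENV_PATTERN = re.compile(
    r"\\begin\{(definition|theorem)\}.*?\\end\{\1\}",
    re.DOTALL,
)


def extract_environments(tex):
    """Return every definition/theorem block in document order."""
    return [m.group(0) for m in ENV_PATTERN.finditer(tex)]


# Sample adapted from the question; the first definition spans several lines.
sample = r"""\begin{document}
\begin{center}
\begin{definition}
Hello WOrld!
\end{definition}
\begin{example}A+B \end{example}
\begin{theorem} Tre Capre \end{theorem}
\begin{definition} Hello WOrld2! \end{definition}
\end{center}
\end{document}"""

blocks = extract_environments(sample)
for block in blocks:
    print(block)
```

To run this over the chapter files, the text of each file could be read with open(path).read() inside a with block and passed to extract_environments. Note that a regex like this does not handle an environment nested inside another of the same name.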
