How to write a regular expression to match a string literal where the escape is a doubled quote character?

I am writing a parser using ply, which should identify FORTRAN string literals. They are delimited by single quotes, and a single quote inside the string is escaped by doubling it, e.g.

'I don''t understand what you mean'

is a valid FORTRAN string literal.

Ply accepts regex input. My attempt is not working yet, and I do not understand why.

t_STRING_LITERAL = r"'[^('')]*'"

Any ideas?

+7
python regex fortran ply
4 answers

String literal:

  • An opening single quote, followed by
  • any number of doubled single quotes or non-quote characters, then
  • a closing single quote.

So our regex is:

 r"'(''|[^'])*'" 
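
A minimal sketch of how this pattern behaves with Python's re module, assuming the whole token should match (hence re.fullmatch). The name STRING_LITERAL and the second and third sample strings are illustrative and not from the question.

```python
import re

# Pattern from the answer: an opening quote, then any number of
# doubled quotes ('') or non-quote characters, then a closing quote.
STRING_LITERAL = re.compile(r"'(''|[^'])*'")

samples = [
    "'I don''t understand what you mean'",  # doubled quote inside the literal
    "''",                                   # empty string literal
    "'unterminated",                        # no closing quote
]

# True where the entire sample is one string literal.
results = [STRING_LITERAL.fullmatch(s) is not None for s in samples]

for sample, ok in zip(samples, results):
    print(sample, "->", ok)
```

The first two samples match and the unterminated one does not.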
+20

You want something like this:

 r"'([^']|'')*'" 

This says that between the single quotes you can have either a doubled single quote or any character that is not a single quote.

Square brackets define a character class, in which you list the characters that may or may not match. A character class cannot express anything more complicated, so trying to use parentheses inside it to match a multi-character sequence ('') does not work. Instead, your character class [^('')] is equivalent to [^'()], i.e. it matches any character that is neither a single quote nor a left or right parenthesis.
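
The equivalence can be checked with a short sketch, assuming Python's re module. The variable names and the test strings are illustrative. Only the first pattern comes from the question.

```python
import re

original = re.compile(r"'[^('')]*'")     # pattern from the question
spelled_out = re.compile(r"'[^'()]*'")   # the same character class, written plainly

tests = ["'abc'", "'a(b)c'", "'I don''t'"]

rows = []
for text in tests:
    a = original.fullmatch(text) is not None
    b = spelled_out.fullmatch(text) is not None
    rows.append((text, a, b))
    print(text, a, b)
```

Both patterns accept 'abc' and reject the other two strings: the parentheses are excluded by the class, and a doubled quote is just two excluded quote characters.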

+4

It is usually easy to put together something quick and dirty for the specific string literals that are giving you problems, but for a general solution you can get a very powerful and complete regular expression for string literals from the pyparsing module:

 >>> import pyparsing
 >>> pyparsing.quotedString.reString
 '(?:"(?:[^"\\n\\r\\\\]|(?:"")|(?:\\\\x[0-9a-fA-F]+)|(?:\\\\.))*")|(?:\'(?:[^\'\\n\\r\\\\]|(?:\'\')|(?:\\\\x[0-9a-fA-F]+)|(?:\\\\.))*\')'

I'm not sure about the significant differences between FORTRAN and Python string literals, but this is a handy reference if nothing else.

0
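 import re

 ch = "'I don''t understand what you mean' and you' ?"
 # Lazy match: stops at the first quote it finds
 print(re.search("'.*?'", ch).group())
 # Closing quote must be neither preceded nor followed by another quote
 print(re.search("'.*?(?<!')'(?!')", ch).group())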

Result:

 'I don'
 'I don''t understand what you mean'
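
As a rough comparison, the sketch below runs the lookaround pattern next to the alternation pattern from the earlier answers. The empty literal and the literal ending in a doubled quote are cases added here for illustration and are not from the original post.

```python
import re

lookaround = re.compile(r"'.*?(?<!')'(?!')")   # pattern from this answer
alternation = re.compile(r"'(''|[^'])*'")      # pattern from the earlier answers

cases = [
    "'I don''t understand what you mean'",
    "''",       # empty literal (added case)
    "'a'''",    # literal ending in a doubled quote (added case)
]

outcome = {}
for text in cases:
    m1 = lookaround.search(text)
    m2 = alternation.fullmatch(text)
    outcome[text] = (m1.group() if m1 else None, m2.group() if m2 else None)
    print(text, outcome[text])
```

In this sketch the lookaround version finds nothing for the two added cases, because every candidate closing quote there has another quote next to it, while the alternation version matches all three.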
0
