Regex for C++ enumeration parsing

Question

How can I write a regex for parsing C++ enums? The enums I tried it on look like this:

    enum Temperature
    {
        C = 0,
        F=1,     // some elements are commented
        R,       // most elements are not given a value
        K        // sometimes the last element is followed by a comma
    } temperature;

    // different indent style is used
    enum Depth {
        m = 0,
        ft = 1,
    } depth;

I tried a few simple patterns, but none of them are general enough to catch all of the above cases.

Can anyone who knows regular expressions well help me?

Edit: to clarify, I need the name and the value of each element, for example C and 0.


enums regex parsing

Waws Aug 23 '11 at 11:30


2 answers

nobody · Answer 1 · 2011-08-23T12:20:05+0000

That was difficult :) The following is the best I could come up with. Assuming it is given only the text between { and }, it captures all the names and their corresponding values:

 /(\w+)\s*(?:=\s*(\d+)|)\s*,?\s*(?:(?:\n|$)|\/\/.*?(?:\n|$)|)/
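To see what the pattern above captures, here is a minimal Ruby sketch that runs it over the body of the Temperature enum from the question. The constant name MEMBER and the variables body and pairs are made up for this illustration, and the body text is pasted in by hand rather than extracted from a file. The answer does not say which regex flavor it targets; Ruby is assumed here.

```ruby
# Pattern from the answer above, unchanged.
MEMBER = /(\w+)\s*(?:=\s*(\d+)|)\s*,?\s*(?:(?:\n|$)|\/\/.*?(?:\n|$)|)/

# Only the text between { and } of the Temperature enum.
body = <<~BODY
  C = 0,
  F=1,     // some elements are commented
  R,       // most elements are not given a value
  K        // sometimes the last element is followed by a comma
BODY

# scan yields one [name, value] pair per enumerator; value is nil
# when the enumerator has no explicit "= number".
pairs = body.scan(MEMBER)
pairs.each { |name, value| puts "#{name} => #{value.inspect}" }
```

The trailing part of the pattern swallows each // comment, so words inside comments are not mistaken for names. Note that only decimal values are captured: with an enumerator such as A = B, the B would show up as a separate name.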

godspeedlee · Answer 2 · 2012-08-06T18:09:28+0000

If we use a regular expression only to match an enumeration, rather than to parse it, I think it is possible. Try the following steps:

Step 1. Make sure the C/C++ source code compiles successfully.
Step 2. Remove all comments from the C/C++ source code.
Step 3. Match the enum.

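Working Ruby example code:

    # Patterns copied from Mastering Regular Expressions, 3rd edition.
    COMMENT  = '/\*[^\*]*\*+(?:[^/*][^*]*\*+)*/'
    COMMENT2 = '//[^\n]*'   # * rather than +, so an empty // comment is stripped too
    # In a single-quoted Ruby string, four backslashes yield the two that
    # the regex needs to match one literal backslash.
    DOUBLE = '"(?:\\\\.|[^\\\\"])*"'
    SINGLE = '\'(?:\\\\.|[^\\\\\'])*\''

    # Pattern that matches an enum: group 1 is the name, group 2 the member list.
    ENUM = '\benum\s*(\w+)\s*\{(\s*\w+(?:\s*=\s*\w+)?(?:\s*,\s*\w+(?:\s*=\s*\w+)?)*)\s*(?:,\s*)?\}\s*\w+\s*;'

    foo = File.read("foo.cpp")

    # Strip all comments from foo.cpp; string and character literals are
    # captured in group 1 and put back unchanged.
    foo.gsub!(/(#{DOUBLE}|#{SINGLE})|#{COMMENT}|#{COMMENT2}/, '\1')

    # Match each enum and print its name and whitespace-free member list.
    foo.scan(/#{ENUM}/) do |m|
      printf("%s: %s\n", m[0], m[1].gsub(/\s/, ''))
    end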

Output:

    Temperature: C=0,F=1,R,K
    Depth: m=0,ft=1
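The question asks for each name together with its value, while the output above leaves R and K without one. As a rough sketch, a member list in that form could be turned into name/value pairs by applying the C++ rule that an enumerator without an initializer gets the previous value plus one (starting at 0). The helper name enum_values is hypothetical and not part of the answer, and only decimal integer initializers are handled.

```ruby
# Hypothetical helper: turns "C=0,F=1,R,K" into a hash of name => value.
def enum_values(members)
  next_value = 0
  members.split(',').each_with_object({}) do |member, result|
    name, explicit = member.split('=', 2)
    # Assumption: initializers are decimal integer literals; anything else
    # (another enumerator, an expression) raises ArgumentError.
    next_value = Integer(explicit, 10) if explicit
    result[name] = next_value
    next_value += 1
  end
end

p enum_values("C=0,F=1,R,K")
p enum_values("m=0,ft=1")
```

For the Temperature list this assigns 2 to R and 3 to K, which is what a C++ compiler would do for the enum in the question.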
