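#include <string>

// Returns str with everything from the first unquoted '#' onward removed.
// Taking a const reference makes str.begin() a const_iterator, so it has the
// same type as `it` in the range constructor below.
std::string stripComment(const std::string& str) {
    bool escaped = false;
    bool inSingleQuote = false;
    bool inDoubleQuote = false;
    for(std::string::const_iterator it = str.begin(); it != str.end(); ++it) {
        if(escaped) {
            // The previous character was a backslash: skip this one.
            escaped = false;
        } else if(*it == '\\' && (inSingleQuote || inDoubleQuote)) {
            escaped = true;
        } else if(inSingleQuote) {
            if(*it == '\'') {
                inSingleQuote = false;
            }
        } else if(inDoubleQuote) {
            if(*it == '"') {
                inDoubleQuote = false;
            }
        } else if(*it == '\'') {
            inSingleQuote = true;
        } else if(*it == '"') {
            inDoubleQuote = true;
        } else if(*it == '#') {
            // Unquoted '#': the comment starts here.
            return std::string(str.begin(), it);
        }
    }
    return str;
}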
EDIT: Or, as a more traditional FSM:
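std::string stripComment(const std::string& str) {
    // states[current state][symbol class] gives the next state.
    // Columns: 0 = any other character, 1 = '\\', 2 = '\'', 3 = '"'.
    int states[5][4] = {
        {0, 0, 1, 2,},  // 0: outside any string
        {1, 3, 0, 1,},  // 1: inside a single-quoted string
        {2, 4, 2, 0,},  // 2: inside a double-quoted string
        {1, 1, 1, 1,},  // 3: after a backslash in a single-quoted string
        {2, 2, 2, 2,},  // 4: after a backslash in a double-quoted string
    };
    int state = 0;
    for(std::string::const_iterator it = str.begin(); it != str.end(); ++it) {
        switch(*it) {
        case '\\':
            state = states[state][1];
            break;
        case '\'':
            state = states[state][2];
            break;
        case '"':
            state = states[state][3];
            break;
        case '#':
            if(!state) {
                return std::string(str.begin(), it);
            }
            // Inside a string '#' is an ordinary character:
            // deliberate fall-through to default.
        default:
            state = states[state][0];
        }
    }
    return str;
}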
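The array states defines the transitions between the FSM states.
The first index is the current state: 0 (outside any string), 1 (inside a single-quoted string), 2 (inside a double-quoted string), 3 (just after a backslash in a single-quoted string) or 4 (just after a backslash in a double-quoted string).
The second index corresponds to the symbol just read: 0 for any other character, 1 for \, 2 for ', 3 for ".
The array gives the next state for the current state and symbol.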
FYI, both versions assume that a backslash escapes any character within a string. At the very least you need to be able to escape backslashes themselves, so that a string can end with a backslash.
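To make the table and the escaping assumption easier to check, here is a sketch of the same FSM with named states and symbol classes. The names State, Symbol, classify, kNext, stripCommentNamed and selfCheck are my own and purely illustrative; the transitions are meant to mirror the numeric table above, not replace it.

```cpp
#include <cassert>
#include <string>

// Hypothetical names for the five states (0..4 in the numeric table).
enum State { Normal, InSingle, InDouble, EscSingle, EscDouble };

// Hypothetical names for the four symbol classes (the table's columns).
enum Symbol { Other, Backslash, SingleQuote, DoubleQuote };

// Map a character to its column in the transition table.
static Symbol classify(char c) {
    switch (c) {
        case '\\': return Backslash;
        case '\'': return SingleQuote;
        case '"':  return DoubleQuote;
        default:   return Other;
    }
}

// kNext[current state][symbol class] is the next state.
static const State kNext[5][4] = {
    /* Normal    */ {Normal,   Normal,    InSingle, InDouble},
    /* InSingle  */ {InSingle, EscSingle, Normal,   InSingle},
    /* InDouble  */ {InDouble, EscDouble, InDouble, Normal},
    /* EscSingle */ {InSingle, InSingle,  InSingle, InSingle},
    /* EscDouble */ {InDouble, InDouble,  InDouble, InDouble},
};

std::string stripCommentNamed(const std::string& str) {
    State state = Normal;
    for (std::string::size_type i = 0; i < str.size(); ++i) {
        // Only a '#' seen outside any string starts a comment.
        if (str[i] == '#' && state == Normal) {
            return str.substr(0, i);
        }
        state = kNext[state][classify(str[i])];
    }
    return str;
}

// A few checks that illustrate the escaping behaviour.
void selfCheck() {
    // '#' inside quotes is kept; the unquoted one starts the comment.
    assert(stripCommentNamed("s = \"a#b\" # c") == "s = \"a#b\" ");
    // An escaped quote does not end the string.
    assert(stripCommentNamed("s = 'it\\'s #1' # c") == "s = 'it\\'s #1' ");
    // A single backslash before the closing quote escapes that quote,
    // so the string never closes and nothing is stripped.
    assert(stripCommentNamed("p = 'C:\\' # c") == "p = 'C:\\' # c");
    // Doubling the backslash lets the string end with a backslash.
    assert(stripCommentNamed("p = 'C:\\\\' # c") == "p = 'C:\\\\' ");
}
```

The last two checks show the caveat from the note above: 'C:\' is read as an unterminated string, while 'C:\\' closes normally and the comment after it is removed.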