I am learning Rust and am currently trying to write a simple tokenizer. I want to walk through a line, trying each regular expression against the current position in the line, create a token from the match, then skip ahead and repeat until the entire line has been processed. I know I could combine them into one larger regular expression and loop over the captures, but I need to handle them separately for domain reasons.
However, I do not see anything in the regex crate that accepts an offset, so that I can start matching again at a specific point.
extern crate regex;
use regex::Regex;

fn main() {
    let input = "3 + foo/4";
    let ident_re = Regex::new("[a-zA-Z][a-zA-Z0-9]*").unwrap();
    let number_re = Regex::new("[1-9][0-9]*").unwrap();
    // `-` is listed first so it is a literal; `[+-*/]` would be read as the invalid range `+-*`.
    let ops_re = Regex::new(r"[-+*/]").unwrap();
    let ws_re = Regex::new(r"[ \t\n\r]*").unwrap();
    let mut i: usize = 0;
    while i < input.len() {
        // Here, check each regex to see whether it matches starting at input[i];
        // if so, copy the match and increment i by the length of the match.
    }
}