RegEx Parser is written in Prolog

For several hours now, I have been banging my head against the wall at this domestic problem. We need to parse the regex using Prolog. For the most part, predicates I have work to do, but there are a few regular expressions and string combos that make them escape from the stack space in SWI-Prolog. Here's a sample with two combinations of Regex strings that works, and one that doesn't:

star(star(char(a))), [] star(star(char(a))), [a] 

The first one works, and the second one from the stack.

Here are the predicates that I use:

 re_match(epsilon, []). re_match(char(Letter), [Letter]). re_match(star(_), []). re_match(seq(Rx1, Rx2), List) :- append(List1, List2, List), re_match(Rx2, List2), re_match(Rx1, List1). re_match(alt(Rx1, Rx2), List) :- re_match(Rx1, List); re_match(Rx2, List). re_match(star(Rx), List) :- append(List1, List2, List), re_match(Rx, List1), re_match(star(Rx), List2). 

I'm not sure what change I need to make to make it work correctly, but I'm not sure what else to do.

In addition, changing the list: - append (List1, List2, List) to [H | T] is not evaluated as true for one example.

+7
source share
2 answers

I do not have access to SWI Prolog right now, but there is an assumption:

Try to change

 re_match(star(Rx), List) :- append(List1, List2, List), re_match(Rx, List1), re_match(star(Rx), List2). 

to

 re_match(star(Rx), List) :- append([H|List1], List2, List), re_match(Rx, [H|List1]), re_match(star(Rx), List2). 

to make re_match “eat something” when iterating over the star’s design.

+5
source

Consider using DCG notation for better readability and more easily learn about completion properties:

 :- op(100, xf, *). rexp(eps) --> []. rexp([T]) --> [T]. rexp(_*) --> []. rexp(R*) --> rexp(R), rexp(R*). rexp(s(R1,R2)) --> rexp(R1), rexp(R2). rexp((R1|R2)) --> ( rexp(R1) ; rexp(R2) ). 

An example of using length / 2 to generate longer lists to generate strings that match a regular expression:

 ?- length(Ls, _), phrase(rexp(s(([a]|[b]),[c]*)), Ls). Ls = [a] ; Ls = [b] ; Ls = [a, c] ; Ls = [b, c] ; Ls = [a, c, c] ; etc. 
+7
source

All Articles