TL DR: How does a computational model model grammatical derivatives, so that there is an indefinite number of products for the same left side?
I am working on a project dealing with formal language theory and am trying to write a class to create regular grammar objects that can be passed to a finite machine. My naive attempt was to create an API to add products for each allowed input. A stripped-down version of my attempt is as follows (based on a formal definition of a formal grammar G = (N, Σ, P, S) ):
class ContextFreeGrammar: def __init__(self, variables, alphabet, production_rules, start_variable): self.variables = variables self.alphabet = alphabet self.production_rules = production_rules self.start_variable = start_variable def __repr__(self): return '{}({}, {}, {}, {})'.format( self.__class__.__name__, self.variables, self.alphabet, self.production_rules, self.start_variable ) class RegularGrammar(ContextFreeGrammar): _regular_expression_grammar = None
I still have not reached the point of writing a state machine or a push machine.
The grammar for regex has no context, so I included my definition in WSN below:
syntax = expression . expression = term "|" expression . expression = term . term = factor repetition term . term = factor term . term = . repetition = "*" . repetition = "+" . repetition = "?" . repetition = "{" nonnegative_integer "," nonnegative_integer "}" . repetition = "{" nonnegative_integer ",}" . repetition = "{," nonnegative_integer "}" . nonnegative_integer = nonzero_arabic_numeral arabic_numerals . nonnegative_integer = arabic_numeral . nonzero_arabic_numeral = "1" . nonzero_arabic_numeral = "2" . nonzero_arabic_numeral = "3" . nonzero_arabic_numeral = "4" . nonzero_arabic_numeral = "5" . nonzero_arabic_numeral = "6" . nonzero_arabic_numeral = "7" . nonzero_arabic_numeral = "8" . nonzero_arabic_numeral = "9" . arabic_numeral = nonzero_arabic_numeral . arabic_numeral = "0" . arabic_numerals = arabic_numeral . arabic_numerals = arabic_numeral arabic_numerals . factor = "(" expression ")" . factor = character_class . factor = character . escaped_character = "\\." . escaped_character = "\\(" . escaped_character = "\\)" . escaped_character = "\\+" . escaped_character = "\\*" . escaped_character = "\\?" . escaped_character = "\\[" . escaped_character = "\\]" . escaped_character = "\\\\" . escaped_character = "\\{" . escaped_character = "\\}" . escaped_character = "\\|" . character -> TODO ; character_class = TODO .
It is easy to see that I clearly break the alternations into separate pieces. I do this for ease of implementation. But I am fixated on how I should be engaged in personal lessons and the like. I wanted production_rules be a map from each left side to a set of each of its respective right sides. But now it seems unfeasible.