Creating a regular expression in JavaScript to validate a conditional string

I want to create a regex in JavaScript that checks for a valid conditional string such as:

 -1 OR (1 AND 2) AND 1
 -1 OR (1 AND 2)
 -1 OR 2
 -1 OR 1 OR 1
 -1 AND 1 AND 1

The string must not mix "AND" and "OR" at the same level without parentheses. For example, 1 OR 2 AND 3 is not valid. It should be (1 OR 2) AND 3 or 1 OR (2 AND 3).

I tried the following regex. It works for most conditions, but does not check the above condition.

 /^(\s*\(\d+\s(AND|OR)\s\d+\)|\s*\d+)((\s*(AND|OR)\s*)(\(\d+\s(AND|OR)\s\d+\)|\s*\d+))*$/ 
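
To show where it goes wrong, here is a minimal sketch that runs the regex against a few inputs. The sample list and the variable names are only for illustration; the third entry is the case that should be rejected but is not.

```javascript
// The regex from the question, unchanged.
const attempt = /^(\s*\(\d+\s(AND|OR)\s\d+\)|\s*\d+)((\s*(AND|OR)\s*)(\(\d+\s(AND|OR)\s\d+\)|\s*\d+))*$/;

// Illustrative inputs paired with the result I would like to get.
const samples = [
  ["(1 OR 2) AND 3", true],
  ["1 OR (2 AND 3)", true],
  ["1 OR 2 AND 3", false], // mixed operators without parentheses
];

// Compare the wanted result with what the regex actually reports.
const results = samples.map(([text, wanted]) => ({
  text,
  wanted,
  actual: attempt.test(text),
}));

for (const r of results) {
  const verdict = r.actual === r.wanted ? "ok" : "WRONG";
  console.log(`${r.text} -> ${r.actual} (${verdict})`);
}
```

The regex treats every AND / OR between two operands the same way, so it has no way to notice that the operators at one level differ.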

Can someone help me with this problem?

2 answers

Forget about regular expressions; they cannot do this.

Parser generators to the rescue

With a parser generator, you can create a grammar that is understandable and maintainable.

Here is a parser generator for JavaScript with an online demo.

Grammar

From what I understand, you do not want any implicit precedence rules between AND and OR.

Here is an example of what the grammar considers valid:

 -1 OR 2 OR (2 AND 2 AND (2 OR (6 AND -2 AND (6 OR 2) AND (6 OR 2)) OR 2 OR 2)) 

The grammar currently requires / supports:

  • infinite nesting
  • explicit parentheses to control precedence between AND / OR
  • (multiple) negation of literals
  • spaces between operands and operators

The grammar can easily be changed to:

  • allow arbitrary whitespace
  • allow a single optional negation of literals instead of multiple negations
  • allow negation of any subexpression

If you want a more detailed explanation or cannot figure out how to customize it to your liking, just leave a comment.

Here is your grammar; just paste it into the online generator and click Download parser.

 start = formula
 formula = ors / ands / literal / parens_formula
 parens_formula = "(" formula ")"
 ors = operand (whitespace "OR" whitespace operand)+
 ands = operand (whitespace "AND" whitespace operand)+
 whitespace = " "+
 operand = literal / parens_formula
 literal = integer / "-" literal
 integer "integer" = digits:[0-9]+ { return parseInt(digits.join(""), 10); }
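
If pulling in a generated parser is not an option, the same rules can also be checked by hand. The following is a minimal sketch of a recursive-descent validator that mirrors the grammar; it is not the generator's output. The function name isValidFormula and its inner helpers are made up for illustration, and it only answers valid / invalid instead of returning parsed values.

```javascript
// Hand-written validator following the grammar's rules (illustrative sketch).
function isValidFormula(input) {
  let pos = 0;

  // literal = integer / "-" literal
  function literal() {
    const start = pos;
    while (input[pos] === "-") pos++;
    const digitsStart = pos;
    while (pos < input.length && input[pos] >= "0" && input[pos] <= "9") pos++;
    if (pos === digitsStart) { pos = start; return false; }
    return true;
  }

  // operand = literal / parens_formula
  function operand() {
    if (literal()) return true;
    const start = pos;
    if (input[pos] === "(") {
      pos++;
      if (formula() && input[pos] === ")") { pos++; return true; }
    }
    pos = start;
    return false;
  }

  // whitespace ("AND" or "OR") whitespace, where whitespace = " "+
  // Returns the operator name, or null (without consuming input) on failure.
  function operator() {
    const start = pos;
    while (input[pos] === " ") pos++;
    if (pos === start) return null;
    let op = null;
    if (input.startsWith("AND", pos)) op = "AND";
    else if (input.startsWith("OR", pos)) op = "OR";
    if (op !== null) {
      pos += op.length;
      const afterOp = pos;
      while (input[pos] === " ") pos++;
      if (pos > afterOp) return op;
    }
    pos = start;
    return null;
  }

  // formula = ors / ands / literal / parens_formula
  // One operand, optionally followed by a chain that uses a single operator.
  function formula() {
    if (!operand()) return false;
    let first = null;
    for (;;) {
      const save = pos;
      const op = operator();
      if (op === null) break;
      if (first === null) first = op;
      if (op !== first || !operand()) { pos = save; break; }
    }
    return true;
  }

  // The whole input must be consumed, as with the generated parser.
  return formula() && pos === input.length;
}

console.log(isValidFormula("(1 OR 2) AND 3")); // true
console.log(isValidFormula("1 OR 2 AND 3"));   // false
```

A mixed chain such as 1 OR 2 AND 3 stops after 1 OR 2, leaves AND 3 unconsumed, and is therefore rejected.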

Interesting question. And phant0m's answer is very educational (and should be used if you understand parsers)!

If you want to do this using only regular expressions, the following solution correctly validates an arbitrarily nested logical statement using JavaScript.

Rules / Assumptions:

  • A valid statement consists only of numbers, parentheses, spaces, the logical operator AND and the logical operator OR.
  • A statement must contain at least two "tokens" separated by a logical operator, where each token is either a "number" or a "parenthesized unit".
  • A number token is an integer having one or more decimal digits preceded by an optional sign (either + or -).
  • A "parenthesized unit" token consists of two or more tokens separated by a logical operator, enclosed in matching opening and closing parentheses.
  • The statement as a whole may contain more than two tokens, but all tokens must be separated by the same single operator; either AND or OR.
  • Each parenthesized unit may contain more than two tokens, but all tokens must be separated by the same single operator; either AND or OR.
  • Any amount of whitespace can be used between any elements (parentheses, numbers and logical operators), but at least one space is required between numbers and a logical operator.
  • The logical operators AND and OR are not case sensitive.

Examples of valid logical statements:

 "1 AND 2"
 "1 AND 2 AND 3"
 "1 OR 2"
 "-10 AND -20"
 "100 AND +200 AND -300"
 "1 AND (2 OR 3)"
 "1 AND (2 OR 3) AND 4"
 "1 OR ((2 AND 3 AND 4) OR (5 OR 6 OR 7))"
 "( 1 and 2 ) AND (1 AND 2)"

Examples of invalid logical statements:

 "1x"            // Invalid character.
 "1 AND"         // Missing token.
 "1 AND 2 OR 3"  // Mixed logical operators.
 "(1"            // Unbalanced parens.
 "(((1 AND 2)))" // Too many parens.
 "(1 AND) (2)"   // Missing token.
 "1"             // Missing logical operator and second number.
 "1OR2OR3OR4"    // Missing spaces between numbers and operators.
 "(1) AND (2)"   // Invalid parentheses.

Regular Expression Solution:

This problem requires matching nested parenthesized constructs, and the JavaScript regex engine does not support recursive expressions, so it cannot be solved in one pass using a single regular expression. However, the problem can be split into two parts, each of which can be solved with a single JavaScript regular expression. The first regular expression matches the innermost parenthesized units, and the second validates the simplified logical statement (which has no parentheses).

Regex #1: Match the innermost parenthesized unit.

The following regular expression matches one parenthesized unit, which consists of two or more number tokens, where all numbers are separated by either AND or OR, with at least one space between numbers and logical operators. The regular expression is fully commented and formatted for readability in PHP free-spacing mode syntax:

 $re_paren = '/ # Match innermost "parenthesized unit".
     \(                  # Start of innermost paren group.
     \s*                 # Optional whitespace.
     [+-]?\d+            # First number token (required).
     (?:                 # ANDs or ORs (required).
       (?:               # Either multiple AND separated values.
         \s+             # Required whitespace.
         AND             # Logical operator.
         \s+             # Required whitespace.
         [+-]?\d+        # Additional number.
       )+                # multiple AND separated values.
     | (?:               # Or multiple OR separated values.
         \s+             # Required whitespace.
         OR              # Logical operator.
         \s+             # Required whitespace.
         [+-]?\d+        # Additional number token.
       )+                # multiple OR separated values.
     )                   # ANDs or ORs (required).
     \s*                 # Optional whitespace.
     \)                  # End of innermost paren group.
     /ix';

Regex #2: Validate the simplified logical statement.

Here is the (almost identical, apart from the boundary anchors) regular expression that validates a simplified logical statement (having only numbers and logical operators, and no parentheses). Here it is in commented free-spacing mode syntax (PHP):

 $re_valid = '/ # Validate simple logical statement (no parens).
     ^                   # Anchor to start of string.
     \s*                 # Optional whitespace.
     [+-]?\d+            # First number token (required).
     (?:                 # ANDs or ORs (required).
       (?:               # Either multiple AND separated values.
         \s+             # Required whitespace.
         AND             # Logical operator.
         \s+             # Required whitespace.
         [+-]?\d+        # Additional number.
       )+                # multiple AND separated values.
     | (?:               # Or multiple OR separated values.
         \s+             # Required whitespace.
         OR              # Logical operator.
         \s+             # Required whitespace.
         [+-]?\d+        # Additional number token.
       )+                # multiple OR separated values.
     )                   # ANDs or ORs (required).
     \s*                 # Optional whitespace.
     $                   # Anchor to end of string.
     /ix';

Note that these two regular expressions are identical, with the exception of the boundary anchors.
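
JavaScript regular expressions have no free-spacing mode, but the shared core can still be written once and wrapped with the two different sets of boundaries. This is only a sketch of that idea; the names core, reParen and reValid are made up for illustration.

```javascript
// Shared core: a number followed by one or more "AND number" parts,
// or by one or more "OR number" parts (never a mix of both).
const core = String.raw`\s*[+-]?\d+(?:(?:\s+AND\s+[+-]?\d+)+|(?:\s+OR\s+[+-]?\d+)+)\s*`;

// Regex #1: the core wrapped in literal parentheses, global so that
// replace() handles every innermost unit in one call.
const reParen = new RegExp(String.raw`\(` + core + String.raw`\)`, "ig");

// Regex #2: the core anchored to the start and end of the string.
const reValid = new RegExp("^" + core + "$", "i");

console.log("(1 and 2) OR 3".replace(reParen, "0")); // "0 OR 3"
console.log(reValid.test("1 AND 2 OR 3"));           // false
```

Building both patterns from one string keeps them from drifting apart if the token rules change later.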

JavaScript solution:

The tested JavaScript function below uses the two regular expressions above to solve the problem:

 function isValidLogicalStatement(text) {
     var re_paren = /\(\s*[+-]?\d+(?:(?:\s+AND\s+[+-]?\d+)+|(?:\s+OR\s+[+-]?\d+)+)\s*\)/ig;
     var re_valid = /^\s*[+-]?\d+(?:(?:\s+AND\s+[+-]?\d+)+|(?:\s+OR\s+[+-]?\d+)+)\s*$/ig;
     // Iterate from the inside out.
     while (text.search(re_paren) !== -1) {
         // Replace innermost parenthesized units with integer.
         text = text.replace(re_paren, "0");
     }
     if (text.search(re_valid) === 0) return true;
     return false;
 }

The function works iteratively: it first matches the innermost parenthesized units and replaces each with a single number token, repeating until none remain, and then checks whether the resulting statement (now without parentheses) is valid.
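
To make the inside-out reduction concrete, here is a small sketch that records the statement after each replacement pass. The helper name reductionSteps is made up for illustration; it uses the same parenthesized-unit regex as the function above.

```javascript
// Returns the input followed by the text after each replacement pass.
function reductionSteps(text) {
  var re_paren = /\(\s*[+-]?\d+(?:(?:\s+AND\s+[+-]?\d+)+|(?:\s+OR\s+[+-]?\d+)+)\s*\)/ig;
  var steps = [text];
  while (text.search(re_paren) !== -1) {
    // Every innermost unit found in this pass collapses to "0".
    text = text.replace(re_paren, "0");
    steps.push(text);
  }
  return steps;
}

console.log(reductionSteps("1 OR ((2 AND 3 AND 4) OR (5 OR 6 OR 7))"));
// [ "1 OR ((2 AND 3 AND 4) OR (5 OR 6 OR 7))", "1 OR (0 OR 0)", "1 OR 0" ]

console.log(reductionSteps("(((1 AND 2)))"));
// [ "(((1 AND 2)))", "((0))" ]
```

In the second case the loop stops at ((0)) because a lone number in parentheses is not a valid unit, so the final check rejects the statement.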

Addendum: 2012-11-06

In a comment on this answer, the OP now says that there must be spaces between numbers and operators, and that a number or parenthesized unit may NOT stand on its own. Given these additional requirements, I have updated the answer above.
