How to analyze human-entered logical properties, for example. is it and (this or that)?

As part of the upcoming project, I would like to configure it so that a specific domain object can be applied to tags or tag combinations.

I want users to be able to enter these combinations in a human-readable way, like:

  • tag-a and (tag-b or tag-c) β†’ apply to tag-tags + tag-b or tag-a + tag-c
  • tag-d or (tag-e and tag-f) -> applies to tag-d or tag-e + tag-f

Is there a set of tools for performing this kind of logical analysis from a single line of entered text? I could define tags behind the scenes with a certain difference ({}, [], etc.) so that they can be easily disassembled.

It’s just interesting how it’s best to parse this text, understandable for a person, into these different combinations of combinations without the need to enter each specific combination.

Thanks!

+6
source share
1 answer

This usually involves two steps: lexing (short for lexical analysis) and parsing.

At the first stage, the input string is converted into a sequence of lexical elements called tokens. For this purpose, you can declare an enumeration type for different types of tokens, for example:

public enum TokenType { OpenParenthesis, CloseParenthesis, And, Or, Tag } 

and class for tokens:

 sealed class Token { public TokenType Type { get; private set; } public string Item { get; private set; } public Token(TokenType type, string item) { Type = type; Item = item; } } 

Now you are writing an algorithm that turns an input string, for example. tag-a and (tag-b or tag-c) , in a sequence of Token instances. You can use regular expressions to recognize various elements, for example, @"\s*\(\s*" will be a regular expression to recognize open parentheses. The sequence will look something like this:

  • new Token(TokenType.Tag, "tag-a")
  • new Token(TokenType.And, null)
  • new Token(TokenType.OpenParenthesis, null)
  • new Token(TokenType.Tag, "tag-b")
  • new Token(TokenType.Or, null)
  • new Token(TokenType.Tag, "tag-c")
  • new Token(TokenType.CloseParenthesis, null)

Once you have this sequence, you need to run the parser. There are many strategies for parsing such expressions; To start, I recommend a recursive descent parser .

Of course, you will need several classes to contain the parse tree:

 abstract class Node { } enum BooleanOperator { And, Or } sealed class BooleanNode : Node { public BooleanOperator Operator { get; private set; } public Node Left { get; private set; } public Node Right { get; private set; } public BooleanNode(BooleanOperator op, Node left, Node right) { Operator = op; Left = left; Right = right; } } sealed class TagNode : Node { public string Tag { get; private set; } public TagNode(string tag) { Tag = tag; } } 

And then a recursive descent analyzer might look something like this:

 public static Node ParseExpression(Token[] tok) { int i = 0; return parseExpressionBoolOr(tok, ref i); } private static Node parseExpressionBoolOr(Token[] tok, ref int i) { var left = parseExpressionBoolAnd(tok, ref i); while (tok[i].Type == TokenType.Or) { i++; var right = parseExpressionBoolAnd(tok, ref i); left = new BooleanNode(BooleanOperator.Or, left, right); } return left; } private static Node parseExpressionBoolAnd(Token[] tok, ref int i) { var left = parseExpressionPrimary(tok, ref i); while (tok[i].Type == TokenType.And) { i++; var right = parseExpressionPrimary(tok, ref i); left = new BooleanNode(BooleanOperator.And, left, right); } return left; } private static Node parseExpressionPrimary(Token[] tok, ref int i) { if (tok[i].Type == TokenType.OpenParenthesis) { i++; var node = parseExpressionBoolOr(tok, ref i); if (tok[i].Type != TokenType.CloseParenthesis) throw new InvalidOperationException(); // or customised parse exception return node; } else if (tok[i].Type == TokenType.Tag) { var node = new TagNode(tok[i].Item); i++; return node; } else throw new InvalidOperationException(); // or customised parse exception } 

Please note that this is a very simplified example. However, it is as flexible as possible: you can expand this algorithm to analyze absolutely any language you want.

+7
source

Source: https://habr.com/ru/post/925602/


All Articles