Dynamically resolving logical expressions with AND, OR and nested conditions

I have an inbound filter saved with a logical clause as follows.

Acct1 = 'Y' AND Acct2 = 'N' AND Acct3 = 'N' AND Acct4 = 'N' AND Acct5 = 'N' AND ((Acct6 = 'N' OR Acct7 = 'N' AND Acct1 = 'Y') AND Formatted = 'N' AND Acct9 = 'N' AND (Acct10 = 'N' AND Acct11 = 'N') AND EditableField = 'N')

My data entered in this section will be from a Csv file, as shown below.

```
Country,Type,Usage,Acct1,Acct2,Acct3,Acct4,Acct5,Acct6,Acct7,Formatted,Acct9,Acct10,Acct11,EditableField
USA,Premium,Corporate,Y,N,Y,N,N,N,Y,N,Y,N,Y,N
Mexico,Premium,Corporate,Y,N,Y,N,Y,N,Y,N,Y,N,Y,N
USA,Premium,Corporate,Y,N,Y,N,N,N,N,Y,Y,N,Y,N
USA,Premium,Corporate,Y,N,Y,N,Y,N,Y,Y,Y,N,Y,N
```

I will have to filter the entries in the file based on the conditions defined in the expression. This is a simple example, but real expressions will contain more nested conditions, the expression can be changed whenever the user wants, and there will be 10 such filters that entries must pass through in sequence.

So, I'm looking for a way to interpret such an expression dynamically and apply it to incoming records. Please share your suggestions on how to design this, and any examples if available.

+6
4 answers

Here's a complete solution that does not require third-party libraries such as ANTLR or JavaCC. Note that although it is extensible, its capabilities are still limited; if you want to support much more complex expressions, you are better off using a parser generator.

First write a tokenizer that splits the input string into tokens. Here are the types of tokens:

```java
private static enum TokenType {
    WHITESPACE, AND, OR, EQUALS, LEFT_PAREN, RIGHT_PAREN, IDENTIFIER, LITERAL, EOF
}
```

The Token class itself:

```java
private static class Token {
    final TokenType type;
    final int start;   // start position in input (for error reporting)
    final String data; // payload

    public Token(TokenType type, int start, String data) {
        this.type = type;
        this.start = start;
        this.data = data;
    }

    @Override
    public String toString() {
        return type + "[" + data + "]";
    }
}
```

To simplify tokenization, create a regular expression that reads the following token from the input line:

```java
private static final Pattern TOKENS =
    Pattern.compile("(\\s+)|(AND)|(OR)|(=)|(\\()|(\\))|(\\w+)|'([^']+)'");
```

Note that it contains many groups, one group per TokenType, in the same order (first comes WHITESPACE, then AND, and so on). Finally, the tokenizer method:

```java
private static TokenStream tokenize(String input) throws ParseException {
    Matcher matcher = TOKENS.matcher(input);
    List<Token> tokens = new ArrayList<>();
    int offset = 0;
    TokenType[] types = TokenType.values();
    while (offset != input.length()) {
        if (!matcher.find() || matcher.start() != offset) {
            throw new ParseException("Unexpected token at " + offset, offset);
        }
        for (int i = 0; i < types.length; i++) {
            if (matcher.group(i + 1) != null) {
                if (types[i] != TokenType.WHITESPACE)
                    tokens.add(new Token(types[i], offset, matcher.group(i + 1)));
                break;
            }
        }
        offset = matcher.end();
    }
    tokens.add(new Token(TokenType.EOF, input.length(), ""));
    return new TokenStream(tokens);
}
```

I am using java.text.ParseException. Here we repeatedly apply the regex Matcher to the input. If it does not match at the current position, we throw an exception; otherwise, we find the matching group and create a token from it, skipping WHITESPACE tokens. Finally, we add an EOF token that marks the end of the input. The result is returned as a special TokenStream object. Here is the TokenStream class that will help us do the parsing:

```java
private static class TokenStream {
    final List<Token> tokens;
    int offset = 0;

    public TokenStream(List<Token> tokens) {
        this.tokens = tokens;
    }

    // consume next token of given type (throw exception if type differs)
    public Token consume(TokenType type) throws ParseException {
        Token token = tokens.get(offset++);
        if (token.type != type) {
            throw new ParseException("Unexpected token at " + token.start + ": " + token
                    + " (was looking for " + type + ")", token.start);
        }
        return token;
    }

    // consume token of given type (return null and don't advance if type differs)
    public Token consumeIf(TokenType type) {
        Token token = tokens.get(offset);
        if (token.type == type) {
            offset++;
            return token;
        }
        return null;
    }

    @Override
    public String toString() {
        return tokens.toString();
    }
}
```

So, we have a tokenizer, hooray. You can try it right now using System.out.println(tokenize("Acct1 = 'Y' AND (Acct2 = 'N' OR Acct3 = 'N')"));
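If you want to see how the alternation groups line up with the token types without wiring up the whole tokenizer, here is a standalone sketch (the class name and demo input are mine, not part of the answer's code; the pattern is the same as above):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Same pattern as in the answer: one capture group per token type, in order
    static final Pattern TOKENS =
        Pattern.compile("(\\s+)|(AND)|(OR)|(=)|(\\()|(\\))|(\\w+)|'([^']+)'");
    static final String[] NAMES = {
        "WHITESPACE", "AND", "OR", "EQUALS",
        "LEFT_PAREN", "RIGHT_PAREN", "IDENTIFIER", "LITERAL"};

    public static void main(String[] args) {
        Matcher m = TOKENS.matcher("Acct1 = 'Y'");
        while (m.find()) {
            // find the group that matched; its index tells us the token type
            for (int i = 0; i < NAMES.length; i++) {
                if (m.group(i + 1) != null) {
                    if (i > 0) // skip whitespace, as the tokenizer does
                        System.out.println(NAMES[i] + " -> " + m.group(i + 1));
                    break;
                }
            }
        }
        // prints: IDENTIFIER -> Acct1, EQUALS -> =, LITERAL -> Y
    }
}
```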

Now write a parser that will create a tree representation of our expression. First, the Expr interface for all tree nodes:

```java
public interface Expr {
    public boolean evaluate(Map<String, String> data);
}
```

It has a single method, used to evaluate the expression against a given data set; it returns true if the data set matches.

The most basic expression is EqualsExpr, which matches something like Acct1 = 'Y' or 'Y' = Acct1:

```java
private static class EqualsExpr implements Expr {
    private final String identifier, literal;

    public EqualsExpr(TokenStream stream) throws ParseException {
        Token token = stream.consumeIf(TokenType.IDENTIFIER);
        if (token != null) {
            this.identifier = token.data;
            stream.consume(TokenType.EQUALS);
            this.literal = stream.consume(TokenType.LITERAL).data;
        } else {
            this.literal = stream.consume(TokenType.LITERAL).data;
            stream.consume(TokenType.EQUALS);
            this.identifier = stream.consume(TokenType.IDENTIFIER).data;
        }
    }

    @Override
    public String toString() {
        return identifier + "='" + literal + "'";
    }

    @Override
    public boolean evaluate(Map<String, String> data) {
        return literal.equals(data.get(identifier));
    }
}
```

The toString() method is for information only; it can be deleted.

Next, we will define the SubExpr class, which is either EqualsExpr or something more complicated in parentheses (if we see parentheses):

```java
private static class SubExpr implements Expr {
    private final Expr child;

    public SubExpr(TokenStream stream) throws ParseException {
        if (stream.consumeIf(TokenType.LEFT_PAREN) != null) {
            child = new OrExpr(stream);
            stream.consume(TokenType.RIGHT_PAREN);
        } else {
            child = new EqualsExpr(stream);
        }
    }

    @Override
    public String toString() {
        return "(" + child + ")";
    }

    @Override
    public boolean evaluate(Map<String, String> data) {
        return child.evaluate(data);
    }
}
```

Next is AndExpr , which is a collection of SubExpr expressions joined by an AND operator:

```java
private static class AndExpr implements Expr {
    private final List<Expr> children = new ArrayList<>();

    public AndExpr(TokenStream stream) throws ParseException {
        do {
            children.add(new SubExpr(stream));
        } while (stream.consumeIf(TokenType.AND) != null);
    }

    @Override
    public String toString() {
        return children.stream().map(Object::toString).collect(Collectors.joining(" AND "));
    }

    @Override
    public boolean evaluate(Map<String, String> data) {
        for (Expr child : children) {
            if (!child.evaluate(data))
                return false;
        }
        return true;
    }
}
```

I use the Java 8 Stream API in toString for brevity. If you cannot use Java 8, you can rewrite it with a for loop or remove toString completely.

Finally, we define OrExpr, which is a collection of AndExpr expressions joined by OR (as usual, OR has lower priority than AND). It is very similar to AndExpr:

```java
private static class OrExpr implements Expr {
    private final List<Expr> children = new ArrayList<>();

    public OrExpr(TokenStream stream) throws ParseException {
        do {
            children.add(new AndExpr(stream));
        } while (stream.consumeIf(TokenType.OR) != null);
    }

    @Override
    public String toString() {
        return children.stream().map(Object::toString).collect(Collectors.joining(" OR "));
    }

    @Override
    public boolean evaluate(Map<String, String> data) {
        for (Expr child : children) {
            if (child.evaluate(data))
                return true;
        }
        return false;
    }
}
```

And finally, the parse method:

```java
public static Expr parse(TokenStream stream) throws ParseException {
    OrExpr expr = new OrExpr(stream);
    stream.consume(TokenType.EOF); // ensure that we parsed the whole input
    return expr;
}
```

So, you can parse your expressions into Expr objects, then evaluate them against the rows of your CSV file. I assume that you can parse a CSV line into a Map<String, String>. Here is a usage example:

```java
Map<String, String> data = new HashMap<>();
data.put("Acct1", "Y");
data.put("Acct2", "N");
data.put("Acct3", "Y");
data.put("Acct4", "N");

Expr expr = parse(tokenize("Acct1 = 'Y' AND (Acct2 = 'Y' OR Acct3 = 'Y')"));
System.out.println(expr.evaluate(data)); // true

expr = parse(tokenize("Acct1 = 'N' OR 'Y' = Acct2 AND Acct3 = 'Y'"));
System.out.println(expr.evaluate(data)); // false
```
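The answer above assumes you can already turn a CSV line into a Map<String, String>. A minimal sketch of that step might look like this (naive comma splitting with no quoting support; the class and method names are illustrative, and a real CSV library is safer for quoted fields):

```java
import java.util.HashMap;
import java.util.Map;

public class CsvRow {
    // Turn a header line and a data line into a column-name -> value map.
    // Note: a plain split(",") works here only because the values never
    // contain commas or quotes.
    public static Map<String, String> toMap(String headerLine, String dataLine) {
        String[] headers = headerLine.split(",");
        String[] values = dataLine.split(",");
        Map<String, String> row = new HashMap<>();
        for (int i = 0; i < headers.length && i < values.length; i++) {
            row.put(headers[i].trim(), values[i].trim());
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, String> row = toMap(
            "Country,Type,Usage,Acct1,Acct2",
            "USA,Premium,Corporate,Y,N");
        System.out.println(row.get("Acct1")); // Y
    }
}
```

Each such map can then be passed to Expr.evaluate for every filter in sequence.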
+5

I do not know how efficient this will be in Java, but basic string replacement operations may be a simple solution to this.

You start with the query string:

Acct1 = 'Y' AND Acct2 = 'N' AND Acct3 = 'Y' AND Acct4 = 'N' AND Acct5 = 'N' OR ((Acct6 = 'N' OR Acct7 = 'N') AND Acct8 = 'N' AND Acct9 = 'Y' AND (Acct10 = 'N' OR Acct11 = 'N') AND Acct12 = 'N')

For each line in the csv, e.g. the string Y,N,Y,N,Y,N,Y,N,Y,N,Y,N, replace the column headers in the query with the corresponding values; that gives you:

Y = 'Y' AND N = 'N' AND Y = 'Y' AND N = 'N' AND Y = 'N' OR ((N = 'N' OR Y = 'N') AND N = 'N' AND Y = 'Y' AND (N = 'N' OR Y = 'N') AND N = 'N')

Then replace the comparisons by their logical value:
- replace N = 'N' and Y = 'Y' with Y
- replace N = 'Y' and Y = 'N' with N

This will lead to:

Y AND Y AND Y AND Y AND N OR ((Y OR N) AND Y AND Y AND (Y OR N) AND Y)

Then do a series of string replacement operations that replace true subexpressions with Y and false ones with N:
- replace Y AND Y with Y
- replace N AND N , N AND Y and Y AND N with N
- replace Y OR Y , N OR Y and Y OR N with Y
- replace N OR N with N
- replace (N) with N
- replace (Y) with Y

This will gradually reduce the logical expression:

Y AND Y AND Y AND Y AND N OR ((Y OR N) AND Y AND Y AND (Y OR N) AND Y)
Y AND Y AND N OR ((Y) AND Y AND (Y) AND Y)
Y AND N OR (Y AND Y AND Y AND Y)
N OR (Y AND Y)
N OR (Y)
Y

If queries contain implicit precedence without parentheses, such as N AND N OR Y AND Y, where AND should take precedence over OR, always exhaust the AND and parenthesis replacements before replacing an OR:

```
while (string length decreases) {
    while (string length decreases) {
        replace every "(Z)" by "Z"
        replace every "X AND Y" by "Z"
    }
    replace one "X OR Y" by "Z"
}
```

During this reduction, check after each iteration that the length of the string has actually decreased, to avoid endless loops caused by malformed queries.
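Putting the rules together, the reduction could be sketched in Java roughly as follows (an illustration, not a hardened implementation; it assumes identifiers have already been substituted with Y/N and tokens are separated by single spaces):

```java
public class Reducer {
    // Repeatedly apply the replacement rules until the string stops shrinking.
    // ANDs and parentheses are exhausted before each single OR replacement,
    // so AND keeps higher precedence than OR.
    public static String reduce(String expr) {
        int before;
        do {
            before = expr.length();
            int inner;
            do {
                inner = expr.length();
                expr = expr.replace("(Y)", "Y").replace("(N)", "N");
                expr = expr.replace("Y AND Y", "Y");
                expr = expr.replace("Y AND N", "N");
                expr = expr.replace("N AND Y", "N");
                expr = expr.replace("N AND N", "N");
            } while (expr.length() < inner);
            // replace one OR at a time, as in the pseudocode above
            expr = expr.replaceFirst("Y OR Y|Y OR N|N OR Y", "Y");
            expr = expr.replaceFirst("N OR N", "N");
        } while (expr.length() < before);
        return expr;
    }

    public static void main(String[] args) {
        System.out.println(reduce(
            "Y AND Y AND Y AND Y AND N OR ((Y OR N) AND Y AND Y AND (Y OR N) AND Y)"));
        // -> Y
    }
}
```

The outer length check is what stops the loop on a malformed query, exactly as the answer recommends.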

+3

Hint:

A possible solution is to store your Boolean condition values in a single string attribute, such as "YNYNNNNYNYNYN", or, better, packed as a binary integer. Then, for each filter expression, build a table of all accepted rows. A join against that table will return all the required records.

You can even process multiple filters in a single pass by pairing the filter number with its accepted rows when building the table.

Although the size of the table may be exponential in the number of conditions, it can remain quite manageable for a moderate number of conditions.
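As a rough illustration of this hint (the flag strings and accepted combinations below are made up, and real code would enumerate the accepted set from the filter expression), packing the flags into an int turns each row into a single number, and filtering becomes a set lookup:

```java
import java.util.HashSet;
import java.util.Set;

public class BitmaskFilter {
    // Pack a string of Y/N flags into an int, first flag = lowest bit.
    static int pack(String flags) {
        int bits = 0;
        for (int i = 0; i < flags.length(); i++) {
            if (flags.charAt(i) == 'Y') bits |= 1 << i;
        }
        return bits;
    }

    public static void main(String[] args) {
        // Precomputed set of flag combinations accepted by one filter
        // (just two illustrative combinations here).
        Set<Integer> accepted = new HashSet<>();
        accepted.add(pack("YNNN"));
        accepted.add(pack("YNYN"));

        System.out.println(accepted.contains(pack("YNYN"))); // true
        System.out.println(accepted.contains(pack("NNNN"))); // false
    }
}
```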

+1

You have an expression written in a language that looks compatible with the grammar of a SQL WHERE clause. Therefore you need:

  • a parser for this language that can build an AST and turn it into an expression tree
  • a mechanism for evaluating the expression tree against your context (i.e., an environment in which the names Acct1, Acct2, etc. are bound)

This is a simple language, so you can write your own parser by hand, or look at ANTLR or JavaCC; in that case I suggest you take a look at one of their sample SQL grammars. Of course, you do not need a full SQL parser; just extract the bits you need.

A simpler approach is to write the filter expression in a language that can be invoked through the Java scripting interface, such as JavaScript or Groovy (or Ruby, Python, ...). I do not suggest running find/replace on the input text to convert the SQL-like language into the target language (Python, for example, has lowercase and and or operators), as this breaks easily depending on the contents of the input string.
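A minimal sketch of the scripting route, assuming the filter has already been rewritten for JavaScript (== instead of =, && instead of AND). Note that engine availability depends on the JDK: Nashorn ships with JDKs up to 14, while later JDKs need a dependency such as GraalJS on the classpath:

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class ScriptFilter {
    public static void main(String[] args) throws Exception {
        ScriptEngine js = new ScriptEngineManager().getEngineByName("javascript");
        if (js == null) {
            System.out.println("no JavaScript engine on this JDK");
            return;
        }
        // Expose the row's columns as script variables, then evaluate
        // the rewritten filter expression.
        js.put("Acct1", "Y");
        js.put("Acct2", "N");
        Object result = js.eval("Acct1 == 'Y' && Acct2 == 'N'");
        System.out.println(result); // true (when an engine is present)
    }
}
```

This keeps the evaluation logic out of your code entirely, at the cost of depending on a script engine and trusting the filter text.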

0
