Regular expression: match string containing only non-duplicate words

I have this situation (in Java): 1) a line such as "Wild adventure" should match; 2) a line with adjacent repeated words, such as "Wild wild adventure", should not match.

With this regex: .*\b(\w+)\b\s*\1\b.* I can match strings containing adjacent repeating words.

How can I invert this, i.e. how can I match strings that do not contain adjacent repeating words?

1 answer

Use a negative lookahead, (?!pattern).

    String[] tests = {
        "A wild adventure",      // true
        "A wild wild adventure"  // false
    };
    for (String test : tests) {
        System.out.println(test.matches("(?!.*\\b(\\w+)\\s\\1\\b).*"));
    }

Explanation courtesy of Rick Measham's explain.pl:

    REGEX: (?!.*\b(\w+)\s\1\b).*

    NODE                     EXPLANATION
    --------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
    --------------------------------------------------------------------------------
      .*                       any character except \n (0 or more times
                               (matching the most amount possible))
    --------------------------------------------------------------------------------
      \b                       the boundary between a word char (\w) and
                               something that is not a word char
    --------------------------------------------------------------------------------
      (                        group and capture to \1:
    --------------------------------------------------------------------------------
        \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                                 more times (matching the most amount
                                 possible))
    --------------------------------------------------------------------------------
      )                        end of \1
    --------------------------------------------------------------------------------
      \s                       whitespace (\n, \r, \t, \f, and " ")
    --------------------------------------------------------------------------------
      \1                       what was matched by capture \1
    --------------------------------------------------------------------------------
      \b                       the boundary between a word char (\w) and
                               something that is not a word char
    --------------------------------------------------------------------------------
    )                        end of look-ahead
    --------------------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
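The pattern broken down above is case-sensitive, so it would not treat "Wild wild" from the question as a repeat. A minimal sketch of one way to handle that, assuming a case-insensitive comparison is what is wanted: the (?i) flag and the switch from \s to \s+ are additions that are not part of the answer, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class AdjacentDuplicateCheck {
    // Same negative lookahead as in the answer, compiled once for reuse.
    // Assumptions not in the original answer:
    //   (?i) makes the backreference \1 compare case-insensitively,
    //   \s+  tolerates more than one whitespace character between the words.
    private static final Pattern NO_ADJACENT_REPEAT =
            Pattern.compile("(?i)(?!.*\\b(\\w+)\\s+\\1\\b).*");

    // Returns true when the line contains no adjacent repeated word.
    public static boolean hasNoAdjacentRepeat(String line) {
        return NO_ADJACENT_REPEAT.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasNoAdjacentRepeat("Wild adventure"));      // true
        System.out.println(hasNoAdjacentRepeat("Wild wild adventure")); // false
    }
}
```

Because . does not match line terminators by default, this check is meant for single lines, as in the question.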



Note

Negative lookaheads make sense only when there are other patterns that you also want to match positively. Otherwise, you can simply use the logical complement operator ! to negate a match against the same pattern you used before.

    String[] tests = {
        "A wild adventure",      // true
        "A wild wild adventure"  // false
    };
    for (String test : tests) {
        System.out.println(!test.matches(".*\\b(\\w+)\\s\\1\\b.*"));
    }
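For the other case the note mentions, where the lookahead sits next to a pattern that must match positively, a rough sketch might look like the following. The extra requirement (only ASCII letters separated by single spaces) is invented purely for illustration, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class LookaheadWithOtherPattern {
    // Negative lookahead from the answer, followed by a positive pattern.
    // Hypothetical extra rule: the line consists only of ASCII letters,
    // with words separated by single spaces.
    private static final Pattern LETTERS_NO_REPEAT =
            Pattern.compile("(?!.*\\b(\\w+)\\s\\1\\b)[A-Za-z]+(?: [A-Za-z]+)*");

    // True when the line satisfies both conditions at once.
    public static boolean isValid(String line) {
        return LETTERS_NO_REPEAT.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("A wild adventure"));      // true
        System.out.println(isValid("A wild wild adventure")); // false: adjacent repeat
        System.out.println(isValid("A wild 4dventure"));      // false: contains a digit
    }
}
```

Here a plain ! would not be enough, because one half of the condition is a negation and the other half is a positive match on the same input.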

Source: https://habr.com/ru/post/1314753/

