Java Regex: string matching between two colons

I am trying to write a Java regex that finds every string enclosed between two colons. If the string between the colons contains spaces, line breaks, or tabs, it should be ignored. Empty strings are also ignored. Underscores are OK. The returned groups may either include the enclosing colons or not.

Here are some tests and expected groups:

"test :candidate: test" => ":candidate:"
"test :candidate: test:" => ":candidate:"
"test :candidate:_test:" => ":candidate:", ":_test:"
"test :candidate::test" => ":candidate:"
"test ::candidate: test" => ":candidate:"
"test :candidate_: :candidate: test" => ":candidate_:", ":candidate:"
"test :candidate_:candidate: test" => ":candidate_:", ":candidate:"

I tested a lot of regular expressions and they almost work:

":(\\w+):"
":[^:]+:"

I still have a problem when two groups share a colon:

"test :candidate_: :candidate: test" => ":candidate_:", ":candidate:" // OK
"test :candidate_:candidate: test" => ":candidate_:" // ERROR! :(

It seems that the first match consumes the shared colon, so the matcher cannot find the second string that I expected.
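The behaviour described above can be reproduced with a short sketch. The class name ConsumingMatchDemo and the helper findAll are illustrative only and are not part of the original question; the pattern is the first one listed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConsumingMatchDemo {
    // Collects every match of ":\w+:". Each match consumes both of its colons,
    // so the next search starts after the closing colon.
    static List<String> findAll(String input) {
        Matcher m = Pattern.compile(":\\w+:").matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Colons are not shared: both groups are found.
        System.out.println(findAll("test :candidate_: :candidate: test")); // [:candidate_:, :candidate:]
        // The middle colon is shared: the second group is missed.
        System.out.println(findAll("test :candidate_:candidate: test"));   // [:candidate_:]
    }
}
```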



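Use a positive lookahead with a capturing group. A lookahead does not consume the characters it inspects, so a colon can close one group and open the next.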

(?=(:\\w+:))

Each match is zero-width, so read the result from capturing group #1.
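A minimal sketch of how the lookahead pattern could be used with Pattern and Matcher. The class name LookaheadTokens and the helper find are hypothetical names chosen for illustration; only the regex comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadTokens {
    // The lookahead matches an empty string, so no colon is consumed;
    // the text of interest is captured in group 1.
    private static final Pattern TOKEN = Pattern.compile("(?=(:\\w+:))");

    static List<String> find(String input) {
        Matcher m = TOKEN.matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1)); // group 0 is always empty here
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(find("test :candidate_:candidate: test")); // [:candidate_:, :candidate:]
        System.out.println(find("test :candidate:_test:"));           // [:candidate:, :_test:]
        System.out.println(find("test :candidate::test"));            // [:candidate:]
    }
}
```

Because \w excludes whitespace and colons, groups that contain spaces, tabs, or line breaks, as well as empty groups, are skipped without any extra filtering.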


Why not just use String.split()?

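import java.util.ArrayList;
import java.util.List;

// regex that matches any part containing a space, tab, or line break
String invalidChars = "(?s).*[ \t\r\f\n].*";

String testStr = "test :candidate:_test:";
String[] parts = testStr.split(":"); // split on every colon
List<String> results = new ArrayList<String>();
for (String part : parts)
{
    // skip empty parts and parts that contain whitespace
    if (part.isEmpty() || part.matches(invalidChars)) continue;
    results.add(part);
}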

results then contains candidate and _test.


This can be done in a single line:

String[] terms = input.replaceAll("(?s)^.*?:|:[^:]*$", "").split("(?s):([^:]*\\s[^:]*:)?");

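This works as follows:

  • replaceAll strips the leading and trailing text (everything up to and including the first colon, and everything from the last colon to the end)
  • split breaks the rest on each colon, optionally together with a following segment that contains whitespace and that segment's closing colon
  • the dotall flag (?s) lets . match line breaks, so the expression also works on multi-line input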
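To make the two stages easier to follow, here is a sketch that runs them separately. The class SplitSteps and its methods strip and terms are hypothetical names introduced for illustration; the two regular expressions are taken unchanged from the one-liner.

```java
import java.util.Arrays;

class SplitSteps {
    // Stage 1: drop everything up to the first colon and from the last colon on.
    static String strip(String input) {
        return input.replaceAll("(?s)^.*?:|:[^:]*$", "");
    }

    // Stage 2: split on a colon, swallowing a following whitespace-containing
    // segment (and its closing colon) if there is one.
    static String[] terms(String input) {
        return strip(input).split("(?s):([^:]*\\s[^:]*:)?");
    }

    public static void main(String[] args) {
        String input = "foo:target1:junk junk:target2:bar";
        System.out.println(strip(input));                  // target1:junk junk:target2
        System.out.println(Arrays.toString(terms(input))); // [target1, target2]
    }
}
```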

Here is some test code:

String[] inputs =  {
        "foo:target1:bar",
        "foo:target1:target2:bar",
        "foo:target1:target2:target3:bar",
        "foo:target1:junk junk:target2:bar" ,
};
for (String input : inputs) {
    String[] terms = input.replaceAll("(?s)^.*?:|:[^:]*$", "").split("(?s):([^:]*\\s[^:]*:)?");
    System.out.println(Arrays.toString(terms));
}

Output:

[target1]
[target1, target2]
[target1, target2, target3]
[target1, target2]