Regex: Capturing one or more groups if they exist (Java)

I want to capture groups matching a pattern where an input can contain this group one or more times.

Example:

input = 12361 randomstuff371 12 Mar 16 138more random381 stuff73f 

I want to capture March 12th.

For this single-date case, I used the regex:

 pattern = (".*(\\d{2}\\s\\w+\\s\\d{2}).*"); 

However, my problem is that when an input contains more than one of these groups, I cannot capture the subsequent matches.

Example:

 input = randomstuff371 12 Mar 16 14 Jan 15 13 Feb 16 138more random381 stuff73f 

Thus:

 group 1 = 12 Mar 16
 group 2 = 14 Jan 15
 group 3 = 13 Feb 16

The number of these groups varies from input to input, so I wonder if there is a regular expression that works on inputs containing one or more of them. I tried:

 pattern = (".*(\\d{2}\\s\\w+\\s\\d{2}\\s)+.*"); // Not sure about whitespace at the end

However, this does not work. Is it more related to how I store these captured groups? I cannot determine the number of groups that I will need, especially since the regular expression should work on many of these inputs.

It seems to me that I had better grab the entire date segment and process it afterwards with matcher.find() to count the number of groups I need.

Any help would be greatly appreciated.

java regex regex-group
1 answer

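A repeated capturing group such as (...)+ keeps only the text of its last iteration, so a single match cannot give you a variable number of groups. It is easier to match just the specific pattern and collect the substrings as multiple matches using Matcher#find():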

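 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 String s = "randomstuff371 12 Mar 16 14 Jan 15 13 Feb 16 138more random381 stuff73f";
 // Match one date at a time instead of the whole input
 Pattern pattern = Pattern.compile("\\b\\d{2}\\s\\w+\\s\\d{2}\\b");
 Matcher matcher = pattern.matcher(s);
 // Each call to find() advances to the next match
 while (matcher.find()) {
     System.out.println(matcher.group(0));
 }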

See the online Java demo and the regex demo.
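Since the question also asks how to store an unknown number of results, here is a minimal sketch that collects the matches into a list instead of printing them. The class name `DateMatches` and the helper `findDates` are hypothetical names chosen for illustration; the pattern is the one from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateMatches {
    // Same pattern as above: two digits, a word, two digits, with word boundaries.
    private static final Pattern DATE = Pattern.compile("\\b\\d{2}\\s\\w+\\s\\d{2}\\b");

    // Hypothetical helper: returns every date-like substring in order of appearance.
    public static List<String> findDates(String input) {
        List<String> dates = new ArrayList<>();
        Matcher matcher = DATE.matcher(input);
        while (matcher.find()) {
            dates.add(matcher.group());
        }
        return dates;
    }

    public static void main(String[] args) {
        String s = "randomstuff371 12 Mar 16 14 Jan 15 13 Feb 16 138more random381 stuff73f";
        List<String> dates = findDates(s);
        // The list size gives the number of matches, so no group count is needed up front.
        System.out.println(dates.size() + " matches: " + dates);
    }
}
```

With this approach the number of dates is simply the size of the list, so the same code should handle inputs with one date or many.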

I added word boundaries to the pattern to make sure it matches only as a whole word, but they can be omitted if your substrings are directly attached to other text.
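To illustrate the effect of the word boundaries, the following sketch runs the pattern with and without `\b` on an input where the date is attached to surrounding letters. The class name `BoundaryDemo`, the helper `findAll`, and the sample string are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundaryDemo {
    public static final Pattern WITH_BOUNDARIES =
            Pattern.compile("\\b\\d{2}\\s\\w+\\s\\d{2}\\b");
    public static final Pattern WITHOUT_BOUNDARIES =
            Pattern.compile("\\d{2}\\s\\w+\\s\\d{2}");

    // Hypothetical helper: collects every match of the given pattern in the input.
    public static List<String> findAll(Pattern pattern, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up input: the date has no separator from the text around it.
        String attached = "stuff12 Mar 16more";
        // No boundary exists between 'f' and '1', so the bounded pattern finds nothing.
        System.out.println(findAll(WITH_BOUNDARIES, attached));    // prints []
        // Without the boundaries the date is still found.
        System.out.println(findAll(WITHOUT_BOUNDARIES, attached)); // prints [12 Mar 16]
    }
}
```

Which variant fits depends on the data: the bounded pattern is stricter and avoids matching digits inside longer tokens, while the unbounded one tolerates dates attached to neighbouring text.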
