Regular expression with & as a delimiter

I was given a long text in which I need to find every piece of text enclosed in a pair of & characters (for example, in the text "&hello&&bye&" I need to find the words "hello" and "bye").

I tried the regular expression ".*&([^&])*&.*", but it doesn't work, and I don't know what's wrong with it.

Any help?

thanks

+4
6 answers

Try it this way:

    String data = "&hello&&bye&";
    Matcher m = Pattern.compile("&([^&]*)&").matcher(data);
    while (m.find())
        System.out.println(m.group(1));

output:

    hello
    bye
+6

No need for a regular expression. Just iterate over the string!

    List<String> list = new ArrayList<>();
    boolean started = false;
    int startIndex = 0;
    for (int i = 0; i < string.length(); ++i) {
        if (string.charAt(i) != '&') continue;
        if (!started) {
            startIndex = i + 1;   // the text begins right after the opening &
        } else {
            list.add(string.substring(startIndex, i));   // i is the closing &, which is excluded
        }
        started = !started;   // toggle between "inside a pair" and "outside"
    }

or use split!

    String[] parts = string.split("&");
    for (int i = 1; i < parts.length; i += 2) {   // every second element
        list.add(parts[i]);
    }
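One caveat with the split version: by default, split drops trailing empty strings, so the loop cannot tell whether the last part was actually closed by an &. For "a&b" it would report "b" even though no closing & follows. A possible variant, sketched below, passes a negative limit so trailing empty strings are kept and then skips the final part, which is never followed by an &. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitPairs {
    // Hypothetical helper: returns the text between each opening & and its closing &.
    static List<String> betweenPairs(String text) {
        // limit -1 keeps trailing empty strings, so a string ending in & yields a final ""
        String[] parts = text.split("&", -1);
        List<String> result = new ArrayList<>();
        // odd indices lie between an opening and a closing &;
        // the last part is never followed by an &, so it is excluded
        for (int i = 1; i < parts.length - 1; i += 2) {
            result.add(parts[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(betweenPairs("&hello&&bye&")); // [hello, bye]
        System.out.println(betweenPairs("a&b"));          // []
    }
}
```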
+2

If you do not want to use regular expressions, here is an easy way.

    String string = "xyz....";   // the string containing "hello", "bye" etc.
    // split the string into an array of tokens separated by "&"
    String[] tokens = string.split("&");
    for (int i = 0; i < tokens.length; i++) {
        String token = tokens[i];
        if (token.length() > 0) {
            // handle edge case: the last token counts only if the string ends with '&'
            if (i == tokens.length - 1) {
                if (string.charAt(string.length() - 1) == '&')
                    System.out.println(token);
            } else {
                System.out.println(token);
            }
        }
    }
+2

Two problems:

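  • You repeat the capture group itself: ([^&])* instead of ([^&]*). A repeated group keeps only its last iteration, so you capture just the last letter between the & characters.

  • You will only match the last word, because the greedy .* gobbles up the rest of the line, leaving a single match.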
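Both problems can be observed directly. The following is a small sketch (the class and helper names are made up for illustration) that runs the pattern from the question and a corrected one against the sample text:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyPatternDemo {
    // Returns group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Counts how many times find() succeeds on the text.
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int n = 0;
        while (m.find()) n++;
        return n;
    }

    public static void main(String[] args) {
        String data = "&hello&&bye&";
        String original = ".*&([^&])*&.*";   // pattern from the question
        String fixed = "&([^&]*)&";          // * moved inside the group, .* removed

        System.out.println(firstGroup(original, data));   // e
        System.out.println(countMatches(original, data)); // 1
        System.out.println(firstGroup(fixed, data));      // hello
        System.out.println(countMatches(fixed, data));    // 2
    }
}
```

The original pattern produces a single match covering the whole string, and its group holds only "e", the last letter of "bye".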

Use lookarounds:

 (?<=&)[^&]+(?=&) 

Now the whole match will be hello (and bye when re-applying the regular expression a second time), because the surrounding & will no longer be part of the match:

    List<String> matchList = new ArrayList<String>();
    Pattern regex = Pattern.compile("(?<=&)[^&]+(?=&)");
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        matchList.add(regexMatcher.group());
    }
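One thing worth checking before choosing between the two regex styles: because lookarounds do not consume the & characters, a closing & can also serve as the opening & of the next match. On the sample text this makes no difference, but on input such as "&a&b&c&" the results diverge. A minimal sketch, with hypothetical class and method names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PairVsLookaround {
    // Consumes both delimiters, so each & belongs to exactly one pair.
    static List<String> consuming(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("&([^&]*)&").matcher(text);
        while (m.find()) out.add(m.group(1));
        return out;
    }

    // Lookarounds leave the delimiters unconsumed, so they can be reused.
    static List<String> lookaround(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("(?<=&)[^&]+(?=&)").matcher(text);
        while (m.find()) out.add(m.group());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(consuming("&hello&&bye&"));  // [hello, bye]
        System.out.println(lookaround("&hello&&bye&")); // [hello, bye]
        System.out.println(consuming("&a&b&c&"));       // [a, c]
        System.out.println(lookaround("&a&b&c&"));      // [a, b, c]
    }
}
```

Which behavior is right depends on whether the & characters are meant to be strictly paired, which the question suggests but does not spell out.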
0

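The surrounding .* are unnecessary and counterproductive. Just &([^&]*)& is enough. Note that the * belongs inside the capture group; with &([^&])*& the group would keep only the last character.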

0

I would simplify this even further.

  • Make sure the first char is &
  • Make sure the last char is &
  • Call split("&&") on the substring between them

In code:

    if (string.length() < 2)
        throw new IllegalArgumentException(string);   // or return an empty array, whatever
    if (string.charAt(0) != '&' || string.charAt(string.length() - 1) != '&')
        throw new IllegalArgumentException(string);   // handle this case, too
    String inner = string.substring(1, string.length() - 1);
    return inner.split("&&");
0
