Regular expressions in Java with a variable

I have a variable v that may appear more than once in a row on a line. I want to collapse every run of consecutive v's into a single v. For instance:

String s = "Hello, world!"; String v = "l"; 

The regular expression should turn "Hello, world!" into "Helo, world!"

So I want to do something like

 s = s.replaceAll(vv+, v) 

But obviously this will not work. Thoughts?

+4
5 answers

You need to build the pattern by concatenating the string v with itself.

Try s = s.replaceAll(v + v + "+", v)
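
A quick check of this suggestion on the question's own input might look like the sketch below. The class and method names are hypothetical, and it assumes v is a single character with no special meaning in a regular expression.

```java
// Hypothetical demo class, not part of the original answer.
// Assumes v is a single character that is not a regex metacharacter.
class ConcatDemo {
    static String collapse(String s, String v) {
        // For v = "l" the pattern is "ll+": one l followed by one or more l's.
        return s.replaceAll(v + v + "+", v);
    }

    public static void main(String[] args) {
        System.out.println(collapse("Hello, world!", "l")); // Helo, world!
    }
}
```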

+4

Let's develop a solution iteratively; at each step we point out the remaining problems and fix them until we arrive at a final answer.

We can start with something like this:

    String s = "What???? Impo$$ible!!!";
    String v = "!";
    s = s.replaceAll(v + "{2,}", v);
    System.out.println(s); // "What???? Impo$$ible!"

{2,} is the regular expression syntax for finite repetition; in this case it means "at least 2 of".
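
As a small illustration of the quantifier forms, here is a sketch with a hypothetical demo class. It uses String.matches, which tests the whole string against the pattern.

```java
// Hypothetical demo class, not part of the original answer.
class QuantifierDemo {
    // True when s consists of at least two copies of the character 'a'.
    static boolean atLeastTwo(String s) {
        return s.matches("a{2,}");
    }

    // True when s consists of exactly two or three copies of 'a'.
    static boolean twoOrThree(String s) {
        return s.matches("a{2,3}");
    }

    public static void main(String[] args) {
        System.out.println(atLeastTwo("a"));     // false
        System.out.println(atLeastTwo("aaaaa")); // true
        System.out.println(twoOrThree("aaaaa")); // false
    }
}
```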

That first attempt works only because ! is not a regular expression metacharacter. Let's see what happens if we try the following:

    String v = "?";
    s = s.replaceAll(v + "{2,}", v);
    // Exception in thread "main" java.util.regex.PatternSyntaxException:
    // Dangling meta character '?'

One way to fix the problem is to use Pattern.quote so that v is taken literally:

    s = s.replaceAll(Pattern.quote(v) + "{2,}", v);
    System.out.println(s); // "What? Impo$$ible!!!"

It turns out that this is not the only thing we need to worry about: in replacement strings, \ and $ are also special metacharacters. This explains why we get the following problem:

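    String v = "$";
    s = s.replaceAll(Pattern.quote(v) + "{2,}", v);
    // Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
    // String index out of range: 1
    // (the exact exception depends on the JDK version; newer releases
    // report an IllegalArgumentException for the dangling $ instead)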

Since we want v to be taken literally as the replacement string, we use Matcher.quoteReplacement as follows:

    s = s.replaceAll(Pattern.quote(v) + "{2,}", Matcher.quoteReplacement(v));
    System.out.println(s); // "What???? Impo$ible!!!"

Finally, repetition has higher precedence than concatenation, so a quantifier applies only to the single element immediately before it. This means the following:

    System.out.println( "hahaha".matches("ha{3}") );   // false
    System.out.println( "haaa".matches("ha{3}") );     // true
    System.out.println( "hahaha".matches("(ha){3}") ); // true

So, if v can contain multiple characters, you should group them before applying the repetition. In this case you can use a non-capturing group, since you do not need a backreference.

    String s = "well, well, well, look who here...";
    String v = "well, ";
    s = s.replaceAll("(?:" + Pattern.quote(v) + "){2,}", Matcher.quoteReplacement(v));
    System.out.println(s); // "well, look who here..."

Summary

  • To match an arbitrary literal string that may contain regular expression metacharacters, use Pattern.quote
  • To replace an arbitrary literal string that may contain replacement metacharacters, use Matcher.quoteReplacement
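
Putting both points together, a reusable helper could look like the following sketch. The class name Runs and method name collapse are hypothetical, and the empty-string guard is an added assumption about the desired behaviour rather than something the answer specifies.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative, not from the original answer.
final class Runs {
    // Collapses every run of two or more consecutive copies of v into one v.
    static String collapse(String s, String v) {
        if (v.isEmpty()) {
            return s; // assumption: an empty v means there is nothing to collapse
        }
        // Quote v for the pattern and group it so {2,} applies to all of v.
        String regex = "(?:" + Pattern.quote(v) + "){2,}";
        // Quote v again for the replacement, where \ and $ are special.
        return s.replaceAll(regex, Matcher.quoteReplacement(v));
    }

    public static void main(String[] args) {
        System.out.println(collapse("What???? Impo$$ible!!!", "?")); // What? Impo$$ible!!!
        System.out.println(collapse("What???? Impo$$ible!!!", "$")); // What???? Impo$ible!!!
    }
}
```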

Bonus Material

The following example combines reluctant repetition, a capturing group, and a backreference with case-insensitive matching:

    System.out.println(
        "omgomgOMGOMG???? Yes we can! YES WE CAN! GOAAALLLL!!!!"
            .replaceAll("(?i)(.+?)\\1+", "$1")
    ); // "omg? Yes we can! GOAL!"
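
To see why the repetition is reluctant, the sketch below contrasts (.+?) with the greedy (.+) on a simpler input. The class and method names are hypothetical. The reluctant form finds the shortest repeating unit, while the greedy form settles for the longest unit that still repeats.

```java
// Hypothetical demo class, not part of the original answer.
class BackrefDemo {
    // Reluctant group: captures the shortest unit that is repeated.
    static String reluctant(String s) {
        return s.replaceAll("(.+?)\\1+", "$1");
    }

    // Greedy group: captures the longest unit that is repeated at least once.
    static String greedy(String s) {
        return s.replaceAll("(.+)\\1+", "$1");
    }

    // Same as reluctant, but the backreference also matches ignoring case.
    static String reluctantIgnoreCase(String s) {
        return s.replaceAll("(?i)(.+?)\\1+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(reluctant("abababab"));         // ab
        System.out.println(greedy("abababab"));            // abab
        System.out.println(reluctantIgnoreCase("omgOMG")); // omg
    }
}
```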

+17

Use x{2,} to match x at least two times.

To handle characters that have a special meaning in regular expressions, you should use Pattern.quote:

    String part = Pattern.quote(v);
    s = s.replaceAll(part + "{2,}", v);

To replace things longer than one character, use non-capturing groups:

    String part = "(?:" + Pattern.quote(v) + ")";
    s = s.replaceAll(part + "{2,}", v);
+5

With regex in Java, be sure to use Pattern.quote and Matcher.quoteReplacement:

    package com.example.test;

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Regex2 {
        public static void main(String[] args) {
            String s = "Hello, world!";
            String v = "l";
            System.out.println(doit(s, v));

            s = "Test: ??r??r Solo ??r Frankenstein!";
            v = "??r";
            System.out.println(doit(s, v));
        }

        private static String doit(String s, String v) {
            // Match two or more consecutive literal copies of v.
            Pattern p = Pattern.compile("(?:" + Pattern.quote(v) + "){2,}");
            Matcher m = p.matcher(s);
            StringBuffer sb = new StringBuffer();
            while (m.find()) {
                // Replace each run with a single literal v.
                m.appendReplacement(sb, Matcher.quoteReplacement(v));
            }
            m.appendTail(sb);
            return sb.toString();
        }
    }
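
When the same v is applied to many strings, the find/appendReplacement loop can be shortened to Matcher.replaceAll on a pattern compiled once. The following is a sketch only; the RunCollapser class and its apply method are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative, not from the original answer.
final class RunCollapser {
    private final Pattern runs;       // matches two or more consecutive copies of v
    private final String replacement; // v, escaped for use as a replacement string

    RunCollapser(String v) {
        this.runs = Pattern.compile("(?:" + Pattern.quote(v) + "){2,}");
        this.replacement = Matcher.quoteReplacement(v);
    }

    // Reuses the compiled pattern instead of recompiling it for every call.
    String apply(String s) {
        return runs.matcher(s).replaceAll(replacement);
    }

    public static void main(String[] args) {
        RunCollapser collapser = new RunCollapser("??r");
        System.out.println(collapser.apply("Test: ??r??r Solo ??r Frankenstein!"));
        // Test: ??r Solo ??r Frankenstein!
    }
}
```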
+3
    s = s.replaceAll(v + "+", v);
+2

Source: https://habr.com/ru/post/1314834/