Do I always need to escape metacharacters in a string that is not a "literal"?

It seems that a string containing the characters { or } is rejected during regular expression processing. I understand that these are reserved characters and that I need to escape them, so if I do this:

 string.replaceAll("\\" + pattern, replacement);

This works when pattern is any string starting with { .

Question: Is there a way to handle strings that already contain such metacharacters so that they are escaped automatically? It seems to me that this should be analogous to escaping a double quote in a string literal versus accepting a string as input that already contains a double quote.

+7
java string regex
4 answers

Use Pattern.quote(String):

 public static String quote(String s) 

Returns a literal pattern String for the specified String.

This method produces a String that can be used to create a Pattern that would match the string s as if it were a literal pattern.

Metacharacters or escape sequences in the input sequence will be given no special meaning.

Parameters:
s - The string to be literalized
Returns:
A literal string replacement
Since:
1.5
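As a minimal sketch of how this applies to the question, the snippet below passes a quoted target to replaceAll. The class name QuoteDemo and the helper removeLiteral are made up for illustration; only Pattern.quote and String.replaceAll come from the JDK.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: removes every literal occurrence of target from input.
    static String removeLiteral(String input, String target) {
        // Pattern.quote makes {, }, +, . and friends lose their regex meaning.
        return input.replaceAll(Pattern.quote(target), "");
    }

    public static void main(String[] args) {
        // The braces are matched as plain characters.
        System.out.println(removeLiteral("id={42}; other={42}", "{42}")); // prints "id=; other="
    }
}
```

Note that only the first argument is quoted here; a replacement string containing $ or \ would still need Matcher.quoteReplacement.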

+8

You can use

 java.util.regex.Pattern.quote(java.lang.String) 

to escape the metacharacters used by regular expressions.

+4

TL;DR

  • If you need regex syntax, use replaceAll or replaceFirst.
  • If you want your target/replacement pair to be treated as literals, use replace (it also replaces all occurrences of your target).

Most people are confused by the unfortunate naming of the replacement methods in the String class, which are:

  • replaceAll(String, String)
  • replaceFirst(String, String)
  • replace(CharSequence, CharSequence)
  • replace(char, char)

Since the name replaceAll explicitly says that it replaces all matching targets, people assume that replace does not guarantee this behavior, because its name lacks the All suffix.
But this assumption is incorrect.

The main difference between these methods is shown in this table:

 ╔═════════════════════╦═══════════════════════════════════════════════════════════════════╗
 ║                     ║                         replaced targets                          ║
 ║                     ╠════════════════════════════════════╦══════════════════════════════╣
 ║                     ║             ALL found              ║       ONLY FIRST found       ║
 ╠══════╦══════════════╬════════════════════════════════════╬══════════════════════════════╣
 ║      ║  supported   ║ replaceAll(String, String)         ║ replaceFirst(String, String) ║
 ║regex ╠══════════════╬════════════════════════════════════╬══════════════════════════════╣
 ║syntax║     not      ║ replace(CharSequence, CharSequence)║              \/              ║
 ║      ║  supported   ║ replace(char, char)                ║              /\              ║
 ╚══════╩══════════════╩════════════════════════════════════╩══════════════════════════════╝

So if you do not need regex syntax, use a method that does not expect it and instead treats both target and replacement as literals.

So, instead of replaceAll(regex, replacement)

use replace(literal, replacement) .
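A short sketch of the difference, assuming a made-up class ReplaceDemo and helper swapLiteral; the String methods themselves are standard JDK calls.

```java
public class ReplaceDemo {
    // Hypothetical helper: literal replacement of every occurrence of target.
    static String swapLiteral(String input, String target, String replacement) {
        return input.replace(target, replacement);
    }

    public static void main(String[] args) {
        // replace: "." is a plain dot and "$1" is plain text; all occurrences change.
        System.out.println(swapLiteral("a.b.c", ".", "$1")); // prints "a$1b$1c"

        // replaceAll: "." is the regex "any character", so every character is replaced.
        System.out.println("a.b.c".replaceAll(".", "-")); // prints "-----"
    }
}
```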


As you can see, there are two overloaded versions of replace. Both of them should work for you, as neither supports regex syntax. The main difference between the two is as follows:

  • replace(char target, char replacement) simply creates a new string and fills it with either the character from the source string or the replacement character, depending on whether the source character equals the target character.

  • replace(CharSequence target, CharSequence replacement) is essentially equivalent to replaceAll(Pattern.quote(target.toString()), Matcher.quoteReplacement(replacement.toString())). In other words, it behaves like replaceAll (and, at least in older JDK implementations, it uses the regex engine internally), but it automatically escapes the regex metacharacters in target and replacement for you.
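The equivalence described in the second bullet can be checked with a small sketch. The class EquivalenceDemo and its two helpers are illustrative names, not part of the JDK; the comparison only covers the sample inputs shown.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EquivalenceDemo {
    // Literal replacement through String.replace.
    static String viaReplace(String input, String target, String replacement) {
        return input.replace(target, replacement);
    }

    // The same result through replaceAll, with both arguments quoted by hand.
    static String viaReplaceAll(String input, String target, String replacement) {
        return input.replaceAll(Pattern.quote(target),
                                Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String input = "cost: {x} + {x}";
        // "{x}" would be invalid as a regex and "$5" would be a group reference
        // in a replacement string; quoting turns both into plain text.
        System.out.println(viaReplace(input, "{x}", "$5"));    // prints "cost: $5 + $5"
        System.out.println(viaReplaceAll(input, "{x}", "$5")); // prints "cost: $5 + $5"
    }
}
```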

+3

You do not need any additional code, just the \Q and \E constructs, as described in the documentation of the Java Pattern class.

For example, in the following code:

 String foobar = "crazyPassword=f() ob@r {}+";
 Pattern regex = Pattern.compile("\\Q" + foobar + "\\E");

the pattern will compile, and the special characters in foobar will not be interpreted as regular expression metacharacters. See the demo here.

The only case this does not handle is input that itself contains a literal \E, because that sequence ends the quoted section early. If you need to solve this problem too, just let me know in the comments and I will edit to add this.
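One way to cover the \E case, sketched here under the assumption that Pattern.quote is acceptable (its documentation states that escape sequences in the input get no special meaning). The class QuoteEdgeCase and both helpers are hypothetical. The manual variant may either fail to compile or simply fail to match when the input contains \E, so the sketch treats both outcomes as "no match".

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class QuoteEdgeCase {
    // Manual quoting: wraps the literal in \Q ... \E and tests it against itself.
    static boolean matchesNaive(String literal) {
        try {
            return Pattern.compile("\\Q" + literal + "\\E").matcher(literal).matches();
        } catch (PatternSyntaxException e) {
            // An embedded \E can leave an invalid pattern behind.
            return false;
        }
    }

    // Pattern.quote takes care of an embedded \E on its own.
    static boolean matchesQuoted(String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(literal).matches();
    }

    public static void main(String[] args) {
        String tricky = "a\\Eb+"; // the five characters a \ E b +
        System.out.println(matchesNaive(tricky));  // prints "false"
        System.out.println(matchesQuoted(tricky)); // prints "true"
    }
}
```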

0