How to find characters common to two strings in Java using single replaceAll?

So, suppose I have:

String s = "1479K"; String t = "459LP"; 

and I want to get back

 String commonChars = "49"; 

that is, the characters common to the two strings.

Obviously, this can be done with a standard loop, for example:

    String commonChars = "";
    for (int i = 0; i < s.length(); i++) {
        char ch = s.charAt(i);
        if (t.indexOf(ch) != -1) {
            commonChars = commonChars + ch;
        }
    }

However, I would like to be able to do this on a single line using replaceAll. With two nested calls, this can be done as follows:

 String commonChars = s.replaceAll("["+s.replaceAll("["+t+"]","")+"]",""); 

My question is: is it possible to do this with a single call to replaceAll? And what would the regular expression be? I suppose I need to use some kind of lookahead, but my brain turns to mush when I even think about it.

+6
java optimization string regex
4 answers
 String commonChars = s.replaceAll("[^"+t+"]",""); 

Note that you may need to escape special characters in t, e.g. by using Pattern.quote(t) instead of t above.

+4

The accepted answer:

 String commonChars = s.replaceAll("[^"+t+"]",""); 

has a bug!

What if the string t contains a regular expression metacharacter? In that case, replaceAll does not behave as intended.

For example, take a string t that has ] in it. ] is a regular expression metacharacter that marks the end of a character class, so the program does not produce the expected result.

Why?

Consider:

 String s = "1479K"; String t = "459LP]"; 

Now the regex will become (just replace t ):

 String commonChars = s.replaceAll("[^459LP]]",""); 

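This says: replace any character other than 4, 5, 9, L or P that is immediately followed by a ] (together with that ]) with nothing. This is clearly not what you want: since s contains no ], nothing matches and s comes back unchanged.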
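To make the failure concrete, here is a minimal sketch of the unescaped version. The class and method names (UnescapedDemo, commonUnescaped) are made up for illustration and are not taken from the program the answer refers to.

```java
// Illustrative only: builds the negated character class from t with no escaping.
final class UnescapedDemo {

    // Hypothetical helper: the accepted answer's one-liner wrapped in a method.
    static String commonUnescaped(String s, String t) {
        return s.replaceAll("[^" + t + "]", "");
    }

    public static void main(String[] args) {
        // t has no metacharacters, so the pattern is [^459LP]
        System.out.println(commonUnescaped("1479K", "459LP"));   // prints 49

        // t ends with ], so the pattern becomes [^459LP]] and never matches in s
        System.out.println(commonUnescaped("1479K", "459LP]"));  // prints 1479K
    }
}
```

The second call returns s untouched because the pattern requires a literal ] after the non-matching character, and s has none.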

To fix this, you need to escape the ] in t. You can do it manually:

 String t = "459LP\\]"; 

and the regex works fine.

This is a common problem when using regexes, so the java.util.regex.Pattern class provides a static method called quote for exactly this purpose: it quotes regular expression metacharacters so that they are treated literally.

So before using t in replaceAll, you quote it:

 t = Pattern.quote(t); 

A program that uses the quote method works as expected.
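A minimal sketch of such a program might look like the following. The class and method names (QuotedDemo, commonChars) are assumptions for illustration, and the early return for an empty t is an addition of this sketch rather than part of the answer.

```java
import java.util.regex.Pattern;

// Illustrative only: the single-replaceAll approach with t quoted first.
final class QuotedDemo {

    // Hypothetical helper: keeps only the characters of s that also occur in t.
    static String commonChars(String s, String t) {
        if (t.isEmpty()) {
            // Assumption of this sketch: nothing can be common with an empty
            // string, and an empty character class is not a usable pattern.
            return "";
        }
        // Pattern.quote wraps t in \Q...\E so its characters are taken literally.
        return s.replaceAll("[^" + Pattern.quote(t) + "]", "");
    }

    public static void main(String[] args) {
        System.out.println(commonChars("1479K", "459LP]"));  // prints 49
    }
}
```

Like the loop in the question, this keeps the characters in the order they appear in s, including repeats.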

+4

The accepted answer is incorrect. Since the first argument of replaceAll is a regex pattern, we have to take regex syntax into account. What happens if s1 = "\\t"? And what happens if s1 = "]{"?

If all characters are in the range [0 - 255], we can proceed as follows:

  • byte[] tmp = new byte[256];
  • mark each char of the first string

    for (char c : str1.toCharArray()) // or use charAt(i) here
        if (tmp[c] == 0) tmp[c] = 1;

  • loop over each char of the second string

    for (char c : str2.toCharArray())
        if (tmp[c] == 1) tmp[c] = 2;

  • traverse the tmp array and find the entries with a value of 2; each such index is a char we are looking for.
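Put together, the steps above can be sketched roughly as follows. The class and method names (TableDemo, commonChars) are invented for illustration, and the code assumes, as stated, that every char in both strings is in the range 0 to 255.

```java
// Illustrative only: lookup-table approach for chars in the range 0..255.
final class TableDemo {

    // Hypothetical helper; throws ArrayIndexOutOfBoundsException for chars above 255.
    static String commonChars(String str1, String str2) {
        byte[] tmp = new byte[256];

        // Step 1: mark every char that occurs in the first string.
        for (char c : str1.toCharArray()) {
            if (tmp[c] == 0) tmp[c] = 1;
        }

        // Step 2: promote marked chars that also occur in the second string.
        for (char c : str2.toCharArray()) {
            if (tmp[c] == 1) tmp[c] = 2;
        }

        // Step 3: every index holding 2 is a common char.
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < tmp.length; i++) {
            if (tmp[i] == 2) result.append((char) i);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(commonChars("1479K", "459LP]"));  // prints 49
    }
}
```

Unlike the replaceAll version, this returns each common char once, ordered by character code rather than by position in the first string.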

Another solution uses HashSet.retainAll(Collection<?> c);
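A rough sketch of the set-based idea is shown below. The names (RetainAllDemo, commonChars) are hypothetical, and using LinkedHashSet (a HashSet subclass) to keep the order of the first string is a choice made for this sketch, not something the answer specifies.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative only: set intersection via retainAll, no regex involved.
final class RetainAllDemo {

    // Hypothetical helper: distinct chars of s that also occur in t.
    static String commonChars(String s, String t) {
        // LinkedHashSet preserves the order in which chars first appear in s.
        Set<Character> common = new LinkedHashSet<>();
        for (char c : s.toCharArray()) {
            common.add(c);
        }

        Set<Character> other = new HashSet<>();
        for (char c : t.toCharArray()) {
            other.add(c);
        }

        // Keep only the elements that are also contained in the other set.
        common.retainAll(other);

        StringBuilder result = new StringBuilder();
        for (char c : common) {
            result.append(c);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(commonChars("1479K", "459LP]"));  // prints 49
    }
}
```

Because no pattern is built, metacharacters such as ] or \ in either string need no special handling; repeated chars are reported once.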

+2

    public class common {
        public static void main(String args[]) {
            String s = "FIRST";
            String s1 = "SECOND";
            String common = s.replaceAll("[^" + s1 + "]", "");
            System.out.println(common);
        }
    }

+1