Replace multiple substrings at once

Say I have a file containing some text. It has substrings such as "substr1", "substr2", "substr3", etc. I need to replace all these substrings with another text, for example, "repl1", "repl2", "repl3". In Python, I would create a dictionary like this:

{ "substr1": "repl1", "substr2": "repl2", "substr3": "repl3" } 

and build a pattern by joining the keys with '|', then do the replacement with re.sub. Is there an easy way to do this in Java?

+8
java regex replace
5 answers

Here's how your Python suggestion translates to Java:

    Map<String, String> replacements = new HashMap<String, String>() {{
        put("substr1", "repl1");
        put("substr2", "repl2");
        put("substr3", "repl3");
    }};

    String input = "lorem substr1 ipsum substr2 dolor substr3 amet";

    // create the pattern joining the keys with '|'
    String regexp = "substr1|substr2|substr3";

    StringBuffer sb = new StringBuffer();
    Pattern p = Pattern.compile(regexp);
    Matcher m = p.matcher(input);

    while (m.find())
        m.appendReplacement(sb, replacements.get(m.group()));
    m.appendTail(sb);

    System.out.println(sb.toString());   // lorem repl1 ipsum repl2 dolor repl3 amet

This approach performs a simultaneous (i.e. "at once") replacement. That is, if you happen to have

    "a" -> "b"
    "b" -> "c"

then this approach turns "ab" into "bc", whereas the answers suggesting a chain of several replace or replaceAll calls would give "cc".
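
The difference is easy to check with a small self-contained sketch. The class name ChainVsSimultaneous and its two helper methods are made up for this illustration, and the pattern is hard-coded in the same way as in the snippet above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: contrasts chained replace calls with a single regex pass.
final class ChainVsSimultaneous {

    // Chained calls: the second replace also rewrites the output of the first.
    static String chained(String input) {
        return input.replace("a", "b").replace("b", "c");
    }

    // Single pass: every match is looked up once and its replacement is never rescanned.
    static String simultaneous(String input) {
        Map<String, String> replacements = new HashMap<>();
        replacements.put("a", "b");
        replacements.put("b", "c");

        Matcher m = Pattern.compile("a|b").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(replacements.get(m.group())));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(chained("ab"));      // cc
        System.out.println(simultaneous("ab")); // bc
    }
}
```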


(If you generalize this approach to create the regular expression programmatically, make sure you Pattern.quote each individual search word and Matcher.quoteReplacement each replacement word.)
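
A minimal sketch of that generalization might look like the following. The class QuotedReplacer and its replaceAll method are hypothetical names, not part of any library; the search words and replacements in main were picked only because they contain characters that are special in regular expressions or in replacement strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: builds the alternation from the map keys instead of hard-coding it.
final class QuotedReplacer {

    static String replaceAll(String input, Map<String, String> replacements) {
        if (replacements.isEmpty()) {
            return input; // an empty alternation would match the empty string everywhere
        }

        // Quote every key so that characters like '.' or '(' are matched literally.
        StringJoiner alternation = new StringJoiner("|");
        for (String key : replacements.keySet()) {
            alternation.add(Pattern.quote(key));
        }

        Matcher m = Pattern.compile(alternation.toString()).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Quote the replacement so that '$' and '\' are inserted literally.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacements.get(m.group())));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("a.b", "$1");
        replacements.put("(x)", "\\y");
        System.out.println(replaceAll("a.b axb (x)", replacements)); // $1 axb \y
    }
}
```

Without Pattern.quote the key "a.b" would also match "axb", and without Matcher.quoteReplacement the replacement "$1" would be read as a group reference and throw, since the pattern has no group 1.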

+14

There is StringUtils.replaceEach in the Apache Commons Lang project, but it works on Strings.

+6

    yourString.replace("substr1", "repl1")
              .replace("substr2", "repl2")
              .replace("substr3", "repl3");
+2

First, a demonstration of the problem:

    String s = "I have three cats and two dogs.";
    s = s.replace("cats", "dogs")
         .replace("dogs", "budgies");
    System.out.println(s);

This is intended to replace cats => dogs and dogs => budgies, but each sequential replacement operates on the result of the previous one, so the unfortunate output is:

I have three budgies and two budgies.

Here is my implementation of a simultaneous replacement method. It is easy to write using String.regionMatches:

    public static String simultaneousReplace(String subject, String... pairs) {
        if (pairs.length % 2 != 0) throw new IllegalArgumentException(
                "Strings to find and replace are not paired.");
        StringBuilder sb = new StringBuilder();
        int numPairs = pairs.length / 2;
        outer:
        for (int i = 0; i < subject.length(); i++) {
            for (int j = 0; j < numPairs; j++) {
                String find = pairs[j * 2];
                if (subject.regionMatches(i, find, 0, find.length())) {
                    sb.append(pairs[j * 2 + 1]);
                    i += find.length() - 1;
                    continue outer;
                }
            }
            sb.append(subject.charAt(i));
        }
        return sb.toString();
    }

Testing:

    String s = "I have three cats and two dogs.";
    s = simultaneousReplace(s,
            "cats", "dogs",
            "dogs", "budgies");
    System.out.println(s);

Output:

I have three dogs and two budgies.

Additionally, when doing a simultaneous replacement it is sometimes useful to make sure you pick the longest match. (PHP's strtr function does this, for example.) Here is my implementation for that:

    public static String simultaneousReplaceLongest(String subject, String... pairs) {
        if (pairs.length % 2 != 0) throw new IllegalArgumentException(
                "Strings to find and replace are not paired.");
        StringBuilder sb = new StringBuilder();
        int numPairs = pairs.length / 2;
        for (int i = 0; i < subject.length(); i++) {
            int longestMatchIndex = -1;
            int longestMatchLength = -1;
            for (int j = 0; j < numPairs; j++) {
                String find = pairs[j * 2];
                if (subject.regionMatches(i, find, 0, find.length())) {
                    if (find.length() > longestMatchLength) {
                        longestMatchIndex = j;
                        longestMatchLength = find.length();
                    }
                }
            }
            if (longestMatchIndex >= 0) {
                sb.append(pairs[longestMatchIndex * 2 + 1]);
                i += longestMatchLength - 1;
            } else {
                sb.append(subject.charAt(i));
            }
        }
        return sb.toString();
    }

Why do you need this? Example:

    String truth = "Java is to JavaScript";
    truth += " as " + simultaneousReplaceLongest(truth,
            "Java", "Ham",
            "JavaScript", "Hamster");
    System.out.println(truth);

Output:

Java is to JavaScript as Ham is to Hamster

If we had used simultaneousReplace instead of simultaneousReplaceLongest, the output would have had "HamScript" instead of "Hamster" :)
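
The same longest-match behaviour can probably be had from the regex-based approach in the first answer by ordering the alternation so that longer keys come first, since Java tries the alternatives of a pattern from left to right. The sketch below assumes that ordering is all that is needed; LongestFirstRegexReplacer is a made-up name, not an existing API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: regex-based simultaneous replacement that prefers the longest key.
final class LongestFirstRegexReplacer {

    static String replace(String subject, Map<String, String> replacements) {
        if (replacements.isEmpty()) {
            return subject;
        }

        // Sort keys by descending length so "JavaScript" is tried before "Java".
        List<String> keys = new ArrayList<>(replacements.keySet());
        keys.sort((x, y) -> Integer.compare(y.length(), x.length()));

        StringBuilder regex = new StringBuilder();
        for (String key : keys) {
            if (regex.length() > 0) {
                regex.append('|');
            }
            regex.append(Pattern.quote(key));
        }

        Matcher m = Pattern.compile(regex.toString()).matcher(subject);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(replacements.get(m.group())));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new HashMap<>();
        replacements.put("Java", "Ham");
        replacements.put("JavaScript", "Hamster");
        System.out.println(replace("Java is to JavaScript", replacements)); // Ham is to Hamster
    }
}
```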

Note that the methods above are case-sensitive. Case-insensitive versions are easy to write, because String.regionMatches has an overload that takes an ignoreCase parameter.
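
As a rough sketch of such a change, here is a case-insensitive variant of simultaneousReplace. The class and method names are invented for this example, and it additionally skips empty search strings, which the methods above do not guard against.

```java
// Hypothetical case-insensitive variant of simultaneousReplace.
final class CaseInsensitiveReplacer {

    static String simultaneousReplaceIgnoreCase(String subject, String... pairs) {
        if (pairs.length % 2 != 0) {
            throw new IllegalArgumentException("Strings to find and replace are not paired.");
        }
        StringBuilder sb = new StringBuilder();
        int i = 0;
        outer:
        while (i < subject.length()) {
            for (int j = 0; j < pairs.length; j += 2) {
                String find = pairs[j];
                // The first argument 'true' selects the ignoreCase overload of regionMatches.
                if (!find.isEmpty() && subject.regionMatches(true, i, find, 0, find.length())) {
                    sb.append(pairs[j + 1]);
                    i += find.length(); // skip past the matched text
                    continue outer;
                }
            }
            sb.append(subject.charAt(i));
            i++;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = simultaneousReplaceIgnoreCase("I have three CATS and two Dogs.",
                "cats", "dogs",
                "dogs", "budgies");
        System.out.println(s); // I have three dogs and two budgies.
    }
}
```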

+1

    return yourString.replaceAll("substr1", "repl1")
                     .replaceAll("substr2", "repl2")
                     .replaceAll("substr3", "repl3");
-1