Is there a regular-expression way to replace one set of characters with another (like the shell tr command)?

The shell command tr replaces one set of characters with another. For example, echo hello | tr [a-z] [A-Z] translates hello to HELLO.

In Java, however, I have to replace each character separately, as shown below.

    "10 Dogs Are Racing"
        .replaceAll("0", "０")
        .replaceAll("1", "１")
        .replaceAll("2", "２")
        // ...
        .replaceAll("9", "９")
        .replaceAll("A", "Ａ")
        // ...
        ;

The apache-commons-lang library provides a convenient replaceChars method for such a replacement.

    // half-width to full-width
    System.out.println(
        org.apache.commons.lang.StringUtils.replaceChars(
            "10 Dogs Are Racing",
            "0123456789ABCDEFEGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz",
            "０１２３４５６７８９ＡＢＣＤＥＦＥＧＨＩＪＫＬＭＮＯＰＱＲＳＴＵＶＷＸＹＺａｂｃｄｅｆｇｈｉｊｋｌｍｎｏｐｑｒｓｔｕｖｗｘｙｚ"
        )
    );
    // Result:
    // １０ Ｄｏｇｓ Ａｒｅ Ｒａｃｉｎｇ

But as you can see, the searchChars / replaceChars arguments quickly become too long (and tedious to check: a duplicated character is easy to miss), even though each could be expressed as a simple character class: [0-9A-Za-z] / [０-９Ａ-Ｚａ-ｚ]. Is there a regular-expression way to achieve this?

2 answers

While there is no direct way to do this, creating your own utility function for use in conjunction with replaceChars is relatively simple. The version below accepts simple character classes written without [ or ]; it does not support class negation ( [^a-z] ).

For your use case, you could write:

    StringUtils.replaceChars(str, charRange("0-9A-Za-z"), charRange("０-９Ａ-Ｚａ-ｚ"))

The code:

    import java.util.regex.PatternSyntaxException;

    public static String charRange(String str) {
        StringBuilder ret = new StringBuilder();
        char ch;
        for (int index = 0; index < str.length(); index++) {
            ch = str.charAt(index);
            if (ch == '\\') {
                if (index + 1 >= str.length()) {
                    throw new PatternSyntaxException("Malformed escape sequence.", str, index);
                }
                // special case for escape character, consume next char:
                index++;
                ch = str.charAt(index);
            }
            if (index + 1 >= str.length() || str.charAt(index + 1) != '-') {
                // this was a single char, or the last char in the string
                ret.append(ch);
            } else {
                if (index + 2 >= str.length()) {
                    throw new PatternSyntaxException("Malformed character range.", str, index + 1);
                }
                // this char was the beginning of a range;
                // an int counter avoids char overflow if the range ends at '\uffff'
                char end = str.charAt(index + 2);
                for (int r = ch; r <= end; r++) {
                    ret.append((char) r);
                }
                index = index + 2;
            }
        }
        return ret.toString();
    }

It produces:

    0-9A-Za-z : 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
    ０-９Ａ-Ｚａ-ｚ : ０１２３４５６７８９ＡＢＣＤＥＦＧＨＩＪＫＬＭＮＯＰＱＲＳＴＵＶＷＸＹＺａｂｃｄｅｆｇｈｉｊｋｌｍｎｏｐｑｒｓｔｕｖｗｘｙｚ
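
If pulling in commons-lang is not an option, the same idea can be sketched with the JDK alone. The class and method names below (TrSketch, expand, tr) are made up for illustration and are not part of any library. This simplified expansion handles plain ranges only (no escapes, no negation), and the full-width characters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
public class TrSketch {

    // Expands simple ranges such as "0-9A-Z" into "0123456789AB...Z".
    // Assumption: no escapes and no negation; a '-' that is not between
    // two characters is taken literally.
    public static String expand(String spec) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < spec.length(); i++) {
            char start = spec.charAt(i);
            if (i + 2 < spec.length() && spec.charAt(i + 1) == '-') {
                char end = spec.charAt(i + 2);
                // int counter so a range ending at '\uffff' cannot overflow
                for (int c = start; c <= end; c++) {
                    out.append((char) c);
                }
                i += 2; // skip the '-' and the range end
            } else {
                out.append(start);
            }
        }
        return out.toString();
    }

    // tr-like translation: each char found in `from` is replaced by the
    // char at the same position in `to`; all other chars pass through.
    public static String tr(String input, String from, String to) {
        String src = expand(from);
        String dst = expand(to);
        if (src.length() != dst.length()) {
            throw new IllegalArgumentException("character sets differ in length");
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            int pos = src.indexOf(c);
            out.append(pos >= 0 ? dst.charAt(pos) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(tr("hello", "a-z", "A-Z"));
        // \uFF10-\uFF19 are full-width digits, \uFF21-\uFF3A full-width
        // upper case, \uFF41-\uFF5A full-width lower case
        System.out.println(tr("10 Dogs Are Racing", "0-9A-Za-z",
                "\uFF10-\uFF19\uFF21-\uFF3A\uFF41-\uFF5A"));
    }
}
```

The lookup uses indexOf on the expanded source set, which is fine for small sets like these; for large sets or very long inputs, a precomputed map or array would probably be a better fit.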

No.

(Some extra characters, so that SO will let me post my otherwise short answer.)
