Remove whitespace characters from a String instance

Is there any other way to remove whitespace characters from a String?

1) Other than the ones I know:

myString.trim()

Pattern.compile("\\s");

2) Is there any reason to look for another method than the ones I am using?

+5
6 answers

Guava has a preconfigured CharMatcher for whitespace: CharMatcher.whitespace(). It also works with Unicode whitespace.

Example usage:

System.out.println(CharMatcher.whitespace().removeFrom("H \ne\tl\u200al \to   "));

Output:

Hello

CharMatcher also has many other nice features. One of my favorites is collapseFrom(), which replaces each run of matching characters with a single replacement character:

System.out.println(
    CharMatcher.whitespace().collapseFrom("H \ne\tl\u200al \to   ", '*'));

Output:

H*e*l*l*o*
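For readers without Guava on the classpath, a rough JDK-only approximation of removeFrom() and collapseFrom() can be sketched with a regex. The class and method names below are hypothetical, and (?U)\s follows the Unicode White_Space property (Java 7+), which may not cover exactly the same set of characters as Guava's matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Guava or the JDK.
class WhitespaceCollapse {
    // (?U) makes \s follow the Unicode White_Space property instead of ASCII only.
    private static final Pattern WS_RUN = Pattern.compile("(?U)\\s+");

    // Removes every whitespace character from the input.
    static String remove(String input) {
        return WS_RUN.matcher(input).replaceAll("");
    }

    // Replaces each run of whitespace with a single replacement character.
    static String collapse(String input, char replacement) {
        // quoteReplacement keeps '$' and '\' from being read as group references.
        String r = Matcher.quoteReplacement(String.valueOf(replacement));
        return WS_RUN.matcher(input).replaceAll(r);
    }

    public static void main(String[] args) {
        String s = "H \ne\tl\u200al \to   ";
        System.out.println(remove(s));
        System.out.println(collapse(s, '*'));
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every call, which String.replaceAll() would otherwise do.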

+13

Use myString.replaceAll("\\s", ""). Note:

  • Unicode: by default, \s matches only ASCII whitespace characters
+7

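trim() strips leading and trailing characters from ASCII 0 through ASCII 32. For characters beyond ASCII, you can list every code point that Java treats as whitespace: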

for(int i=Character.MIN_CODE_POINT;i<=Character.MAX_CODE_POINT;i++)
  if(Character.isWhitespace(i))
    System.out.println(i);

9 10 11 12 13 28 29 30 31 32 5760 6158 8192 8193 8194 8195 8196 8197 8198 8200 8201 8202 8232 8233 8287 12288
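Building on that list, a minimal sketch of removing every code point that Character.isWhitespace() accepts, without any regex, might look like the following. The class and method names are made up for illustration, and String.codePoints() assumes Java 8 or later.

```java
// Hypothetical helper for illustration only.
class StripWhitespace {
    // Returns the input with all Character.isWhitespace() code points removed.
    static String strip(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .filter(cp -> !Character.isWhitespace(cp))
             .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        // U+2003 (EM SPACE, 8195 in the list above) is removed;
        // U+00A0 (NBSP) is kept because isWhitespace() rejects it.
        System.out.println(strip("a \u2003b\tc\u00A0d"));
    }
}
```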

+3

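If you only want to remove leading and trailing whitespace, trim() is the usual choice. The regex counterpart would be something like: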

s = s.replaceAll("^\\s+|\\s+$", "");
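Since the question mentions Pattern.compile(), here is a small sketch of the same trimming regex wrapped in a precompiled Pattern. The class name RegexTrim is hypothetical. It also shows one way the regex differs from trim(): a control character such as U+0001 is left in place.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class RegexTrim {
    // Leading or trailing runs of \s (ASCII whitespace by default).
    private static final Pattern EDGES = Pattern.compile("^\\s+|\\s+$");

    static String trim(String s) {
        return EDGES.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println("[" + trim("  \t hello world \n") + "]");
        // Compare lengths: trim() drops U+0001, the regex does not.
        System.out.println("\u0001x".trim().length());
        System.out.println(trim("\u0001x").length());
    }
}
```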

Pre-Java 7, \s matches only ASCII whitespace, i.e.:

"[\\u0009\\u000A\\u000B\\u000C\\u000D\\u0020]"

... while trim() removes every character with a code point of 32 (U+0020 in Unicode) or below, which includes the control characters. Here is a comparison of trim() and the regex:

String s = "\u0000\u0001\u0002\u0003\u0004\u0005\u0006\u0007"
         + "\u0008\u0009\n\u000B\u000C\r\u000E\u000F"
         + "\u0010\u0011\u0012\u0013\u0014\u0015\u0016\u0017"
         + "\u0018\u0019\u001A\u001B\u001C\u001D\u001E\u001F"
         + "\u0020\u00A0";
System.out.println(s.length());
System.out.println(s.trim().length());
System.out.println(s.replaceAll("\\s", "").length());

Output:

34
1
28

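The one character trim() leaves behind is the no-break space (U+00A0, "NBSP"). It lies outside ASCII, so the default \s does not match NBSP either. To have the regex remove it as well, the last line can be changed to: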

System.out.println(s.replaceAll("(?U)\\s", "").length());

... which prints the following under Java 7:

34
1
27

(?U) is the embedded flag for UNICODE_CHARACTER_CLASS (cf. @tchrist). NBSP is a non-breaking space, so Character.isWhitespace() does not treat it as whitespace. Guava (cf. @Sean) has a BREAKING_WHITESPACE CharMatcher.
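The different treatment of NBSP can be checked directly. The sketch below (class and method names are hypothetical) compares Character.isWhitespace(), Character.isSpaceChar(), the default \s, and \s under (?U) for a single character.

```java
// Hypothetical helper for illustration only.
class NbspCheck {
    // Summarizes how each whitespace test classifies the given character.
    static String describe(char c) {
        String s = String.valueOf(c);
        return "isWhitespace=" + Character.isWhitespace(c)
             + " isSpaceChar=" + Character.isSpaceChar(c)
             + " \\s=" + s.matches("\\s")
             + " (?U)\\s=" + s.matches("(?U)\\s");
    }

    public static void main(String[] args) {
        System.out.println(describe('\u00A0')); // NBSP
        System.out.println(describe(' '));      // ordinary space
    }
}
```

For NBSP only isSpaceChar() and the (?U) variant report a match, while an ordinary space passes all four checks.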

... trim() ... StringTokenizer ...

+3

I was looking for a Java equivalent of C#'s XmlNode.OuterXml and XmlNode.InnerXml. Using a Transformer ... to postprocess the output, I ended up with one of these:

string = string.replaceAll("[\t\n\b\r\f]+ *", "");
string = string.replaceAll("\\s+ *", "");

Both remove line breaks and tabs along with any spaces that follow them. Note that the result has to be assigned back, because String is immutable. Hope this is at least somewhat relevant. The second option is the better choice.

+1

String.replace(" ", "");

(2) Possibly to tune performance; beyond that, I don't know of a reason.

0
