How to remove control characters from a CharSequence?

I have a CharSequence source, an int start, and an int end.

I would like to remove all "control characters" from source between start and end and return the result as a new CharSequence.

By "control character" I mean unwanted characters such as tab, carriage return, line feed, etc.: basically everything below 32 (space) in ASCII. But I don't know how to do this in the "modern age".

What is a char? Is it Unicode? How do I remove these "control characters"?

4 answers

Assuming you can get the whole source in memory, you can do this:

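String tmp = source.toString();
// substring(begin, end) excludes the end index, so no -1/+1 adjustments are needed
String prefix = tmp.substring(0, start);
String suffix = tmp.substring(end);
// \p{Cntrl} matches ASCII control characters; \s would also strip ordinary spaces
String middle = tmp.substring(start, end).replaceAll("\\p{Cntrl}", "");
CharSequence res = prefix + middle + suffix;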

Use CharSequence.subSequence(int, int) and String.replaceAll(String, String):

source.subSequence(0, start).toString() + source.subSequence(start, end).toString().replaceAll("\\p{Cntrl}", "") + source.subSequence(end, source.length()).toString()
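
The one-liner above can be wrapped in a small helper. The following is only a sketch: the class name RegexControlStripper and the method name strip are hypothetical, end is assumed to be exclusive (as in subSequence), and text outside the range is assumed to be kept unchanged. Note that by default \p{Cntrl} in java.util.regex covers only the ASCII control characters (0x00-0x1F and 0x7F).

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class RegexControlStripper {
    // Compiled once so repeated calls do not recompile the regex.
    private static final Pattern CONTROL = Pattern.compile("\\p{Cntrl}");

    private RegexControlStripper() {}

    // Removes ASCII control characters from [start, end) and keeps
    // everything outside that range unchanged (an assumption).
    static CharSequence strip(CharSequence source, int start, int end) {
        String middle = CONTROL.matcher(source.subSequence(start, end)).replaceAll("");
        return new StringBuilder(source.length())
                .append(source, 0, start)
                .append(middle)
                .append(source, end, source.length());
    }
}
```

For example, strip("x\r\ny\tz", 1, 4) removes the carriage return and line feed but leaves the tab at index 4, because it lies outside the range.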

Use Character.isISOControl(char), or a CharMatcher if you are using a recent version of the Guava library.
Yes, a char is Unicode: it holds one UTF-16 code unit.
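
A loop over the sequence with Character.isISOControl(char) might look like the sketch below. The class name ControlCharFilter and the method name stripControls are hypothetical; end is assumed to be exclusive, and characters outside the range are assumed to be copied through untouched. Character.isISOControl treats both 0x00-0x1F and 0x7F-0x9F as control characters, so it is slightly broader than "ASCII below 32".

```java
// Hypothetical helper, not part of any library.
final class ControlCharFilter {
    private ControlCharFilter() {}

    // Returns a copy of source in which ISO control characters inside
    // [start, end) are dropped; characters outside the range are kept.
    static CharSequence stripControls(CharSequence source, int start, int end) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            boolean inRange = i >= start && i < end;
            if (inRange && Character.isISOControl(c)) {
                continue; // skip control characters within the range
            }
            out.append(c);
        }
        return out; // StringBuilder implements CharSequence
    }
}
```

This avoids the intermediate strings and the regex engine, which may matter if the method is called often, for instance on every keystroke.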


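Using Guava's CharMatcher (in newer Guava releases the JAVA_ISO_CONTROL constant has been replaced by the CharMatcher.javaIsoControl() method):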

return CharMatcher.JAVA_ISO_CONTROL.removeFrom(string);