I asked about punctuation and regular expressions before, but my question was not clear.
Suppose I have this text:
String text = "wor.d1, :word2. wo,rd3? word4!";
I'm splitting it like this:
String parts[] = text.split(" ");
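And I get this:
wor.d1, | :word2. | wo,rd3? | word4!
What do I need to do to get the following instead? I want to split off the punctuation at the borders of each word and leave punctuation inside a word alone, and I want this only for the characters I specify (.,!?:), not for all punctuation.
wor.d1 | , | : | word2 | . | wo,rd3 | ? | word4 | !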
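For reference, here is a minimal sketch of one way to produce that output without a lookaround regex. It is only an illustration: the class name `BorderPunctuation`, the method `tokenize`, and the constant `PUNCTUATION` are made up for this example. It splits on whitespace first, then peels the listed characters off each end of every token one at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from any library: detaches border punctuation from words.
class BorderPunctuation {
    // The characters named in the question; extend this set as needed.
    private static final String PUNCTUATION = ".,!?:";

    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only when the input is blank
            }
            int start = 0;
            int end = token.length();
            // Emit each leading punctuation character as its own part.
            while (start < end && PUNCTUATION.indexOf(token.charAt(start)) >= 0) {
                out.add(String.valueOf(token.charAt(start)));
                start++;
            }
            // Find where the trailing punctuation run begins.
            int tail = end;
            while (tail > start && PUNCTUATION.indexOf(token.charAt(tail - 1)) >= 0) {
                tail--;
            }
            // The core of the word, with any inner punctuation left untouched.
            if (tail > start) {
                out.add(token.substring(start, tail));
            }
            // Emit each trailing punctuation character as its own part.
            for (int i = tail; i < end; i++) {
                out.add(String.valueOf(token.charAt(i)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "wor.d1, :word2. wo,rd3? word4!";
        // prints: wor.d1 | , | : | word2 | . | wo,rd3 | ? | word4 | !
        System.out.println(String.join(" | ", tokenize(text)));
    }
}
```

Because the parts are built by hand, this never yields an empty string, and a token made only of punctuation (such as "?!") simply comes out as separate characters.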
UPDATE
I get good results with the regex below, but when a word starts with punctuation, the split produces an empty string before it.
Is there a way to avoid this empty string at the beginning?
Is this regular expression good, or is there an easier way?
public static final String PUNCTUATION_SEPARATOR = "(" + "(" + "(?=^[\"'!?.,;:(){}\\[\\]]+)" + "|" + "(?<=^[\"'!?.,;:(){}\\[\\]]+)" + ")" + "|" + "(" + "(?=[\"'!?.,;:(){}\\[\\]]+($|\n))" + "|" + "(?<=[\"'!?.,;:(){}\\[\\]]+($|\n))" + ")" + ")";
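About the empty string at the start: as far as the `String.split` documentation goes, Java 8 and later do not produce a leading empty substring for a zero-width match at the beginning of the input, while older versions did. A version-independent workaround is to drop empty parts after splitting. The sketch below assumes that approach; the class `SplitWithoutEmpties`, the method `splitDroppingEmpties`, and the shortened test pattern are hypothetical and are not the full `PUNCTUATION_SEPARATOR` above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from any library: splits and discards empty parts.
class SplitWithoutEmpties {
    static List<String> splitDroppingEmpties(String text, String regex) {
        List<String> out = new ArrayList<>();
        for (String part : text.split(regex)) {
            // Skip the empty strings that split() can produce at the borders.
            if (!part.isEmpty()) {
                out.add(part);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened pattern for illustration: split before and after one
        // punctuation character at the very start of the input.
        String regex = "(?=^[.,!?:])|(?<=^[.,!?:])";
        System.out.println(splitDroppingEmpties(":word2", regex));
    }
}
```

Filtering costs one extra pass over the parts, but it keeps the regex itself unchanged and behaves the same regardless of the Java version.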
Renato dinhani