I am trying to split a string containing multiple sentences into an array of strings, one per sentence.
Here is what I have so far:
    String input = "Hello World. "
            + "Today in the USA, it is a nice day! "
            + "Hurrah! "
            + "Here it comes... "
            + "Party time!";
    String[] array = input.split("(?<=[.?!])\\s+(?=[\\D\\d])");
This code works fine. I get:
    Hello World.
    Today in the USA, it is a nice day!
    Hurrah!
    Here it comes...
    Party time!
I use a lookbehind to check whether the whitespace is preceded by sentence-ending punctuation (., ? or !). If it is, the string is split at that run of one or more whitespace characters.
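To show what the lookbehind contributes, here is a small comparison sketch. The class and method names are made up for illustration, and I left out the trailing lookahead to isolate the effect. Because a lookbehind is zero-width, the punctuation is not consumed as part of the delimiter and stays attached to the sentence.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are for illustration only.
class LookbehindDemo {
    // Splits on whitespace that follows '.', '?' or '!'. The lookbehind is
    // zero-width, so the punctuation stays at the end of each sentence.
    static String[] splitKeepingPunctuation(String text) {
        return text.split("(?<=[.?!])\\s+");
    }

    // Same idea without the lookbehind: the punctuation becomes part of
    // the delimiter and is removed from the result.
    static String[] splitDroppingPunctuation(String text) {
        return text.split("[.?!]\\s+");
    }

    public static void main(String[] args) {
        String text = "Hello World. Party time!";
        System.out.println(Arrays.toString(splitKeepingPunctuation(text)));
        System.out.println(Arrays.toString(splitDroppingPunctuation(text)));
    }
}
```

With the lookbehind, the first element is "Hello World." with its period; without it, the period is lost.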
But there are some exceptions that this regular expression does not cover. For example, "The U.S. is a great country" is incorrectly split into "The U.S." and "is a great country".
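A minimal reproduction of the problem, assuming the abbreviation is written with periods ("U.S."). The class and method names are hypothetical; the pattern is the one from my code above.

```java
// Hypothetical reproduction class; only the regex comes from my code.
class AbbreviationRepro {
    // Whitespace preceded by '.', '?' or '!' and followed by any character.
    static final String SENTENCE_BOUNDARY = "(?<=[.?!])\\s+(?=[\\D\\d])";

    static String[] splitSentences(String text) {
        return text.split(SENTENCE_BOUNDARY);
    }

    public static void main(String[] args) {
        // The period that ends the abbreviation is treated as a sentence end.
        for (String sentence : splitSentences("The U.S. is a great country.")) {
            System.out.println("[" + sentence + "]");
        }
    }
}
```

This prints two pieces, [The U.S.] and [is a great country.], because the pattern cannot tell an abbreviation's final period from the end of a sentence.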
Any idea on how I can fix this?
Also, are there any other edge cases I am missing?