What you are looking for is sentence tokenization, and it is not as straightforward as it seems, even in English (a sentence such as “I met Dr. Bennett, Mrs. Yohon’s ex-husband” contains full stops that do not end it).
R is definitely not the best choice for natural language processing. If you are proficient in Python, I suggest you take a look at nltk, which covers this and many other topics. You can also copy the code from this blog post, which performs sentence tokenization and word tokenization.
If you want to stick to R, I would suggest counting the end-of-sentence characters (., ?, !), since you are already able to count characters. One way to do this with a regex is:
text <- 'Hello world!! Here are two sentences for you...'
length(gregexpr('[[:alnum:] ][.!?]', text)[[1]])
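One caveat with the one-liner above: gregexpr() returns -1 when nothing matches, so length() reports 1 for a text with no sentence-ending punctuation at all. A minimal sketch that guards against this, wrapped in a helper whose name (count_sentence_ends) is my own and purely illustrative:

```r
# Count sentence-ending marks (. ! ?) that follow a letter, digit, or space.
# Runs such as "!!" or "..." are counted once, because only the first mark
# in the run is preceded by an alphanumeric character or a space.
count_sentence_ends <- function(text) {
  matches <- gregexpr("[[:alnum:] ][.!?]", text)[[1]]
  # gregexpr() signals "no match" with -1, so length() alone would give 1.
  if (matches[1] == -1) 0L else length(matches)
}

count_sentence_ends("Hello world!! Here are two sentences for you...")  # 2
count_sentence_ends("no punctuation here")                              # 0
# Abbreviations are still overcounted: this single sentence yields 3.
count_sentence_ends("I met Dr. Bennett, Mrs. Yohon's ex-husband.")      # 3
```

The last call shows the limitation mentioned at the top: a plain regex cannot tell the full stop in "Dr." from one that ends a sentence, so treat the result as an estimate rather than a true sentence count.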