R regex to remove all punctuation except apostrophes

I am trying to remove all punctuation from a string except apostrophes. Here is my example:

str2 <- "this doesn't not have an apostrophe, .!@ #$%^&*()"
gsub("[[:punct:,^\\']]"," ", str2 )
# [1] "this doesn't not have an apostrophe, .!@ #$%^&*()"

What am I doing wrong?

3 answers

A negative lookahead assertion can be used to exclude apostrophes from consideration before they are even tested as punctuation characters.

 gsub("(?!')[[:punct:]]", "", str2, perl=TRUE)
 # [1] "this doesn't not have an apostrophe  "
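
A minimal sketch of how the lookahead behaves, assuming the same `str2` as in the question. The names `no_punct` and `cleaned` are illustrative, and the `trimws()` step is an addition that the answer above does not include. `perl = TRUE` is needed because R's default regex engine does not support lookahead.

```r
str2 <- "this doesn't not have an apostrophe, .!@ #$%^&*()"

# (?!') is a zero-width negative lookahead: the match is abandoned at any
# position where the next character is an apostrophe, so [[:punct:]] only
# ever consumes the other punctuation characters.
no_punct <- gsub("(?!')[[:punct:]]", "", str2, perl = TRUE)

# The spaces that sat between the removed characters are still there,
# so strip leading and trailing whitespace as a separate step.
cleaned <- trimws(no_punct)
```

Keeping the whitespace cleanup separate from the punctuation pattern leaves each step easy to change on its own.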

I am not sure that you can specify "all punctuation except '" inside a single bracket expression the way you did. I would instead negate a class that lists the characters to keep (letters, ' and space):

 gsub("[^'[:lower:] ]", "", str2) # per Joshua's comment
 # [1] "this doesn't not have an apostrophe  "
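
A short sketch of what the negated class keeps and drops, using a hypothetical string `str3` that contains uppercase letters and digits. Swapping `[:lower:]` for `[:alnum:]` is an assumption about what may be wanted for mixed-case input, not something the answer states.

```r
# Hypothetical input with uppercase letters and a digit
str3 <- "It's 5 o'clock, OK?!"

# Keeps only lowercase letters, apostrophes and spaces;
# uppercase letters and digits are removed along with the punctuation.
lower_only <- gsub("[^'[:lower:] ]", "", str3)

# Keeps all letters and digits, apostrophes and spaces.
alnum_kept <- gsub("[^'[:alnum:] ]", "", str3)
```

With a negated class, anything not listed is deleted, so the class has to name every character that should survive.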

You can use:

 str2 <- "this doesn't not have an apostrophe, .!@ #$%^&*()"
 library(qdap)
 strip(str2, apostrophe.remove = FALSE, lower.case = FALSE)