I have a large text containing expressions such as "aaaahahahahaha that was a good joke". After processing, I want the "aaaahahahahaha" part to disappear, or at least to be reduced to a simple "ha".
At the moment I am using this:
gsub('(.+?)\\1', '', str)
This works when the repeated pattern is at the beginning of the sentence, but not when it appears somewhere else. So this works:
str <- "aaaahahahahaha that was a good joke"
gsub('(.+?)\\1', '', str)
But this does not:
str <- "that was aaaahahahahaha a good joke"
gsub('(.+?)\\1', '', str)
This question may be related to "find duplicate pattern in python", but I cannot find an equivalent in R.
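To make the goal concrete, here is a rough sketch of the kind of result I am after, not a confirmed solution. It assumes that a "laugh" is any whitespace-delimited token containing a unit of two or more characters repeated at least three times in a row, and it uses perl = TRUE so that the lazy quantifier and the backreference follow PCRE rules. The function name drop_repeated_tokens and the three-repeat threshold are my own assumptions.

```r
# Hypothetical helper: removes whole tokens that contain a unit of
# 2+ non-space characters repeated 3 or more times consecutively.
drop_repeated_tokens <- function(x) {
  # \S*          any leading characters of the token (e.g. "aaaa")
  # (\S{2,}?)    the shortest repeating unit of at least 2 characters (e.g. "ha")
  # \1{2,}       that unit at least two more times (3+ occurrences in total)
  # \S*\s?       the rest of the token plus one trailing whitespace character
  pattern <- "\\S*(\\S{2,}?)\\1{2,}\\S*\\s?"
  # trimws() drops a leftover space when the removed token was the last one
  trimws(gsub(pattern, "", x, perl = TRUE))
}

drop_repeated_tokens("aaaahahahahaha that was a good joke")
drop_repeated_tokens("that was aaaahahahahaha a good joke")
```

Both calls should give "that was a good joke". Requiring three occurrences is meant to keep ordinary words such as "banana" (where "an" occurs only twice in a row) from being removed, but I have not checked this against a large vocabulary.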