Regular expression to find repeated phrases

I can easily find repeated words using "(?i)\\b(\\w+)(((\\.{3}\\s*|,\\s+)*|\\s+)\\1)+\\b", but this regular expression does not extend to multiple words (and in its current form there is no reason it should). How can I find repeated phrases using a regular expression?

Here I extract repeated words (regardless of case), but the same regular expression does not extract a repeated phrase:

library(qdapRegex)
rm_default("this is a big Big deal", pattern = "(?i)\\b(\\w+)(((\\.{3}\\s*|,\\s+)*|\\s+)\\1)+\\b", extract=TRUE)
rm_default("this is a big is a Big deal", pattern = "(?i)\\b(\\w+)(((\\.{3}\\s*|,\\s+)*|\\s+)\\1)+\\b", extract=TRUE)
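
For anyone without qdapRegex installed, the same behaviour can be reproduced with base R. This is only a sketch: `extract_all` is a helper name made up for illustration, and base R reports "no match" as `character(0)` where `rm_default` prints `NA`. It shows that the capture group `(\\w+)` cannot cross whitespace, so a multi-word phrase is never captured.

```r
# Single-word pattern copied from the question
word_pat <- "(?i)\\b(\\w+)(((\\.{3}\\s*|,\\s+)*|\\s+)\\1)+\\b"

# Hypothetical helper (not part of qdapRegex): return every match per input
extract_all <- function(x, pattern) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))
}

# A repeated single word is found, case-insensitively
res_word <- extract_all("this is a big Big deal", word_pat)

# A repeated phrase is not found, because \w+ stops at the first space
res_phrase <- extract_all("this is a big is a Big deal", word_pat)

print(res_word)
print(res_phrase)
```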

I hope for a regex that will return:

"is a big is a Big"

for

x <- "this is a big is a Big deal"

To cover corner cases, here is a larger test set and the desired output:

y <- c(
    "this is a big is a Big deal",
    "I want want to see",
    "I want, want to see",
    "I want...want to see see how",
    "this is a big is a Big deal for those of, those of you who are.",
    "I like it. It is cool"
)


[[1]]
[1] "is a big is a Big"

[[2]]
[1] "want want"

[[3]]
[1] "want, want"

[[4]]
[1] "want...want" "see see"    

[[5]]
[1] "is a big is a Big" "those of, those of"

[[6]]
[1] NA
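
One way to try candidate patterns against a desired output like this is a small comparison helper. The sketch below is an assumption about how such a check could look, not something from the original post: `check_pattern` is a made-up name, it uses base R instead of `rm_default`, and it only covers a subset of the test strings.

```r
# Hypothetical helper: TRUE for each input whose matches equal the expectation
check_pattern <- function(pattern, inputs, expected) {
  got <- regmatches(inputs, gregexpr(pattern, inputs, perl = TRUE))
  # regmatches() gives character(0) for "no match"; map it to NA so the
  # shape resembles the rm_default() output shown above
  got <- lapply(got, function(m) if (length(m) == 0) NA_character_ else as.character(m))
  vapply(seq_along(inputs),
         function(i) identical(got[[i]], expected[[i]]),
         logical(1))
}

# Subset of the test strings with their desired matches
inputs <- c(
  "I want want to see",
  "I want, want to see",
  "I want...want to see see how",
  "this is a big is a Big deal"
)
expected <- list(
  "want want",
  "want, want",
  c("want...want", "see see"),
  "is a big is a Big"
)

# The single-word pattern from the question
word_pat <- "(?i)\\b(\\w+)(((\\.{3}\\s*|,\\s+)*|\\s+)\\1)+\\b"
ok <- check_pattern(word_pat, inputs, expected)
print(ok)
```

The single-word pattern passes the first three checks and fails the last one, which is the repeated-phrase case.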

My current regex only gets me:

rm_default(y, pattern = "(?i)\\b(\\w+)(((\\.{3}\\s*|,\\s+)*|\\s+)\\1)+\\b", extract=TRUE)

## [[1]]
## [1] NA
## 
## [[2]]
## [1] "want want"
## 
## [[3]]
## [1] "want, want"
## 
## [[4]]
## [1] "want...want" "see see"    
## 
## [[5]]
## [1] NA
2 answers

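Let the capture group match more than a single word, and allow whitespace, an ellipsis (...), or a comma between the repetitions: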

pattern <- "(?i)\\b(\\w.*)((?:\\s|\\.{3}|,)+\\1)+\\b"
rm_default(x, pattern = pattern, extract=TRUE)

Output:

[[1]]
[1] "is a big is a Big"

[[2]]
[1] "want want"

[[3]]
[1] "want, want"

[[4]]
[1] "want...want" "see see"    

[[5]]
[1] "is a big is a Big"  "those of, those of"

Another option, using base R:

> regmatches(x, gregexpr("(?i)\\b(\\S.*\\S)[ ,.]*\\b(\\1)", x, perl = TRUE))
[[1]]
[1] "is a big is a Big"

[[2]]
[1] "want want"

[[3]]
[1] "want, want"

[[4]]
[1] "want...want" "see see"    

[[5]]
[1] "is a big is a Big"  "those of, those of"

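The captured group starts and ends with a non-whitespace character (\S), so the phrase may contain spaces but is not padded with them: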

(?i)\b(\S.*\S)[ ,.]*\b(\1)
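
To make the pieces easier to read, the same pattern can be assembled from commented parts. This is just an illustrative sketch; the variable names `phrase_pat` and `hits` are assumptions of mine, and the comments are my reading of the regex rather than the answerer's wording.

```r
# Assemble the pattern above from its parts (same regex, only annotated)
phrase_pat <- paste0(
  "(?i)",        # case-insensitive matching, also for the backreference
  "\\b",         # the phrase must start at a word boundary
  "(\\S.*\\S)",  # group 1: starts and ends with a non-space character
  "[ ,.]*",      # optional separators: spaces, commas, dots
  "\\b",         # the repetition must start at a word boundary
  "(\\1)"        # group 2: the same text as group 1 again
)

hits <- regmatches(
  c("I want want to see", "this is a big is a Big deal"),
  gregexpr(phrase_pat, c("I want want to see", "this is a big is a Big deal"),
           perl = TRUE)
)
print(hits)
```

Because `.*` is greedy, the engine tries the longest candidate phrase first and backtracks until a repetition follows. Note also that `\S.*\S` needs at least two characters, so a repeated one-letter word would not be matched by this pattern.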

(Figure: regular expression visualization, from Debuggex)

Note that [ ,.] could be replaced with [ [:punct:]]. It is not used here because Debuggex does not support POSIX character classes.
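
A quick sketch of what the `[ [:punct:]]` swap changes, using a semicolon as separator. The test string and the names `pat_narrow` and `pat_punct` are my own assumptions for illustration; they are not part of the question's test set.

```r
# Separator class as written in the answer: space, comma, dot
pat_narrow <- "(?i)\\b(\\S.*\\S)[ ,.]*\\b(\\1)"
# Variant with a POSIX class: space or any punctuation character
pat_punct  <- "(?i)\\b(\\S.*\\S)[ [:punct:]]*\\b(\\1)"

s <- "I want; want to see"

# The semicolon is not in [ ,.], so the narrow pattern finds nothing
res_narrow <- regmatches(s, gregexpr(pat_narrow, s, perl = TRUE))[[1]]
# With [:punct:] the semicolon counts as a separator
res_punct  <- regmatches(s, gregexpr(pat_punct, s, perl = TRUE))[[1]]

print(res_narrow)
print(res_punct)
```

The wider class also accepts sentence-ending punctuation between repetitions, which may or may not be wanted for cases such as "I like it. It is cool".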
