Using `strsplit` and `grep`: first I made an object `para` holding your paragraph, then split it into sentences on the period and kept the ones that match any of the names.

    toMatch <- c("Martin Luther", "Paul", "Melanchthon")

    > unlist(strsplit(para,split="\\."))[grep(paste(toMatch, collapse="|"),unlist(strsplit(para,split="\\.")))]
    [1] "Opposed as a reformer at Tübingen, he accepted a call to the University of Wittenberg by Martin Luther, recommended by his great-uncle Johann Reuchlin"
    [2] " Melanchthon became professor of the Greek language in Wittenberg at the age of 21"
    [3] " He studied the Scripture, especially of Paul, and Evangelical doctrine"
    [4] " Johann Eck having attacked his views, Melanchthon replied based on the authority of Scripture in his Defensio contra Johannem Eckium"
Or a little cleaner:
    sentences <- unlist(strsplit(para, split = "\\."))
    sentences[grep(paste(toMatch, collapse = "|"), sentences)]
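Note that the split leaves a leading space on every sentence after the first, and that splitting on `"\\."` will also break on abbreviations or decimals. A minimal sketch of tidying the first issue with `trimws`, assuming a made-up stand-in for `para` (your actual paragraph is not reproduced here):

```r
# Hypothetical stand-in for `para`; replace with your own paragraph.
para <- paste(
  "Melanchthon became professor of the Greek language in Wittenberg at the age of 21.",
  "He studied the Scripture, especially of Paul, and Evangelical doctrine.",
  "This sentence mentions nobody on the list."
)

toMatch <- c("Martin Luther", "Paul", "Melanchthon")

# Split on literal periods, then strip the leading/trailing whitespace.
sentences <- trimws(unlist(strsplit(para, split = "\\.")))

# One alternation pattern: a sentence is kept if it matches any name.
pattern <- paste(toMatch, collapse = "|")
hits <- sentences[grepl(pattern, sentences)]
print(hits)
```

Here `grepl` returns a logical vector rather than indices; either works for subsetting.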
If you want the sentences for each person returned separately, do this:
    toMatch <- c("Martin Luther", "Paul", "Melanchthon")
    sentences <- unlist(strsplit(para, split = "\\."))
    foo <- function(Match) { sentences[grep(Match, sentences)] }
    lapply(toMatch, foo)

    [[1]]
    [1] "Opposed as a reformer at Tübingen, he accepted a call to the University of Wittenberg by Martin Luther, recommended by his great-uncle Johann Reuchlin"

    [[2]]
    [1] " He studied the Scripture, especially of Paul, and Evangelical doctrine"

    [[3]]
    [1] " Melanchthon became professor of the Greek language in Wittenberg at the age of 21"
    [2] " Johann Eck having attacked his views, Melanchthon replied based on the authority of Scripture in his Defensio contra Johannem Eckium"
EDIT 3: To include each person's name with their sentences, simply prepend it to the result, for example:
    foo <- function(Match) { c(Match, sentences[grep(Match, sentences)]) }
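As an alternative to prepending the name, you could keep it as the list element's name, which makes lookup by person easier. A small sketch using `setNames`; the shortened sentences below are illustrative stand-ins, not your data:

```r
# Illustrative, shortened sentences (assumed for this example).
sentences <- c(
  "He accepted a call to Wittenberg by Martin Luther",
  "He studied the Scripture, especially of Paul",
  "Melanchthon replied based on the authority of Scripture"
)
toMatch <- c("Martin Luther", "Paul", "Melanchthon")

# Same matcher as before, without prepending the name.
foo <- function(Match) sentences[grep(Match, sentences)]

# Name each list element after the pattern that produced it.
byPerson <- setNames(lapply(toMatch, foo), toMatch)
print(byPerson[["Paul"]])
```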
EDIT 4:
And if you want to find sentences that mention several people / places / things (words) together, add a pattern that chains one lookahead per word, such as:

    toMatch <- c("Martin Luther", "Paul", "Melanchthon", "(?=.*Melanchthon)(?=.*Scripture)")

and set `perl = TRUE` in `grep`, since lookaheads need Perl-compatible regular expressions:
    foo <- function(Match) { c(Match, sentences[grep(Match, sentences, perl = TRUE)]) }

    > lapply(toMatch, foo)
    [[1]]
    [1] "Martin Luther"
    [2] "Opposed as a reformer at Tübingen, he accepted a call to the University of Wittenberg by Martin Luther, recommended by his great-uncle Johann Reuchlin"

    [[2]]
    [1] "Paul"
    [2] " He studied the Scripture, especially of Paul, and Evangelical doctrine"

    [[3]]
    [1] "Melanchthon"
    [2] " Melanchthon became professor of the Greek language in Wittenberg at the age of 21"
    [3] " Johann Eck having attacked his views, Melanchthon replied based on the authority of Scripture in his Defensio contra Johannem Eckium"

    [[4]]
    [1] "(?=.*Melanchthon)(?=.*Scripture)"
    [2] " Johann Eck having attacked his views, Melanchthon replied based on the authority of Scripture in his Defensio contra Johannem Eckium"
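Each `(?=.*word)` lookahead checks that the word occurs somewhere in the sentence without consuming any text, so the words may appear in any order. If the regex feels opaque, combining one `grepl` call per word with `&` should select the same sentences. A sketch under made-up example sentences (the helper names `both`, `viaLookahead` and `viaAnd` are just for illustration):

```r
# Illustrative sentences (assumed for this example).
sentences <- c(
  "Melanchthon became professor of Greek",
  "Melanchthon replied based on the authority of Scripture",
  "He studied the Scripture, especially of Paul",
  "Scripture was the authority Melanchthon appealed to"
)
both <- c("Melanchthon", "Scripture")

# Build "(?=.*Melanchthon)(?=.*Scripture)" from the word list.
lookahead <- paste0("(?=.*", both, ")", collapse = "")
viaLookahead <- sentences[grepl(lookahead, sentences, perl = TRUE)]

# Equivalent: one fixed-string test per word, combined with logical AND.
viaAnd <- sentences[Reduce(`&`, lapply(both, grepl, x = sentences, fixed = TRUE))]

print(viaLookahead)
print(identical(viaLookahead, viaAnd))
```

The `Reduce` version also avoids having to escape regex metacharacters in the search words, because `fixed = TRUE` treats them literally.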
EDIT 5: Answering your other question:
Given:
    sentenceR <- "Opposed as a reformer at [[Tübingen]], he accepted a call to the University of [[Wittenberg]] by [[Martin Luther]], recommended by his great-uncle [[Johann Reuchlin]]"
    gsub("\\[\\[|\\]\\]", "", regmatches(sentenceR, gregexpr("\\[\\[.*?\\]\\]", sentenceR))[[1]])
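This gives you the text inside each pair of double brackets: `gregexpr` locates every `[[...]]` span (the lazy `.*?` stops at the first closing `]]`), `regmatches` extracts those spans, and `gsub` strips the brackets.

    > gsub("\\[\\[|\\]\\]", "", regmatches(sentenceR, gregexpr("\\[\\[.*?\\]\\]", sentenceR))[[1]])
    [1] "Tübingen"        "Wittenberg"      "Martin Luther"   "Johann Reuchlin"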
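If you would rather skip the `gsub` step, a lookbehind and a lookahead can match only the text between the brackets. This is a sketch of that variant, not a replacement for the above; it needs `perl = TRUE`, and the shortened ASCII-only sentence is an assumption made to keep the example portable:

```r
# Shortened example sentence (assumed for illustration).
sentenceR <- "he accepted a call to the University of [[Wittenberg]] by [[Martin Luther]]"

# (?<=\[\[) requires "[[" just before the match, (?=\]\]) requires "]]" just after;
# neither is included in the matched text.
inner <- regmatches(
  sentenceR,
  gregexpr("(?<=\\[\\[).*?(?=\\]\\])", sentenceR, perl = TRUE)
)[[1]]
print(inner)
```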