Load the required libraries:
library(tm)
library(Snowball)
Create a character vector:
vec<-c("running runner runs","happyness happies")
Create a corpus from the vector:
vec<-Corpus(VectorSource(vec))
It is very important to check the class of the documents in our corpus and to preserve it, because we want a standard corpus that the tm functions in R still understand.
class(vec[[1]])
vec[[1]]
<<PlainTextDocument (metadata: 7)>>
running runner runs
This will most likely report a plain text document (PlainTextDocument).
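Why the class matters can be sketched in base R without tm. This is only an illustration: ToyDocument, make_doc, loses_class and keeps_class are hypothetical names, and ToyDocument merely stands in for tm's PlainTextDocument.

```r
# ToyDocument is a hypothetical S3 class standing in for tm's PlainTextDocument.
make_doc <- function(text) structure(list(content = text), class = "ToyDocument")

doc <- make_doc("running runner runs")

# A transformation that returns a bare character vector drops the class.
loses_class <- function(d) toupper(d$content)

# A transformation that rebuilds the document keeps the class.
keeps_class <- function(d) make_doc(toupper(d$content))

print(class(loses_class(doc)))  # "character"
print(class(keeps_class(doc)))  # "ToyDocument"
```

Any function that expects a ToyDocument would reject the first result, which is the same reason a custom transformation should hand a PlainTextDocument back to tm.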
So now we fix the faulty stemDocument function. First, we convert the plain text document to a character string, split the text into words, apply stemDocument to the individual words (where it works correctly), and paste the stemmed words back together. Most importantly, we convert the output back to a PlainTextDocument, the class defined by the tm package.
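stemDocumentfix <- function(x) {
  # Convert to character, split on spaces, stem each word, re-join the words,
  # and wrap the result as a PlainTextDocument so the corpus keeps its class.
  PlainTextDocument(paste(stemDocument(unlist(strsplit(as.character(x), " "))), collapse = " "))
}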
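The split, stem, and paste steps can be tried in isolation with base R. This is a minimal sketch, assuming the fault is that the stemmer treats a whole string as a single word; toy_stem is a hypothetical stand-in for a real stemmer and only strips a trailing "s", and the input text is made up for illustration.

```r
# toy_stem is a hypothetical stand-in for a real stemmer: it only strips a
# trailing "s" from the end of each string it is given.
toy_stem <- function(w) sub("s$", "", w)

text <- "runs jumps walks"

# Applied to the whole string, only the last word is affected.
whole <- toy_stem(text)

# Split into words, stem each word, then paste the words back together.
words <- unlist(strsplit(text, " ", fixed = TRUE))
per_word <- paste(toy_stem(words), collapse = " ")

print(whole)     # "runs jumps walk"
print(per_word)  # "run jump walk"
```

The per-word version is the same pattern that stemDocumentfix uses, minus the conversion to and from a PlainTextDocument.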
Now we can use the standard tm_map on our corpus:
vec1 = tm_map(vec, stemDocumentfix)
Result:
vec1[[1]]
<<PlainTextDocument (metadata: 7)>>
run runner run
The most important thing to remember is to always preserve the class of the documents in the corpus. I hope this is a simple solution to your problem, using only functions from the two loaded libraries.
Abhinav jain