I use the tm package and want to get Flesch-Kincaid ratings for a document in R. I found that the koRpus package offers many metrics, including reading level, and started using it. However, the returned object is a very complex S4 object, and I don't understand how to pull values out of it.
So, I apply this to my corpus:
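    library(tm)
    library(koRpus)
    library(foreach)

    txt <- system.file("texts", "txt", package = "tm")
    (d <- Corpus(DirSource(txt, encoding = "UTF-8"),
                 readerControl = list(language = "lat")))

    # Tokenize one document, then compute its Flesch-Kincaid score
    f <- function(x) tokenize(x, format = "obj", lang = "en")
    g <- function(x) flesch.kincaid(x)

    # %dopar% falls back to sequential execution (with a warning) if no parallel backend is registered
    x <- foreach(i = 1:5) %dopar% g(f(d[[i]]))

x is a list holding the flesch.kincaid result for each of the Ovid texts.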
    > x[[1]]

    Flesch-Kincaid Grade Level
      Parameters: default
           Grade: 13.62
             Age: 18.62

    Text language: en

How can I get only the returned values, grade = 13.62 and age = 18.62? The output of str(x) is so large that it is difficult to parse, e.g.:
    > str(x[[1]])
    Formal class 'kRp.readability' [package "koRpus"] with 49 slots
      ..@ hyphen :Formal class 'kRp.hyphen' [package "koRpus"] with 3 slots
      .. .. ..@ lang : chr "en"
      .. .. ..@ desc :List of 5
      .. .. .. ..$ num.syll : num 196
      .. .. .. ..$ syll.distrib : num [1:6, 1:4] 25 25 65 27.8 27.8 ...
      .. .. .. .. ..- attr(*, "dimnames")=List of 2
      .. .. .. .. .. ..$ : chr [1:6] "num" "cum.sum" "cum.inv" "pct" ...
      .. .. .. .. .. ..$ : chr [1:4] "1" "2" "3" "4"
      .. .. .. ..$ syll.uniq.distrib: num [1:6, 1:4] 15 15 61 19.7 19.7 ...
      .. .. .. .. ..- attr(*, "dimnames")=List of 2
      .. .. .. .. .. ..$ : chr [1:6] "num" "cum.sum" "cum.inv" "pct" ...
      .. .. .. .. .. ..$ : chr [1:4] "1" "2" "3" "4"
      .. .. .. ..$ avg.syll.word : num 2.18
      .. .. .. ..$ syll.per100 : num 218
      .. .. ..@ hyphen:'data.frame': 90 obs. of 2 variables:
      .. .. .. ..$ syll: num [1:90] 1 1 1 1 2 3 1 2 3 1 ...
      .. .. .. ..$ word: chr [1:90] "Si" "quis" "in" "hoc" ...
      ..@ param :List of 1
      .. ..$ Flesch.Kincaid: Named num [1:3] 0.39 11.8 15.59
      .. .. ..- attr(*, "names")= chr [1:3] "asl" "asw" "const"
      ..@ ARI :List of 1
      .. ..$ : logi NA
      ..@ ARI.NRI :List of 1
      .. ..$ : logi NA
      ..@ ARI.simple :List of 1
      .. ..$ : logi NA
      ..@ Bormuth :List of 1
      .. ..$ : logi NA
      ..@ Coleman :List of 1
      .. ..$ : logi NA
      ..@ Coleman.Liau :List of 1
      .. ..$ : logi NA
      ..@ Dale.Chall :List of 1
      .. ..$ : logi NA
      ..@ Dale.Chall.PSK :List of 1
      .. ..$ : logi NA
      ..@ Dale.Chall.old :List of 1
      .. ..$ : logi NA
      ..@ Danielson.Bryan :List of 1
      .. ..$ : logi NA
      ..@ Dickes.Steiwer :List of 1
      .. ..$ : logi NA
      ..@ DRP :List of 1
      .. ..$ : logi NA
      ..@ ELF :List of 1
      .. ..$ : logi NA
      ..@ Flesch :List of 1
      .. ..$ : logi NA
      ..@ Flesch.PSK :List of 1
      .. ..$ : logi NA
      ..@ Flesch.de :List of 1
      .. ..$ : logi NA
      ..@ Flesch.es :List of 1
      .. ..$ : logi NA
      ..@ Flesch.fr :List of 1
      .. ..$ : logi NA
      ..@ Flesch.nl :List of 1
      .. ..$ : logi NA
      ..@ Flesch.Kincaid :List of 3
      .. ..$ flavour: chr "default"
      .. ..$ grade : num 13.6
      .. ..$ age : num 18.6
      ..@ Farr.Jenkins.Paterson :List of 1
      .. ..$ : logi NA
      ..@ Farr.Jenkins.Paterson.PSK:List of 1
      .. ..$ : logi NA
      ..@ FOG :List of 1
      .. ..$ : logi NA
      ..@ FOG.PSK :List of 1
      .. ..$ : logi NA
      ..@ FOG.NRI :List of 1
      .. ..$ : logi NA
      ..@ FORCAST :List of 1
      .. ..$ : logi NA
      ..@ FORCAST.RGL :List of 1
      .. ..$ : logi NA
      ..@ Fucks :List of 1
      .. ..$ : logi NA
      ..@ Harris.Jacobson :List of 1
      .. ..$ : logi NA
      ..@ Linsear.Write :List of 1
      .. ..$ : logi NA
      ..@ LIX :List of 1
      .. ..$ : logi NA
      ..@ RIX :List of 1
      .. ..$ : logi NA
      ..@ SMOG :List of 1
      .. ..$ : logi NA
      ..@ SMOG.de :List of 1
      .. ..$ : logi NA
      ..@ SMOG.C :List of 1
      .. ..$ : logi NA
      ..@ SMOG.simple :List of 1
      .. ..$ : logi NA
      ..@ Spache :List of 1
      .. ..$ : logi NA
      ..@ Spache.old :List of 1
      .. ..$ : logi NA
      ..@ Strain :List of 1
      .. ..$ : logi NA
      ..@ Traenkle.Bailer :List of 1
      .. ..$ : logi NA
      ..@ TRI :List of 1
      .. ..$ : logi NA
      ..@ Wheeler.Smith :List of 1
      .. ..$ : logi NA
      ..@ Wheeler.Smith.de :List of 1
      .. ..$ : logi NA
      ..@ Wiener.STF :List of 1
      .. ..$ : logi NA
      ..@ lang : chr "en"
      ..@ desc :List of 26
      .. ..$ sentences : int 10
      .. ..$ words : int 90
      .. ..$ letters : Named num [1:12] 492 0 8 9 14 18 14 9 10 6 ...
      .. .. ..- attr(*, "names")= chr [1:12] "all" "l1" "l2" "l3" ...
      .. ..$ all.chars : int 692
      .. ..$ syllables : Named num [1:5] 196 25 32 25 8
      .. .. ..- attr(*, "names")= chr [1:5] "all" "s1" "s2" "s3" ...
      .. ..$ lttr.distrib : num [1:6, 1:11] 0 0 90 0 0 ...
      .. .. ..- attr(*, "dimnames")=List of 2
      .. .. .. ..$ : chr [1:6] "num" "cum.sum" "cum.inv" "pct" ...
      .. .. .. ..$ : chr [1:11] "1" "2" "3" "4" ...
      .. ..$ syll.distrib : num [1:6, 1:4] 25 25 65 27.8 27.8 ...
      .. .. ..- attr(*, "dimnames")=List of 2
      .. .. .. ..$ : chr [1:6] "num" "cum.sum" "cum.inv" "pct" ...
      .. .. .. ..$ : chr [1:4] "1" "2" "3" "4"
      .. ..$ syll.uniq.distrib : num [1:6, 1:4] 15 15 61 19.7 19.7 ...
      .. .. ..- attr(*, "dimnames")=List of 2
      .. .. .. ..$ : chr [1:6] "num" "cum.sum" "cum.inv" "pct" ...
      .. .. .. ..$ : chr [1:4] "1" "2" "3" "4"
      .. ..$ punct : int 17
      .. ..$ conjunctions : int 0
      .. ..$ prepositions : int 0
      .. ..$ pronouns : int 0
      .. ..$ foreign : int 0
      .. ..$ TTR : num 0.844
      .. ..$ avg.sentc.length : num 9
      .. ..$ avg.word.length : num 5.47
      .. ..$ avg.syll.word : num 2.18
      .. ..$ sntc.per.word : num 0.111
      .. ..$ sntc.per100 : num 11.1
      .. ..$ lett.per100 : num 547
      .. ..$ syll.per100 : num 218
      .. ..$ FOG.hard.words : NULL
      .. ..$ Bormuth.NOL : NULL
      .. ..$ Dale.Chall.NOL : NULL
      .. ..$ Harris.Jacobson.NOL: NULL
      .. ..$ Spache.NOL : NULL
      ..@ TT.res :'data.frame': 107 obs. of 6 variables:
      .. ..$ token : chr [1:107] "Si" "quis" "in" "hoc" ...
      .. ..$ tag : chr [1:107] "word.kRp" "word.kRp" "word.kRp" "word.kRp" ...
      .. ..$ lemma : chr [1:107] "" "" "" "" ...
      .. ..$ lttr : num [1:107] 2 4 2 3 5 6 3 5 6 1 ...
      .. ..$ wclass: chr [1:107] "word" "word" "word" "word" ...
      .. ..$ desc : chr [1:107] "Word (kRp internal)" "Word (kRp internal)" "Word (kRp internal)" "Word (kRp internal)" ...
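Judging only from the str() output above, the two numbers seem to sit in a slot named Flesch.Kincaid, which is a plain list with elements grade and age. In R, S4 slots are read with `@` (or `slot()`) and list elements with `$`, so my guess is that something like `x[[1]]@Flesch.Kincaid$grade` would work. Below is a minimal sketch of that idea using a stand-in class, so it runs without koRpus. The class name `ToyReadability` and the second document's scores are made up for illustration. I don't know whether direct slot access is the intended way to use koRpus or whether the package offers a dedicated accessor.

```r
library(methods)

# Hypothetical stand-in for koRpus's 'kRp.readability' class, reduced to the
# single slot of interest (the real class shows 49 slots in str()).
setClass("ToyReadability", representation(Flesch.Kincaid = "list"))

# Two fake results shaped like the Flesch.Kincaid slot in the str() output.
# The first uses the scores printed above; the second is invented.
x <- list(
  new("ToyReadability",
      Flesch.Kincaid = list(flavour = "default", grade = 13.62, age = 18.62)),
  new("ToyReadability",
      Flesch.Kincaid = list(flavour = "default", grade = 9.10, age = 14.10))
)

# '@' reads an S4 slot; '$' then reads a named element of the list inside it
x[[1]]@Flesch.Kincaid$grade
x[[1]]@Flesch.Kincaid$age

# Collect the scores of all documents into plain numeric vectors.
# slot(obj, "name") is the functional equivalent of obj@name.
grades <- vapply(x, function(r) r@Flesch.Kincaid$grade, numeric(1))
ages   <- vapply(x, function(r) slot(r, "Flesch.Kincaid")$age, numeric(1))
```

If this carries over to the real objects, `grades` and `ages` would be ordinary numeric vectors with one entry per document, which should be easier to attach to the corpus than the full S4 objects.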
Ideally, I would like to assign the FK scores back to the tm corpus as metadata, via meta(d).
I would be happy to learn how to make sense of this return object and extract its values, but if there is another, better, or faster way to get an FK score, I'm all ears!