I am trying to use dplyr inside a function, passing the column name as a variable, to then be used with n_distinct inside the summarise function.
I understand that programming with dplyr has become easier, with the functions summarise_, arrange_, etc., as described in vignette("nse"). I have tried various combinations of interp from lazyeval. n_distinct responds with "Input to n_distinct() must be a single variable name from the data set" (which makes sense, I just have the variable name in a string ...)
This works fine outside a function (mention is the column name in the data.frame):
summarize(data, count=n_distinct(mention))
This was my first effort:
getProportions <- function(datain, id_column) {
  overall_total <- summarize(datain, count=n_distinct(id_column))[1,1]
}
getProportions(measures, "mention")
And after reading the NSE documentation and some threads here about programming with dplyr, I tried:
overall_total <- summarize_(datain, count=interp(~n_distinct(var),var=as.name(id_column)))[1,1]
What am I missing? n_distinct_()?
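For anyone trying to reproduce this, here is a minimal self-contained sketch of what the function is meant to do. It does not use the lazyeval-based API discussed above. Instead it assumes a newer dplyr (0.7 or later) where the `.data` pronoun is available, which may not match the version in question. The toy `measures` data frame is hypothetical and only stands in for the real data.

```r
library(dplyr)

# Hypothetical toy data standing in for the real `measures` data frame
measures <- data.frame(
  mention = c("a", "b", "a", "c"),
  value   = 1:4,
  stringsAsFactors = FALSE
)

# id_column arrives as a string; .data[[id_column]] looks that column up
# inside summarise(), so n_distinct() receives the column itself rather
# than the string. Assumes dplyr >= 0.7 (an assumption, not the API above).
getProportions <- function(datain, id_column) {
  overall_total <- summarise(datain, count = n_distinct(.data[[id_column]]))[1, 1]
  overall_total  # return the value explicitly instead of only assigning it
}

getProportions(measures, "mention")
```

With the toy data above the call should count the three distinct values of mention. Note that the first effort also only assigned overall_total without returning it visibly, which this sketch changes.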
Edit
[...] var part right, summarise(), summarise_(), var = part of the interp call [...]