How to get a list of source variable names from a GLM call in R?

Question

When calling glm in R, you can use functions such as addNA or log inside the formula argument. For example, suppose we have a data frame Data with 4 columns: Class and var1, which are factors, and var2 and var3, which are numeric variables, and we fit:

Model <- glm(data    = Data,
             formula = Class ~ addNA(var1) + var2 + log(var3),
             family  = binomial)

In the glm output, var1 will now be called addNA(var1) (for example, in Model$xlevels), and var3 will be called log(var3).
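To make the setup above reproducible, here is a minimal sketch. The simulated contents of Data (sample size, level names, distributions) are assumptions made purely for illustration; only the column names and the glm call come from the question.

```r
# Simulated stand-in for the data frame described above (all values are invented).
set.seed(1)
n <- 200
Data <- data.frame(
  Class = factor(sample(c("a", "b"), n, replace = TRUE)),
  var1  = factor(sample(c("x", "y", NA), n, replace = TRUE)),  # factor with missing values
  var2  = rnorm(n),
  var3  = runif(n, min = 1, max = 10)  # strictly positive so that log() is defined
)

Model <- glm(Class ~ addNA(var1) + var2 + log(var3),
             data = Data, family = binomial)

# Factor predictors are stored under the transformed term label, not the column name.
names(Model$xlevels)

# Coefficient names carry the transformed labels as well.
names(coef(Model))
```

Both calls show labels built from addNA(var1) and log(var3) rather than the plain column names, which is the behaviour the question is about.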

Is it possible to get a list from the glm output showing that var1, var2 and var3 were taken from the data frame, without addNA(var1) or log(var3) appearing in the variable names?

More generally, after a call to glm, is it possible to deduce which columns of the input data frame were used, before any transformations, cross-terms, etc. generated inside glm?

r model-fitting feature-selection glm

Herman Sontrop Jan 14 '14 at 13:51

2 answers

call, formula terms. . ( terms, gsub, "(" ")".

Carl Witthoft Jan 14 '14 at 13:59

Ben Bolker · Accepted Answer · 2014-01-14T14:00:53+0000

This works:

all.vars(formula(Model)[-2])
## [1] "var1" "var2" "var3"

Indexing with [-2] removes the response variable from the formula. However, you may be disappointed to find that the internally stored model frame holds the transformed variables rather than the source variables:

names(model.frame(Model))
## [1] "Class"       "addNA(var1)" "var2"        "log(var3)"
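As a rough sketch of why the [-2] indexing works, and of one way to get back the untransformed columns themselves, the example below refits the model on simulated data. The contents of Data are invented for illustration, and the use of delete.response() and get_all_vars() is an addition to the answer above, not part of it.

```r
# Simulated data matching the column names in the question (values are invented).
set.seed(1)
n <- 200
Data <- data.frame(
  Class = factor(sample(c("a", "b"), n, replace = TRUE)),
  var1  = factor(sample(c("x", "y", NA), n, replace = TRUE)),
  var2  = rnorm(n),
  var3  = runif(n, min = 1, max = 10)  # strictly positive so that log() is defined
)
Model <- glm(Class ~ addNA(var1) + var2 + log(var3),
             data = Data, family = binomial)

# A two-sided formula is a call with three parts: `~`, the response, the right-hand side.
f <- formula(Model)
f[[2]]  # the response
f[[3]]  # the right-hand side

# Dropping element 2 leaves a one-sided formula, so all.vars() sees predictors only.
predictors <- all.vars(f[-2])

# An alternative route to the same names via the terms object.
predictors_alt <- all.vars(delete.response(terms(Model)))

# get_all_vars() evaluates the raw variables in the data, without transformations.
source_cols <- get_all_vars(f, data = Data)
names(source_cols)
```

Note that get_all_vars() needs the original data frame to be available, since the fitted model itself only stores the transformed model frame.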
