How to insert dataframe column name into R equation?

I am trying to make a section of code more flexible by taking the column names from the data frame and inserting them into the formula, instead of hard-coding the names. The following example works, but the field names are written directly into the formula:

    require(e1071)
    class = c(0.25, 0.34, 0.55)
    field1 = c(23, 33, 34)
    field2 = c(44, 55, 32)
    df = data.frame(class, field1, field2)
    mysvm = svm(class ~ field1 + field2, data = df)

The following example does not work, and I do not know why:

    require(e1071)
    class = c(0.25, 0.34, 0.55)
    field1 = c(23, 33, 34)
    field2 = c(44, 55, 32)
    df = data.frame(class, field1, field2)
    name1 = names(df)[2]
    name2 = names(df)[3]
    mysvm = svm(class ~ name1 + name2, data = df)

How can I refer to the 2nd and 3rd columns in the data frame and insert them correctly in the equation?

+2
variables r
Sep 20 '14 at 23:47
5 answers

The variable name1 holds a character string equal to names(df)[2], say "foo". When svm receives a formula containing the term name1, it looks up an object called name1 and uses that object's value. In other words, svm ends up trying to "regress" class on the length-one character vector "foo", which of course makes no sense.
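The point above can be checked without fitting anything. The following is a minimal sketch in base R, assuming the data frame from the question: all.vars() lists the symbols a formula actually contains, and the formula holds the literal symbols name1 and name2 rather than the column names stored in them.

```r
# The data frame from the question, rebuilt so the snippet is self-contained.
df <- data.frame(class  = c(0.25, 0.34, 0.55),
                 field1 = c(23, 33, 34),
                 field2 = c(44, 55, 32))
name1 <- names(df)[2]
name2 <- names(df)[3]

# The formula stores the symbols exactly as typed, not the strings they hold.
f <- class ~ name1 + name2
vars_in_formula <- all.vars(f)
print(vars_in_formula)

# Symbols in the formula that are not columns of df.
missing_cols <- setdiff(vars_in_formula, names(df))
print(missing_cols)
```

Because name1 and name2 are not columns of df, they are resolved outside the data frame, where each is a single character string.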

One way to solve this problem is to create a formula as a character string, and then convert it to a formula after the fact. Here is the utility function that I use from time to time:

    xyform <- function (y_var, x_vars) {
        # y_var: a length-one character vector
        # x_vars: a character vector of object names
        as.formula(sprintf("%s ~ %s", y_var, paste(x_vars, collapse = " + ")))
    }
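As a usage sketch, assuming the data frame from the question, the helper can be called with the column names taken straight from names(df). The function is repeated here only so the snippet runs on its own.

```r
# Helper from the answer above, repeated so this snippet is self-contained.
xyform <- function(y_var, x_vars) {
  as.formula(sprintf("%s ~ %s", y_var, paste(x_vars, collapse = " + ")))
}

# The data frame from the question.
df <- data.frame(class  = c(0.25, 0.34, 0.55),
                 field1 = c(23, 33, 34),
                 field2 = c(44, 55, 32))

# Build the formula from the 2nd and 3rd column names.
form <- xyform("class", names(df)[2:3])
print(form)

# With e1071 loaded, the result could then be passed on, e.g.
# mysvm <- svm(form, data = df)
```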
+2
Sep 21 '14 at

I'm not sure whether you care how the formula reads in the call output, but you can build the formula in a small function and evaluate it inside the call:

    > foo <- function(n1, n2) {
          as.formula(paste("class~", paste(n1, n2, sep = "+")))
      }
    > foo(name1, name2)
    # class ~ field1 + field2
    # <environment: 0x4d0da58>
    > svm(foo(name1, name2), data = df)
    #
    # Call:
    # svm(formula = foo(name1, name2), data = df)
    #
    #
    # Parameters:
    #    SVM-Type:  eps-regression
    #  SVM-Kernel:  radial
    #        cost:  1
    #       gamma:  0.5
    #     epsilon:  0.1
    #
    # Number of Support Vectors:  3
+2
Sep 21 '14 at 0:09

There are 2 options:

Either subset your data.frame by the column names given as a parameter, and use the dot notation for the right-hand side of the formula:

    svm_func <- function(ll = c("field1", "field2"), xx = df) {
        # keep only the response and the requested predictor columns
        sub_df <- xx[, c("class", ll)]
        print(sub_df)
        svm(class ~ ., data = sub_df)
    }

Or use the formula interface of svm, as in the other answers. Here do.call is used to generalize the formula to any number of predictors:

    svm_func_form <- function(ll = list("field1", "field2"), xx = df) {
        # join the predictor names into the right-hand side, e.g. "field1+field2"
        right_term <- do.call(paste, list(ll, collapse = "+"))
        form <- as.formula(paste("class", right_term, sep = "~"))
        svm(formula = form, data = xx)
    }
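Base R also ships reformulate() in the stats package, which builds a formula from a character vector of term labels and so handles any number of predictors without manual pasting. This is a swapped-in alternative to the paste-based code above, not what this answer originally used; a minimal sketch, assuming the data frame from the question:

```r
# The data frame from the question.
df <- data.frame(class  = c(0.25, 0.34, 0.55),
                 field1 = c(23, 33, 34),
                 field2 = c(44, 55, 32))

# Every column except the response becomes a predictor.
predictors <- setdiff(names(df), "class")

# reformulate() joins the term labels with "+" and attaches the response.
form <- reformulate(predictors, response = "class")
print(form)

# With e1071 loaded, this could be used as: svm(form, data = df)
```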
+2
Sep 21 '14 at 0:21

Here are some ways to pass the variable names and have the expanded formula appear in the Call. The first line of each function is copied from @Richard Scriven's function:

    fun1 <- function(n1, n2){
        form1 <- as.formula(paste("class~", paste(n1, n2, sep = "+")))
        do.call("svm", list(form1, quote(df)))
    }
    fun1(name1, name2)
    #Call:
    #svm(formula = class ~ field1 + field2, data = df)
    #Parameters:
    #   SVM-Type:  eps-regression
    # SVM-Kernel:  radial
    #       cost:  1
    #      gamma:  0.5
    #    epsilon:  0.1
    #Number of Support Vectors:  3

or

    fun2 <- function(n1, n2){
        form1 <- as.formula(paste("class~", paste(n1, n2, sep="+")))
        eval(substitute(svm(f, df), list(f = form1)))
    }
    fun2(name1, name2)
    #Call:
    #svm(formula = class ~ field1 + field2, data = df)
    #Parameters:
    #   SVM-Type:  eps-regression
    # SVM-Kernel:  radial
    #       cost:  1
    #      gamma:  0.5
    #    epsilon:  0.1
    #Number of Support Vectors:  3

Or you can pass the result of @Richard Scriven's function as an argument to fun3:

    fun2New <- function(n1, n2){
        as.formula(paste("class~", paste(n1, n2, sep="+")))
    }
    fun3 <- function(formula, data, ...){
        Call <- match.call(expand.dots = TRUE)
        Call[[1]] <- as.name("svm")
        Call$formula <- as.formula(terms(formula))
        eval(Call)
    }
    fun3(fun2New(name1, name2), df)
    #Call:
    #svm(formula = class ~ field1 + field2, data = df)
    #Parameters:
    #   SVM-Type:  eps-regression
    # SVM-Kernel:  radial
    #       cost:  1
    #      gamma:  0.5
    #    epsilon:  0.1
    #Number of Support Vectors:  3
+2
Sep 21 '14 at 7:39

Keep your own code, but use get(name1) instead of name1:

    > mysvm = svm(class ~ get(name1) + get(name2), data = df)
    > mysvm

    Call:
    svm(formula = class ~ get(name1) + get(name2), data = df)

    Parameters:
       SVM-Type:  eps-regression
     SVM-Kernel:  radial
           cost:  1
          gamma:  0.5
        epsilon:  0.1

    Number of Support Vectors:  3
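One trade-off of the get() approach, sketched here in base R: the terms of the formula are labelled with the get(...) calls rather than with the column names, which matches the Call shown above. Whether that matters depends on how the fitted object is used later; the snippet only inspects the formula and does not fit a model.

```r
# Column names held in variables, as in the question.
name1 <- "field1"
name2 <- "field2"

f <- class ~ get(name1) + get(name2)

# The term labels are the unevaluated get() calls, not "field1" / "field2".
term_labels <- attr(terms(f), "term.labels")
print(term_labels)
```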
+1
Sep 21 '14 at 3:04