Function inside function in R

Question


Could you explain why the code complains that Samdat is not found?

I'm trying to switch between models, so I declared functions that contain these specific models, and I just need to pass one of them as an argument to the get.f function, where the resampling changes the data used for each matrix construction in the model. The code complains that Samdat is not found even though it is defined.

Also, is there a way to write a conditional statement such as if(Model == M1()) instead of creating another argument M and testing if(M==1)?

Here is my code:

    dat <- cbind(Y=rnorm(20),rnorm(20),runif(20),rexp(20),rnorm(20),runif(20),
                 rexp(20),rnorm(20),runif(20),rexp(20))
    nam <- paste("v",1:9,sep="")
    colnames(dat) <- c("Y",nam)

    M1 <- function(){
      a1 = cbind(Samdat[,c(2:5,7,9)])
      b1 = cbind(Samdat[,c(2:4,6,8,7)])
      c1 = b1+a1
      list(a1=a1,b1=b1,c1=c1)}

    M2 <- function(){
      a1= cbind(Samdat[,c(2:5,7,9)])+2
      b1= cbind(Samdat[,c(2:4,6,8,7)])+2
      c1 = a1+b1
      list(a1=a1,b1=b1,c1=c1)}

    M3 <- function(){
      a1= cbind(Samdat[,c(2:5,7,9)])+8
      b1= cbind(Samdat[,c(2:4,6,8,7)])+8
      c1 = a1+b1
      list(a1=a1,b1=b1,c1=c1)}

    #################################################################
    get.f <- function(asim,Model,M){
      sse <-c()
      for(i in 1:asim){
        set.seed(i)
        Samdat <- dat[sample(1:nrow(dat),nrow(dat),replace=T),]
        Y <- Samdat[,1]
        if(M==1){
          a2 <- Model$a1
          b2 <- Model$b1
          c2 <- Model$c1
          s<- a2+b2+c2
          fit <- lm(Y~s)
          cof <- sum(summary(fit)$coef[,1])
          coff <-Model$cof
          sse <-c(sse,coff)
        }
        else if(M==2){
          a2 <- Model$a1
          b2 <- Model$b1
          c2 <- Model$c1
          s<- c2+12
          fit <- lm(Y~s)
          cof <- sum(summary(fit)$coef[,1])
          coff <-Model$cof
          sse <-c(sse,coff)
        }
        else {
          a2 <- Model$a1
          b2 <- Model$b1
          c2 <- Model$c1
          s<- c2+a2
          fit <- lm(Y~s)
          cof <- sum(summary(fit)$coef[,1])
          coff <- Model$cof
          sse <-c(sse,coff)
        }
      }
      return(sse)
    }

    get.f(10,Model=M1(),M=1)
    get.f(10,Model=M2(),M=2)
    get.f(10,Model=M3(),M=3)

+4

r

Falcon-statguy Sep 01 '12 at 12:25


2 answers

You might want to read up on R's scoping rules. In particular, there is no reason to expect that variables you define inside one function are visible in other functions.

You may be confused because the global environment (i.e. the top level outside all functions) is an exception to this rule. I will not go into your other questions, but let me note that the whole script looks very confusing to me: M1 to M3 are essentially one function, and the copy/paste pile in get.f is definitely terrible. Whatever you're trying to do, you can certainly write it in a less confusing way.
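The scoping rule described above can be seen in a few lines. This is only a minimal sketch; the names inner, outer, inner_fixed, outer_fixed and x_local are made up for illustration and do not appear in the question.

```r
# Hypothetical names, chosen only to illustrate the scoping rule.
inner <- function() x_local * 2      # x_local is a free variable in inner()

outer <- function() {
  x_local <- 21                      # local to outer(), invisible to inner()
  inner()
}

# inner() looks up x_local where inner() was defined (the top level),
# not where it was called from, so the lookup fails.
result <- tryCatch(outer(), error = function(e) conditionMessage(e))
print(result)                        # an error message mentioning x_local

# Passing the value as an argument removes the dependence on scoping.
inner_fixed <- function(x) x * 2
outer_fixed <- function() {
  x_local <- 21
  inner_fixed(x_local)
}
print(outer_fixed())                 # 42
```

In the question, Samdat plays the role of x_local: it is local to get.f, so M1 cannot see it unless it is passed in.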

Let's look at M first: why not one function with a parameter? Passing the sample in as well solves your scoping problem, which makes two parameters:

    M <- function(sampleData, offset) {
      a1 = sampleData[,c(2:5,7,9)] + offset
      b1 = sampleData[,c(2:4,6,8,7)] + offset
      c1 = b1+a1
      list(a1=a1,b1=b1,c1=c1)
    }

If you insist on defining aliases, you can also do something like

    M1 <- function(sampleData) M(sampleData, 0)
    M2 <- function(sampleData) M(sampleData, 2)
    M3 <- function(sampleData) M(sampleData, 8)

This no longer repeats the function body, but the three definitions are still near-copies of each other; ideally you want the computer to do the repetition for you (DRY!):

    offsets <- c(0,2,8)
    Models <- sapply(offsets, FUN=function(offset) function(sampleData) M(sampleData, offset))
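As a rough illustration of how such a list of generated functions might be used, here is a small self-contained sketch. It uses lapply instead of sapply to make it explicit that the result is a list, and the toy matrix is invented purely for the demonstration.

```r
# Same M as in the answer: build a1, b1, c1 from a sample and an offset.
M <- function(sampleData, offset) {
  a1 <- sampleData[, c(2:5, 7, 9)] + offset
  b1 <- sampleData[, c(2:4, 6, 8, 7)] + offset
  c1 <- b1 + a1
  list(a1 = a1, b1 = b1, c1 = c1)
}

offsets <- c(0, 2, 8)

# One closure per offset; each remembers its own offset.
Models <- lapply(offsets, function(offset) {
  force(offset)                      # evaluate now, so each closure keeps its own value
  function(sampleData) M(sampleData, offset)
})

# 'toy' is an assumed stand-in for a resampled data matrix (2 rows, 10 columns).
toy <- matrix(1:20, nrow = 2)

res0 <- Models[[1]](toy)             # offset 0
res2 <- Models[[2]](toy)             # offset 2

print(dim(res2$a1))                  # 2 rows, 6 selected columns
print(unique(as.vector(res2$a1 - res0$a1)))   # 2: the offset difference
```

Each element of Models is an ordinary function of one argument, so it can be passed around and called exactly like the hand-written aliases M1, M2 and M3.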

Looking at get.f, it's not clear what you are trying to do: you fit something and collect something from the results, but the part about Model$cof refers to an undefined entry (your Model has only a1, b1 and c1 entries). Assuming you actually want to collect cof and drop the intermediate coff, get.f probably looks like this:

    M <- function(sampleData, offset) {
      a1 = sampleData[,c(2:5,7,9)] + offset
      b1 = sampleData[,c(2:4,6,8,7)] + offset
      c1 = b1+a1
      list(a1=a1,b1=b1,c1=c1)
    }

    get.f <- function(asim,Model,M){
      sse <-c()
      for(i in 1:asim){
        set.seed(i)
        Samdat <- dat[sample(1:nrow(dat),nrow(dat),replace=T),]
        Y <- Samdat[,1]
        model <- Model(Samdat)  # pass the resample in: M1..M3 take it as an argument
        if(M==1){
          a2 <- model$a1
          b2 <- model$b1
          c2 <- model$c1
          s<- a2+b2+c2
          fit <- lm(Y~s)
          cof <- sum(summary(fit)$coef[,1])
          sse <-c(sse,cof)
        }
        else if(M==2){
          a2 <- model$a1
          b2 <- model$b1
          c2 <- model$c1
          s<- c2+12
          fit <- lm(Y~s)
          cof <- sum(summary(fit)$coef[,1])
          sse <-c(sse,cof)
        }
        else {
          a2 <- model$a1
          b2 <- model$b1
          c2 <- model$c1
          s<- c2+a2
          fit <- lm(Y~s)
          cof <- sum(summary(fit)$coef[,1])
          sse <-c(sse,cof)
        }
      }
      return(sse)
    }

    get.f(10,Model=M1,M=1)
    get.f(10,Model=M2,M=2)
    get.f(10,Model=M3,M=3)

This is still terribly repetitive, so why don't we think about it for a minute? All you do with your selections is compute a single predictor from them, chosen according to your needs. I don't understand why you need to perform the calculation in the M function and then extract the one value you want in get.f (depending on which M you used). That suggests the extraction should rather be part of M. But if you need to keep them separate, well, let's use separate extraction functions. The result is still half the size of your code, in reasonably written R:

    # Set up test data
    dat <- cbind(Y=rnorm(20),rnorm(20),runif(20),rexp(20),rnorm(20),runif(20),
                 rexp(20),rnorm(20),runif(20),rexp(20))
    nam <- paste("v",1:9,sep="")
    colnames(dat) <- c("Y",nam)

    # calculate a1..c1 from a sample
    M <- function(sampleData, offset) {
      a1 = sampleData[,c(2:5,7,9)] + offset
      b1 = sampleData[,c(2:4,6,8,7)] + offset
      c1 = b1+a1
      list(a1=a1,b1=b1,c1=c1)
    }

    # create a fixed-offset model from the variable offset model by fixing offset
    makeModel <- function(offset) function(sampleData) M(sampleData, offset)

    # run model against asim subsamples of data and collect coefficients
    get.f <- function(asim,model,expected)
      sapply(1:asim, function (i){
        set.seed(i)
        Samdat <- dat[sample(1:nrow(dat),nrow(dat),replace=T),]
        Y <- Samdat[,1]
        s <- expected(model(Samdat))
        fit <- lm(Y~s)
        sum(summary(fit)$coef[,1])
      })

    # list of models to run and how to extract the expectation values from the model results
    todo <- list(
      list(model=makeModel(0), expected=function(data) data$a1+data$b1+data$c1),
      list(model=makeModel(2), expected=function(data) data$c1+12),
      list(model=makeModel(8), expected=function(data) data$c1+data$a1))

    sapply(todo, function(l) { get.f(10, l$model, l$expected)})

+9

themel Sep 01 '12 at 13:22


flodel · Accepted Answer · 2012-09-01T15:43:44+0000

When you call

    get.f(10, Model=M1(), M=1)

the function M1 is called on its own, outside get.f, and only its result would be passed in. It dies because the body of M1 uses Samdat, which is defined only later, as a local variable in the body of get.f.

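Somehow you need to call M1 after Samdat has been defined. One way to do this is to make M1 (the function) an argument to get.f and call it from within get.f. Because a function cannot see the local variables of its caller, M1 also has to receive Samdat as an argument:

    get.f <- function(asim, Model.fun, M) {
      ...
      Samdat <- ...
      Model <- Model.fun(Samdat)   # M1 must accept the sample as its argument
      ...
    }

    get.f(10, Model.fun = M1, M=1)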
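A stripped-down, runnable version of this pattern might look as follows. It is only a sketch under simplifying assumptions: the toy data has two predictors, the toy M1 returns just a1 and b1, and vapply replaces the growing sse vector. The point is that the function itself is passed in and called only once the resample exists, with the resample handed over explicitly.

```r
set.seed(1)
# Assumed toy data: a response Y and two predictors.
dat <- cbind(Y = rnorm(20), v1 = rnorm(20), v2 = runif(20))

# Toy model function (hypothetical): receives the resample as an argument.
M1 <- function(sampled.data) {
  list(a1 = sampled.data[, "v1"], b1 = sampled.data[, "v2"])
}

get.f <- function(dat, asim, Model.fun) {
  vapply(seq_len(asim), function(i) {
    set.seed(i)
    Samdat <- dat[sample(nrow(dat), replace = TRUE), ]
    Model <- Model.fun(Samdat)       # called here, after Samdat exists
    Y <- Samdat[, "Y"]
    fit <- lm(Y ~ Model$a1 + Model$b1)
    sum(coef(fit))
  }, numeric(1))
}

out <- get.f(dat, 5, Model.fun = M1) # note: M1, not M1()
print(out)
```

Writing Model.fun = M1() instead would run M1 before get.f starts and fail, since no sample has been supplied at that point.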

In addition, it is generally poor programming practice to use global variables, i.e. to make your functions use variables that are defined outside their scope. Instead, everything a function uses should be passed in as an input argument. There are two such cases in your code: M1 (and M2, M3) use Samdat, and get.f uses dat. Both should be arguments of your functions. Here is a nicer version of your code. I haven't fixed everything, so you need to do a little more to get it working:

    M1 <- function(sampled.data) {
      a1 <- sampled.data[, c("v1", "v2", "v3", "v4", "v6", "v8")]
      b1 <- sampled.data[, c("v1", "v2", "v3", "v5", "v7", "v6")]
      c1 <- a1 + b1
      list(a1 = a1, b1 = b1, c1 = c1)
    }

    get.f <- function(dat, asim, Model.fun, offset, M) {
      sse <- c()
      for(i in 1:asim){
        set.seed(i)
        Samdat <- dat[sample(1:nrow(dat), nrow(dat), replace = TRUE), ]
        Y <- Samdat[, "Y"]
        Model <- Model.fun(sampled.data = Samdat)
        a2 <- Model$a1
        b2 <- Model$b1
        c2 <- Model$c1
        s <- switch(M,
                    a2 + b2 + c2,
                    c2 + 12,
                    c2 + a2)
        fit <- lm(Y ~ s)
        cof <- sum(summary(fit)$coef[,1])
        coff <- Model$cof       # there is a problem here...
        sse <- c(sse, coff)     # this is not efficient
      }
      return(sse)
    }

    dat <- cbind(Y = rnorm(20), v1 = rnorm(20), v2 = runif(20), v3 = rexp(20),
                 v4 = rnorm(20), v5 = runif(20), v6 = rexp(20), v7 = rnorm(20),
                 v8 = runif(20), v9 = rexp(20))

    get.f(dat, 10, Model.fun = M1, M = 1)

One last remark: if the definition of s (what I put together in switch()) is tied to the Model used, then I would consider merging the definitions of Model and s: add s to the list returned by your functions M1, M2, M3 so that s can simply be obtained as s <- Model$s, and then you can remove the input M from get.f.
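One possible way to carry out that last suggestion is sketched below. The factory makeModel and its parameters offset and make.s are assumptions introduced for this example; the offsets 0, 2, 8 and the three formulas for s are taken from the question.

```r
set.seed(42)
dat <- cbind(Y = rnorm(20), v1 = rnorm(20), v2 = runif(20), v3 = rexp(20),
             v4 = rnorm(20), v5 = runif(20), v6 = rexp(20), v7 = rnorm(20),
             v8 = runif(20), v9 = rexp(20))

# Hypothetical factory: each model carries its own rule for building s.
makeModel <- function(offset, make.s) {
  function(sampled.data) {
    a1 <- sampled.data[, c("v1", "v2", "v3", "v4", "v6", "v8")] + offset
    b1 <- sampled.data[, c("v1", "v2", "v3", "v5", "v7", "v6")] + offset
    c1 <- a1 + b1
    list(a1 = a1, b1 = b1, c1 = c1, s = make.s(a1, b1, c1))
  }
}

M1 <- makeModel(0, function(a1, b1, c1) a1 + b1 + c1)
M2 <- makeModel(2, function(a1, b1, c1) c1 + 12)
M3 <- makeModel(8, function(a1, b1, c1) c1 + a1)

# get.f no longer needs the M argument: s comes from the model itself.
get.f <- function(dat, asim, Model.fun) {
  vapply(seq_len(asim), function(i) {
    set.seed(i)
    Samdat <- dat[sample(nrow(dat), replace = TRUE), ]
    Y <- Samdat[, "Y"]
    s <- Model.fun(Samdat)$s
    fit <- lm(Y ~ s)
    sum(summary(fit)$coef[, 1])
  }, numeric(1))
}

# One column of summed coefficients per model, one row per resample.
results <- sapply(list(M1, M2, M3), function(m) get.f(dat, 10, m))
print(dim(results))
```

With this layout, adding a fourth model means adding one makeModel() call, and get.f stays untouched.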
