You might want to read the R scoping rules. In particular, there is no reason to expect that the variables that you define in a function are visible in other functions.
You may be confused because the global environment (i.e. the top level, outside all functions) is an exception to this rule. I will not go into your other questions, but let me note that the whole script looks very confusing to me: M1 to M3 are essentially one function, and the copy/paste bundle in get.f is definitely terrible. Whatever you are trying to do, you can certainly write it in a less confusing way.
Let's look at M first: why not one function with a parameter? Solving your scoping problem as well (by passing the data in as an argument) makes that two parameters:
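To make the scoping point concrete, here is a minimal sketch. The function and variable names (make_value, use_value, read_global, local_value, global_value) are hypothetical and exist only for illustration; they are not part of your script.

```r
# Hypothetical functions, used only to illustrate lexical scoping.
make_value <- function() {
  local_value <- 42        # exists only inside this call of make_value()
  local_value
}

use_value <- function() {
  # local_value from make_value() is not visible here; exists() searches
  # this function's environment and then its enclosing environments.
  exists("local_value")
}

make_value()
use_value()                # FALSE, as long as no global local_value exists

# A top-level variable is found by functions defined at top level,
# because the global environment is their enclosing environment.
global_value <- 1
read_global <- function() global_value + 1
read_global()              # 2
```

The reliable way to share a value between functions is to return it from one and pass it as an argument to the other, rather than counting on it being visible.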
    M <- function(sampleData, offset) {
      a1 = sampleData[,c(2:5,7,9)] + offset
      b1 = sampleData[,c(2:4,6,8,7)] + offset
      c1 = b1+a1
      list(a1=a1,b1=b1,c1=c1)
    }
If you insist on defining aliases, you can also do something like
    M1 <- function(sampleData) M(sampleData, 0)
    M2 <- function(sampleData) M(sampleData, 2)
    M3 <- function(sampleData) M(sampleData, 8)

This no longer repeats the function body, but ideally you want the computer to do the repetition for you (DRY!):

    offsets <- c(0,2,8)
    Models <- sapply(offsets, FUN=function(offset) function(sampleData) M(sampleData, offset))
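As a rough sketch of how such a list of generated functions can be used: the toy matrix below is made up for illustration, and lapply with force() is my substitution for the sapply call. lapply always returns a list, and force() makes sure each closure captures its own offset, which mattered in older R versions.

```r
# Same M as in the answer above.
M <- function(sampleData, offset) {
  a1 <- sampleData[, c(2:5, 7, 9)] + offset
  b1 <- sampleData[, c(2:4, 6, 8, 7)] + offset
  c1 <- b1 + a1
  list(a1 = a1, b1 = b1, c1 = c1)
}

offsets <- c(0, 2, 8)

# One closure per offset; each remembers the offset it was created with.
Models <- lapply(offsets, function(offset) {
  force(offset)
  function(sampleData) M(sampleData, offset)
})

toy <- matrix(1:30, nrow = 3)    # hypothetical 3 x 10 data, illustration only

res0 <- Models[[1]](toy)         # offset 0
res2 <- Models[[2]](toy)         # offset 2

all(res2$a1 - res0$a1 == 2)      # TRUE: a1 is shifted by the offset
all(res2$c1 - res0$c1 == 4)      # TRUE: c1 = a1 + b1 carries the offset twice
```

Models[[i]] then takes the place of M1, M2 and M3, so adding another offset means adding one number to offsets.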
Looking at get.f, it's not clear what you are trying to do. You seem to be fitting something and collecting something from the results, but the part about Model$cof refers to an undefined variable (your Model has only a1, b1 and c1 entries). Assuming you actually want to collect cof and discard the intermediate values, get.f probably looks like this:
    M <- function(sampleData, offset) {
      a1 = sampleData[,c(2:5,7,9)] + offset
      b1 = sampleData[,c(2:4,6,8,7)] + offset
      c1 = b1+a1
      list(a1=a1,b1=b1,c1=c1)
    }

    get.f <- function(asim,Model,M){
      sse <- c()
      for(i in 1:asim){
        set.seed(i)
        # bootstrap sample of the rows of dat (dat must exist at top level)
        Samdat <- dat[sample(1:nrow(dat),nrow(dat),replace=TRUE),]
        Y <- Samdat[,1]
        # pass the sample in explicitly instead of relying on scoping
        model <- Model(Samdat)
        if(M==1){
          a2 <- model$a1
          b2 <- model$b1
          c2 <- model$c1
          s <- a2+b2+c2
          fit <- lm(Y~s)
          cof <- sum(summary(fit)$coef[,1])
          sse <- c(sse,cof)
        } else if(M==2){
          a2 <- model$a1
          b2 <- model$b1
          c2 <- model$c1
          s <- c2+12
          fit <- lm(Y~s)
          cof <- sum(summary(fit)$coef[,1])
          sse <- c(sse,cof)
        } else {
          a2 <- model$a1
          b2 <- model$b1
          c2 <- model$c1
          s <- c2+a2
          fit <- lm(Y~s)
          cof <- sum(summary(fit)$coef[,1])
          sse <- c(sse,cof)
        }
      }
      return(sse)
    }

    # uses M1, M2 and M3 as defined above
    get.f(10,Model=M1,M=1)
    get.f(10,Model=M2,M=2)
    get.f(10,Model=M3,M=3)
This is still terribly repetitive, so why don't we think about it for a minute? All you do with your models is compute a single column from them to use as needed. I don't understand why you need to perform the calculation in the M function and then retrieve a single value in get.f (depending on which M you used). That suggests the extraction should rather be part of M. But if you need to keep them separate, fine: let them use separate extraction functions. It is still half the size of your code in reasonably written R:
    # Set up test data
    dat <- cbind(Y=rnorm(20),rnorm(20),runif(20),rexp(20),rnorm(20),runif(20),
                 rexp(20),rnorm(20),runif(20),rexp(20))
    nam <- paste("v",1:9,sep="")
    colnames(dat) <- c("Y",nam)