The problem seems to be that R does something fairly complicated when you assign a value to the return value of a function. For example, something like
a <- c(1,3)
names(a) <- c("one", "three")
looks very strange in most languages. How can you assign a value to the return value of a function? What really happens is that there is a function named names<- . It returns a modified copy of the original object, which is then used to replace the value that was passed to the function. So the assignment really behaves like
.temp. <- `names<-`(a, c("one","three"))
a <- .temp.
The variable a is always completely replaced, not just its names.
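To make the rewrite concrete, here is a minimal sketch with a user-defined replacement function. The name `second<-` is hypothetical and exists only for this illustration; it is not part of base R. Both forms below should leave the vector in the same state.

```r
# Hypothetical replacement function, defined only for illustration.
# It receives the object and the new value, and returns a modified copy.
`second<-` <- function(x, value) {
  x[2] <- value
  x
}

a <- c(1, 3)
second(a) <- 10            # the sugar form

b <- c(1, 3)
b <- `second<-`(b, 10)     # the explicit form R rewrites it to

identical(a, b)
```

The function never changes its argument in place; the caller's variable only changes because the result is assigned back to it.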
When you do something like
dfSet$a<-1
what really happens is, again,
.temp. <- "$<-"(dfSet, a, 1) dfSet <- .temp.
Things get a little more complicated when you combine subsetting with [] and $ in the same assignment. Consider this example
#for subsetting
f <- function(x,v) {print("testing"); x==v}
x <- rep(0:1, length.out=nrow(dfSet))
dfSet$a <- 0
dfSet[f(x,1),]$a<-1
Note that "testing" is printed twice. What happens is really more like
.temp1. <- "$<-"(dfSet[f(x,1),], a, 1) .temp2. <- "[<-"(dfSet, f(x,1), , .temp1.) dfSet <- .temp2.
So f(x,1) is evaluated twice. This means that sample will be evaluated twice as well, and it may select different rows each time.
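The double evaluation can be checked by counting calls instead of printing. This is a minimal sketch: the contents of dfSet are made up here (the original data frame is not shown), and the counter environment is an addition for illustration.

```r
# Hypothetical stand-in for dfSet; the real data is not shown in the question.
dfSet <- data.frame(id = 1:4, a = 0)

# Count how often the index function runs.
counter <- new.env()
counter$n <- 0
f <- function(x, v) {
  counter$n <- counter$n + 1
  x == v
}

x <- rep(0:1, length.out = nrow(dfSet))
dfSet[f(x, 1), ]$a <- 1

counter$n   # number of times f was called during the single assignment
dfSet$a     # rows where x == 1 now hold 1
```

Because f is deterministic here, both calls pick the same rows and the result looks correct; the trouble only shows up when the index expression can change between calls.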
The error is more obvious when you try to replace a variable that does not exist yet:
dfSet[f(x,1),]$b<-1
You will get a warning here because .temp1. had a column added and now has 4 columns, but when the assignment to .temp2. is attempted, the slice of the data frame you are trying to replace has a different size.
The identifiers are replaced because the $<- operator does not return just the new column; it returns a new data.frame with the column updated to whatever value you assigned. This means that the rows that were updated are returned together with the identifiers they had when the assignment occurred. This is stored in .temp1. . Then, when the [<- assignment is performed, a new set of rows is selected for replacement. The values of all columns of those rows are replaced by the values from .temp1. . This means the identifiers of the replaced rows are overwritten too, and since they may differ, you will probably end up with two or more copies of the same identifier.
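The duplicated identifiers can be reproduced without randomness. In this sketch, pick() is a hypothetical stand-in for sample(): it returns one set of rows on its first call and a different set on its second, which is what a random index may do. The data frame and all names are assumptions made for illustration.

```r
# Hypothetical data: five rows with unique ids.
df <- data.frame(id = 1:5, a = 0)

# Hypothetical stand-in for sample(): a different row set on each call.
state <- new.env()
state$n <- 0
pick <- function() {
  state$n <- state$n + 1
  if (state$n == 1) 1:2 else 3:4
}

# One set of rows is read and updated, the other set is overwritten
# with the result, id column included.
df[pick(), ]$a <- 1

df
anyDuplicated(df$id) > 0   # at least one id now appears twice
```

Evaluating the index once and storing it in a variable before the assignment (for example idx <- pick(); df[idx, ]$a <- 1) avoids the mismatch, since both steps then use the same rows.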