How does R handle an object in a function call?

Question


I have a Java and Python background and have recently started learning R.

Today I discovered that R seems to handle objects in a completely different way than Java and Python.

For example, the following code:

    x <- c(1:10)
    print(x)
    sapply(1:10, function(i) {
      x[i] = 4
    })
    print(x)

The code gives the following result:

     [1]  1  2  3  4  5  6  7  8  9 10
     [1]  1  2  3  4  5  6  7  8  9 10

But I expected the second line of output to be all 4s, since I changed the vector inside the function passed to sapply.

Does this mean that R creates copies of objects in a function call instead of referencing objects?

+8

object r

Spirit zhang Sep 24 '11 at 5:37


5 answers

Yes, you are right. See the R Language Definition, section 4.3.3 (Argument evaluation).

AFAIK, R does not actually copy the data until you try to modify it, following copy-on-write semantics.
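A minimal sketch of what this looks like from the caller's side. The helper name `set_first` is made up for illustration; the point is only that a modification inside the function never reaches the caller's vector.

```r
# Hypothetical helper: modifies its argument and returns the modified copy.
set_first <- function(v) {
  v[1] <- 99L   # modifying v makes the function work on its own copy
  v
}

x <- 1:10
y <- set_first(x)

x[1]   # 1: the caller's vector is untouched
y[1]   # 99: only the copy returned by the function changed
```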

+7

Anatoliy Sep 24 '11 at 6:09


The x inside the anonymous function is not the x in the global environment (your workspace). It is a copy of x that is local to the anonymous function. It is not as simple as saying that R copies objects in function calls; R tries to avoid copying where possible, but as soon as you modify something, R has to copy the object.

As @DWin points out, the result of modifying this copy is returned by the sapply() call, so your stated output is not what I get:

    > x <- c(1:10)
    > print(x)
     [1]  1  2  3  4  5  6  7  8  9 10
    > sapply(1:10,function(i){
    +     x[i] = 4
    + })
     [1] 4 4 4 4 4 4 4 4 4 4
    > print(x)
     [1]  1  2  3  4  5  6  7  8  9 10

Clearly, the code did almost what you expected. The problem is that the output from sapply() was not assigned to an object, so it was printed and then discarded.
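To see where the vector of 4s comes from: an assignment expression evaluates to the assigned value, and a function returns the value of its last expression. A small sketch (the names `f` and `res` are only for illustration):

```r
x <- c(1:10)

f <- function(i) {
  x[i] = 4   # changes a local copy of x; the expression itself evaluates to 4
}

res <- sapply(1:10, f)   # collect the value returned by each call

res   # ten 4s, one per call
x     # the global x is still 1 to 10
```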

The reason your code works at all is R's scoping rules. You really should pass any objects a function needs into it as arguments. However, if R cannot find an object local to the function, it searches the parent environment for an object matching the name, then the parent of that environment if necessary, eventually reaching the global environment (the workspace). So your code works because R eventually found an x to work with, but that x was immediately copied, and the result was returned at the end of the sapply() call.
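The lookup and the copy can be observed separately. In this sketch the function name `peek_and_modify` is hypothetical; it reads x before any local x exists and then modifies it.

```r
x <- c(1:10)

peek_and_modify <- function() {
  # No local x exists yet, so this read finds the global x through scoping.
  before <- x[1]
  # This assignment creates a local x, a modified copy of the global one.
  x[1] <- 100
  c(before = before, after = x[1])
}

result <- peek_and_modify()
result   # before = 1, after = 100
x[1]     # 1: the global x was only read, never written
```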

This copying can cost time and memory in many cases. It is one of the reasons people think for loops are slow in R: they do not allocate storage for the object before filling it in the loop. If you do not allocate storage, R has to modify or copy the object to add the result of each iteration.
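A sketch of the two loop styles, with an arbitrary size `n` chosen for illustration. Both produce the same vector; the difference is only in how the storage is obtained.

```r
n <- 1000

# Growing: the vector starts empty and is extended on every iteration.
grown <- c()
for (i in seq_len(n)) grown[i] <- i^2

# Preallocating: storage is created once, then filled element by element.
filled <- numeric(n)
for (i in seq_len(n)) filled[i] <- i^2

identical(grown, filled)   # TRUE
```

Wrapping each loop in system.time() with a larger n is one way to compare the cost on your own machine.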

Again, though, it is not always that simple everywhere in R. With environments, for example, a copy of an environment really just refers to the original:

    > a <- new.env()
    > a
    <environment: 0x1af2ee0>
    > b <- 4
    > assign("b", b, env = a)
    > a$b
    [1] 4
    > c <- a ## copy the environment to `c`
    > assign("b", 7, env = c) ## assign something to `b` in env `c`
    > c$b ## as expected
    [1] 7
    > a$b ## also changed `b` in `a` as `a` and `c` are actually the same thing
    [1] 7
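The same reference behaviour applies when an environment is passed to a function. A minimal sketch, with the hypothetical names `bump` and `counter`:

```r
# Hypothetical helper: increments a counter stored in an environment.
bump <- function(env) {
  env$count <- env$count + 1   # modifies the caller's environment in place
  invisible(NULL)
}

counter <- new.env()
counter$count <- 0

bump(counter)
bump(counter)

counter$count   # 2: both calls changed the same environment
```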

If you are interested in such things, read the R Language Definition manual, which covers many of the details of what happens under the hood in R.

+3

Gavin Simpson Sep 24 '11 at 18:09


You need to assign the output of sapply to an object, otherwise it just disappears. (In fact, you can recover it, since it is also assigned to .Last.value.)

    x <- c(1:10)
    print(x)
     [1]  1  2  3  4  5  6  7  8  9 10
    x <- sapply(1:10,function(i){ x[i] = 4 })
    print(x)
     [1] 4 4 4 4 4 4 4 4 4 4

0

42- Sep 24 '11 at 6:44


If you want to change the "global" object from within the function, you can use non-local assignment.

    x <- c(1:10)
    print(x)
    # [1]  1  2  3  4  5  6  7  8  9 10
    sapply(1:10,function(i){
      x[i] <<- 4
    })
    print(x)
    # [1] 4 4 4 4 4 4 4 4 4 4

Although in this particular case you could write it more compactly as x[] <- 4

That is, by the way, one of the nice features of R: instead of sapply(1:10, function(i) x[i] <<- 4) or for(i in 1:10) x[i] <- 4 (for is not a function, so you don't need <<- there), you can just write x[] <- 4 :)
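The empty brackets matter here. A short sketch of the difference between replacing every element and rebinding the name:

```r
x <- c(1:10)
x[] <- 4     # replace every element, keeping the length
length(x)    # 10

y <- c(1:10)
y <- 4       # rebind y to a single value
length(y)    # 1
```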

0

lebatsnok Sep 19 '13 at 20:50


G. Grothendieck · Accepted Answer · 2011-09-24T11:47:43+0000

x is defined in the global environment, not in your function.

If you try to modify a non-local object such as x inside a function, R makes a copy of the object and modifies the copy. So each time your anonymous function runs, a copy of x is made and its i-th component is set to 4. When the function exits, that copy disappears forever. The original x is not modified.

If we were to write x[i] <<- 4 , or if we were to write x[i] <- 4; assign("x", x, .GlobalEnv) , then R would write it back. Another way to write it is to set e , say, to the environment in which x is stored, and do this:

    e <- environment()
    sapply(1:10, function(i) e$x[i] <- 4)

or perhaps this:

 sapply(1:10, function(i, e) e$x[i] <- 4, e = environment())

Normally one would not write code like this in R. Rather, one would produce the result as the output of the function, like this:

 x <- sapply(1:10, function(i) 4)

(Actually, in this case you could just write x[] <- 4 .)
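As a sketch of that return-the-result style with a function that actually depends on i (the squaring is just an illustrative choice), here is the same computation with sapply() and with vapply(), which also checks the type and length of each result:

```r
# Each call returns a value; the collected values become the new vector.
squares <- sapply(1:10, function(i) i^2)

# vapply() does the same but requires every result to be a single numeric.
squares_checked <- vapply(1:10, function(i) i^2, numeric(1))

identical(squares, squares_checked)   # TRUE
```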

ADDED:

Using the proto package, one could do this, where the method f sets the i-th component of property x to 4.

    library(proto)
    p <- proto(x = 1:10, f = function(., i) .$x[i] <- 4)
    for(i in seq_along(p$x)) p$f(i)
    p$x

ADDED:

Added another alternative above, in which we explicitly pass the environment in which x is stored.
