Why does the class change from integer to character when indexing a data frame with a numeric matrix?

If I index a data.frame that contains only integer columns with a numeric matrix, I get the expected result.

    df1 <- data.frame(c1=1:4, c2=5:8)
    df1
    #   c1 c2
    # 1  1  5
    # 2  2  6
    # 3  3  7
    # 4  4  8

    df1[matrix(c(1:4,1,2,1,2), nrow=4)]
    # [1] 1 6 3 8

If the data.frame also contains a character column, the result is all characters, even though I am only indexing the integer columns.

    df2 <- data.frame(c0=letters[1:4], c1=1:4, c2=5:8)
    df2
    #   c0 c1 c2
    # 1  a  1  5
    # 2  b  2  6
    # 3  c  3  7
    # 4  d  4  8

    df2[matrix(c(1:4,2,3,2,3), nrow=4)]
    # [1] "1" "6" "3" "8"

    class(df2[matrix(c(1:4,2,3,2,3), nrow=4)])
    # [1] "character"

    df2[1,2]
    # [1] 1

My best guess is that R does not bother to go through the result and check whether all elements came from columns of the same class. Can someone explain why this happens?

1 answer

?Extract describes indexing with a numeric matrix as something intended for matrices and arrays. So it may be surprising that such indexing works on a data frame at all.

However, if we look at the code of [.data.frame (getAnywhere(`[.data.frame`)), we see that when elements are extracted from a data.frame using a matrix in i, the data.frame is first coerced to a matrix with as.matrix:

    function (x, i, j, drop = if (missing(i)) TRUE else length(cols) == 1)
    {
        # snip
        if (Narg < 3L) {
            # snip
            if (is.matrix(i))
                return(as.matrix(x)[i])
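The excerpt above suggests that indexing a data frame with a matrix should give exactly the same result as coercing it with as.matrix yourself and then indexing the matrix. A minimal sketch to check this, reusing df2 from the question (the names idx, via_df and via_mat are only illustrative):

```r
df2 <- data.frame(c0 = letters[1:4], c1 = 1:4, c2 = 5:8)

# Each row of idx is a (row, column) pair: (1,2), (2,3), (3,2), (4,3)
idx <- matrix(c(1:4, 2, 3, 2, 3), nrow = 4)

via_df  <- df2[idx]             # dispatches to `[.data.frame`
via_mat <- as.matrix(df2)[idx]  # explicit coercion, then matrix indexing

# Compare the two results
print(identical(via_df, via_mat))
print(via_df)
```

If the two results are identical, the character output in the question comes from the coercion step, not from the indexing itself.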

Then have a look at ?as.matrix:

"The data frame method will return a character matrix if there are only atomic columns and any non-(numeric/logical/complex) column."

Thus, because the first column of "df2" is of class character, as.matrix coerces the entire data frame to a character matrix before the extraction takes place.
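If the goal is to keep the integer type, one possible workaround (a sketch, not the only option) is to make sure only numeric columns reach as.matrix, or to extract the elements one by one from the columns. The helper names num_only, idx_num, res_matrix and res_elementwise are assumptions made for this illustration:

```r
df2 <- data.frame(c0 = letters[1:4], c1 = 1:4, c2 = 5:8)
idx <- matrix(c(1:4, 2, 3, 2, 3), nrow = 4)  # (row, column) pairs

# The coercion that causes the problem: one non-numeric column
# turns the whole matrix into character
print(typeof(as.matrix(df2)))

# Option 1: drop the character column and shift the column indices by one,
# so that as.matrix() only sees integer columns
num_only <- df2[-1]
idx_num <- idx
idx_num[, 2] <- idx[, 2] - 1
res_matrix <- num_only[idx_num]

# Option 2: pick each element directly from its column, which never
# builds a matrix from the whole data frame
res_elementwise <- mapply(function(r, k) df2[[k]][r], idx[, 1], idx[, 2])

print(class(res_matrix))
print(class(res_elementwise))
```

Option 1 is simpler when all non-numeric columns can be dropped up front. Option 2 leaves the column numbering untouched, but it only simplifies to a plain vector when every selected element has the same type.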
