Data frame inside a data frame?

Consider the following example:

df <- data.frame(id=1:10,var1=LETTERS[1:10],var2=LETTERS[6:15])

fun.split <- function(x) tolower(as.character(x))
df$new.letters <- apply(df[ ,2:3],2,fun.split)

df$new.letters.var1
#NULL

colnames(df)
# [1] "id"          "var1"        "var2"        "new.letters"

df$new.letters
#       var1 var2
# [1,]  "a"  "f" 
# [2,]  "b"  "g" 
# [3,]  "c"  "h" 
# [4,]  "d"  "i" 
# [5,]  "e"  "j" 
# [6,]  "f"  "k" 
# [7,]  "g"  "l" 
# [8,]  "h"  "m" 
# [9,]  "i"  "n" 
# [10,] "j"  "o" 

Would someone be so kind as to explain what is happening here? Is there a new data frame inside the data frame?

I was expecting this:

colnames(df)
# id var1 var2 new.letters.var1 new.letters.var2
3 answers

The reason is that you assigned the output of apply, a two-column matrix, to a single new column. The result is therefore a matrix stored in one column. You can convert it back to a normal data.frame with

 do.call(data.frame, df)
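As a minimal sketch of what that call does, the snippet below rebuilds the data frame from the question and flattens it. The name `flat` is only illustrative. Passing each column to data.frame() as a separate argument makes it expand the matrix column and prefix its column names with the name of the original column.

```r
# Rebuild the data frame from the question
df <- data.frame(id = 1:10, var1 = LETTERS[1:10], var2 = LETTERS[6:15])
fun.split <- function(x) tolower(as.character(x))
df$new.letters <- apply(df[, 2:3], 2, fun.split)

# Each column of df becomes one argument of data.frame();
# the matrix column is expanded into one column per matrix column
flat <- do.call(data.frame, df)

ncol(df)     # the matrix still counts as a single column here
ncol(flat)   # the matrix has been split into separate columns
names(flat)
```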

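Alternatively, assign to 2 new columns and use lapply instead of apply. apply converts its input to a matrix, so when the columns have different classes everything ends up as "character". lapply returns a list and keeps the class of each column: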

df[paste0('new.letters', names(df)[2:3])] <- lapply(df[2:3], fun.split)
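To illustrate the difference, here is a small sketch with a hypothetical data frame `mixed` that holds one integer and one character column. The identity-like function is only there to show what each approach returns.

```r
# Hypothetical example data: one integer column, one character column
mixed <- data.frame(n = 1:3, s = c("a", "b", "c"), stringsAsFactors = FALSE)

# apply() converts the data frame to a matrix first,
# so every value is coerced to a common type (character here)
via_apply <- apply(mixed, 2, function(col) col)
class(via_apply[, "n"])

# lapply() works column by column and returns a list,
# so each column keeps its own class
via_lapply <- lapply(mixed, function(col) col)
sapply(via_lapply, class)
```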

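In R, a column of a data frame does not have to be a plain vector; it can also hold a matrix. How can you tell? Check the class of the new column: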

class(df$new.letters)
[1] "matrix"


str(df)
'data.frame':   10 obs. of  4 variables:
 $ id         : int  1 2 3 4 5 6 7 8 9 10
 $ var1       : Factor w/ 10 levels "A","B","C","D",..: 1 2 3 4 5 6 7 8 9 10
 $ var2       : Factor w/ 10 levels "F","G","H","I",..: 1 2 3 4 5 6 7 8 9 10
 $ new.letters: chr [1:10, 1:2] "a" "b" "c" "d" ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr  "var1" "var2"

The matrix carries its own column names, which you can query directly:

colnames(df$new.letters)
[1] "var1" "var2"

These names belong to the matrix, not to the data frame. At the level of df, the names are:

names(df)
[1] "id"          "var1"        "var2"        "new.letters"

new.letters has a dim attribute (it is a matrix), and var1 and var2 are its column names, stored in dimnames:

attributes(df$new.letters)
$dim
[1] 10  2

$dimnames
$dimnames[[1]]
NULL

$dimnames[[2]]
[1] "var1" "var2"

So new.letters is a single column that holds a whole matrix (note that it is a matrix, not a data.frame!).
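Since the new column is a matrix, its parts can be reached with ordinary matrix indexing. The sketch below rebuilds the data frame from the question and shows a few ways to get at the values. It is one possible way to work with the column as it is, not a recommendation to keep it in that shape.

```r
# Rebuild the data frame from the question
df <- data.frame(id = 1:10, var1 = LETTERS[1:10], var2 = LETTERS[6:15])
df$new.letters <- apply(df[, 2:3], 2, function(x) tolower(as.character(x)))

# There is no column called new.letters.var1, so this is NULL
is.null(df$new.letters.var1)

# The matrix is one column of df and has its own dimensions
dim(df$new.letters)

# Index into the matrix by its own column names
df$new.letters[, "var1"]
df[["new.letters"]][1:3, "var2"]
```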

As for how this is displayed, look at the print methods:

methods(print)

Among the print methods there is one for data.frame. There is also one for listof, whose source is short enough to read:

getS3method("print", "listof")
function (x, ...) 
{
    nn <- names(x)
    ll <- length(x)
    if (length(nn) != ll) 
        nn <- paste("Component", seq.int(ll))
    for (i in seq_len(ll)) {
        cat(nn[i], ":\n")
        print(x[[i]], ...)
        cat("\n")
    }
    invisible(x)
}
<bytecode: 0x101afe1c8>
<environment: namespace:base>

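Note the check if (length(nn) != ll): when the number of names does not match the number of components, the components are labelled "Component 1", "Component 2", and so on.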
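A small sketch of that branch, using two hypothetical objects of class "listof" built by hand: one without names, which falls back to the generic labels, and one with names, which prints them as they are.

```r
# Hypothetical objects, created only to exercise print.listof
unnamed <- structure(list(1:3, "a"), class = "listof")
named   <- structure(list(first = 1:3, second = "a"), class = "listof")

# No names: length(nn) is 0, which differs from ll, so generic labels are used
print(unnamed)

# Names present and matching in number: they are printed as given
print(named)
```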


@akrun's answer solved 90% of my problem. But I had data.frames nested inside data.frames, nested inside data.frames, and so on, without knowing how deep the nesting went.

In that case, I thought that sharing a recursive solution might be useful for others who find this thread, like me:

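    unnest_dataframes <- function(x) {

        # Flatten one level of nested data.frame (or matrix) columns
        y <- do.call(data.frame, x)

        # Recurse while any column is still a data.frame, and keep the result
        if (any(vapply(y, is.data.frame, logical(1)))) y <- unnest_dataframes(y)

        y

    }

    new_data <- unnest_dataframes(df)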
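To try the idea on data that really is nested several levels deep, here is a self-contained sketch. The objects `inner`, `mid` and `outer` and the helper `unnest_iter` are made up for illustration. The helper is an iterative variant of the same approach: it keeps calling data.frame() until no column is a data.frame any more.

```r
# Build a data frame nested two levels deep (hypothetical example data)
inner <- data.frame(c = 1:2)
mid <- data.frame(b = 3:4)
mid$inner <- inner          # data.frame stored as a column of mid
outer <- data.frame(a = 5:6)
outer$mid <- mid            # data.frame stored as a column of outer

# Iterative variant: flatten repeatedly until no data.frame columns remain
unnest_iter <- function(x) {
  while (any(vapply(x, is.data.frame, logical(1)))) {
    x <- do.call(data.frame, x)
  }
  x
}

flat <- unnest_iter(outer)
str(flat)
```

A loop avoids having to remember to return the result of the recursive call, which is an easy mistake to make in the recursive form.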

This approach sometimes runs into problems on its own, though. In that case it can help to separate all columns of class "data.frame" from the original data set, unnest them, and then cbind() the result back, like this:

  # Find all columns that are data.frame
  # Assuming your data frame is stored in variable 'y'
  data.frame.cols <- unname(vapply(y, is.data.frame, logical(1)))
  # drop = FALSE keeps a data.frame even if only one column is selected
  z <- y[, !data.frame.cols, drop = FALSE]

  # All columns of class "data.frame"
  dfs <- y[, data.frame.cols, drop = FALSE]

  # Recursively unnest each of these columns
  unnest_dataframes <- function(x) {
    y <- do.call(data.frame, x)
    if (any(vapply(y, is.data.frame, logical(1)))) {
      # Keep the result of the recursive call
      y <- unnest_dataframes(y)
    } else {
      cat('Nested data.frames successfully unpacked\n')
    }
    y
  }

  df2 <- unnest_dataframes(dfs)

  # Combine with original data
  all_columns <- cbind(z, df2)