Subset data frames within a list based on column classes

Question

I have a very large list of data frames. Each list element is a different data frame, the columns hold different types of variables, and the data frames have different numbers of rows. I want to subset the data frames in this list and keep only those columns whose class is "integer" or "numeric", while preserving the data frame structure.

Here is a minimal reproducible example:

x1 <- c(1,2,3,4)
y1 <- c(letters[1:4])
z1 <- as.integer(c(0, 1, 0, 1))
df1 <- data.frame(x1,y1,z1)
str(df1)

x2 <- c(0, 1, 2, 3,4 )
y2 <- as.integer(c(0, 1, 0, 1, 0))
z2 <- c(letters[1:5])
df2 <- data.frame(x2,y2,z2)
str(df2)

list12 <- list(df1, df2)
str(list12)

#the following have not worked or returned errors:
#list12<- sapply(list12, function (x) subset(x, select = class %in%        c('character', 'factor'), drop =FALSE))
#Error in match(x, table, nomatch = 0L) : 
#  'match' requires vector arguments 

#list12 <- list12[sapply(list12, function(x) subset(x, select x %in% class is.numeric(x) || is.integer(x))]
#unexpected symbol

#list12 <- list12[, sapply(list12, function(x) is.numeric(x) || is.integer(x))]
#  incorrect number of dimensions

#list12 <- sapply(list12, function(x) subset(x, select = class is.numeric(x) || is.integer(x))
#unexpected symbol

My expected result is a list of two data frames, each containing only the columns of class integer or numeric.

+4

list r dataframe subset

erasmortg May 27 '15 at 18:01

3 answers

Try:

lapply(list12,function(x) x[vapply(x,class,"") %in% c("integer","numeric")])
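
One caveat when matching on class(): a column can carry more than one class (a date-time column, for example, has class c("POSIXct", "POSIXt")), and in that case vapply(x, class, "") stops with an error because each result must be a single string. The following is a minimal sketch of a variant that tolerates such columns by using inherits(). The helper name keep_by_class and the example data frame are hypothetical and not part of the answer above.

```r
# Hypothetical helper: keep the columns that inherit from any of `classes`.
keep_by_class <- function(df, classes = c("integer", "numeric")) {
  keep <- vapply(df, function(col) inherits(col, classes), logical(1))
  df[keep]
}

# Illustrative data with a multi-class (POSIXct) column.
df <- data.frame(
  x = c(1.5, 2.5),
  n = 1:2,
  s = c("a", "b"),
  t = as.POSIXct(c("2015-05-27 18:01:00", "2015-05-27 18:04:00"), tz = "UTC")
)

res <- keep_by_class(df)
names(res)
```

Applied to the list from the question, this would be called as lapply(list12, keep_by_class).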

+1

nicola May 27 '15 at 18:04

I like David's answer (+1), but using sapply() seems more natural to me.

lapply(list12, function(x) x[sapply(x, is.numeric)])
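
Note that sapply() decides the shape of its result from what it gets back, so for a data frame with zero columns it returns an empty list rather than a logical vector. Where that edge case can occur, vapply() with logical(1) is a type-stable alternative. A minimal sketch, assuming a hypothetical helper name keep_numeric and made-up example data:

```r
# Hypothetical helper: same idea as above, but vapply() always returns
# a logical vector, even when the data frame has no columns.
keep_numeric <- function(df) {
  df[vapply(df, is.numeric, logical(1))]
}

dfs <- list(
  data.frame(a = 1:3, b = c("x", "y", "z")),
  data.frame()  # zero-column edge case
)

result <- lapply(dfs, keep_numeric)
str(result)
```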

+1

Brandon bertelsen May 27 '15 at 18:17

David Arenburg · Accepted Answer · 2015-05-27T18:07:12+0000

Another option is to use Filter inside lapply:

lapply(list12, Filter, f = is.numeric)
# [[1]]
#   x1 z1
# 1  1  0
# 2  2  1
# 3  3  0
# 4  4  1
# 
# [[2]]
#   x2 y2
# 1  0  0
# 2  1  1
# 3  2  0
# 4  3  1
# 5  4  0
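
As a quick cross-check, the three answers can be run side by side on the example from the question. This is only a sketch: the names by_class, by_sapply, by_filter and kept are illustrative. It relies on is.numeric() returning TRUE for both integer and double columns, which is why it selects the same columns as the explicit class match.

```r
# Rebuild the example list from the question.
df1 <- data.frame(x1 = c(1, 2, 3, 4),
                  y1 = letters[1:4],
                  z1 = as.integer(c(0, 1, 0, 1)))
df2 <- data.frame(x2 = c(0, 1, 2, 3, 4),
                  y2 = as.integer(c(0, 1, 0, 1, 0)),
                  z2 = letters[1:5])
list12 <- list(df1, df2)

# The three approaches from the answers.
by_class  <- lapply(list12, function(x) x[vapply(x, class, "") %in% c("integer", "numeric")])
by_sapply <- lapply(list12, function(x) x[sapply(x, is.numeric)])
by_filter <- lapply(list12, Filter, f = is.numeric)

# Column names kept by each approach, per data frame.
kept <- list(
  by_class  = lapply(by_class, names),
  by_sapply = lapply(by_sapply, names),
  by_filter = lapply(by_filter, names)
)
str(kept)

# Each element is still a data frame, not a bare vector or matrix.
all(vapply(by_filter, is.data.frame, logical(1)))
```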
