In R, how to filter lists of lists?

Question

According to the manual, Filter works on vectors, and it also seems to work on lists. For example:

    z <- list(a=1, b=2, c=3)
    Filter(function(i){ z[[i]] > 1 }, z)
    $b
    [1] 2

    $c
    [1] 3

However, it does not work on lists of lists, for example:

    z <- list(z1=list(a=1,b=2,c=3), z2=list(a=1,b=1,c=1), z3=list())
    Filter(function(i){
      if(length(z[[i]])>0){
        if(z[[i]]$b > 1) TRUE else FALSE
      } else FALSE
    }, z)
    Error in z[[i]] : invalid subscript type 'list'

What is the best way to filter lists of lists without using nested loops? There may also be lists of lists of lists...

(I tried nested apply-style calls instead, but could not get them to work.)

Edit: in the second example, here is what I want to get:

 list(z1=list(a=1,b=2,c=3))

that is, without z$z2, since z$z2$b is not greater than 1, and without z$z3, because it is empty.

+4

list filter r

tflutre Aug 1 '11 at 23:44

3 answers

S4m · Answer 1 · 2012-12-14T14:06:24+0000

I think you should use:

 Filter(function(x){length(x)>0 && x[["b"]] > 1},z)

The predicate (the function you use to filter z) is applied to the elements of z, not to their indices.
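As a minimal sketch of that point, the snippet below runs the predicate on the list from the question. The names keep_b_gt_1 and safe_pred are made up for illustration; safe_pred is an assumed variant that wraps the comparison in isTRUE so that an inner list without a "b" element is simply dropped.

```r
z <- list(z1 = list(a = 1, b = 2, c = 3),
          z2 = list(a = 1, b = 1, c = 1),
          z3 = list())

# Filter() hands each element of z (here, each inner list) to the predicate,
# so x is an inner list such as z$z1, never an index or a name.
keep_b_gt_1 <- function(x) length(x) > 0 && x[["b"]] > 1
result <- Filter(keep_b_gt_1, z)

# Hypothetical variant: isTRUE() turns a missing "b" (NULL > 1) into FALSE.
safe_pred <- function(x) is.list(x) && isTRUE(x[["b"]] > 1)
result_safe <- Filter(safe_pred, z)

str(result)
```

Both calls should keep only z1, which matches the output requested in the question's edit.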

Jd long · Answer 2 · 2011-08-02T13:52:47+0000

I had never used Filter before your question, so this was a good exercise for first thing in the morning :)

There are at least a couple of things tripping you up (I think).

Let's start with the first simple anonymous function, but let me make it a stand-alone function so that it is easier to read:

 f <- function(i){ z[[i]] > 1 }

It should jump out at you that this function takes a single argument i, yet its body refers to z. That is not very good "functional" programming :)

So, start by changing this function to:

 f <- function(i){ i > 1 }

And you will see that Filter will actually run against the list of lists:

    z <- list(z1=list(a=1,b=2,c=3), z2=list(a=1,b=1,c=1))
    Filter( f, z)

but it returns:

    > Filter( f, z)
    $z2
    $z2$a
    [1] 1

    $z2$b
    [1] 1

    $z2$c
    [1] 1


    $<NA>
    NULL

which is not exactly what you want. Honestly, I can’t understand why it returns this result, maybe someone can explain it to me.
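One possible explanation, sketched below: it assumes that Filter collects the predicate results with lapply, flattens them with unlist, and indexes the list with the positions that are TRUE, which is how the base R implementation reads at the time of writing. Applied to an inner list, f returns one logical per inner element, so the flattened index is three times longer than z.

```r
z <- list(z1 = list(a = 1, b = 2, c = 3),
          z2 = list(a = 1, b = 1, c = 1))
f <- function(i) i > 1

# f is vectorised over the inner list: one logical per inner element.
per_element <- lapply(z, f)

# Flattening yields six logicals for a list that has only two elements.
ind <- as.logical(unlist(per_element))

# The TRUE positions come from z1$b and z1$c, but they are then used
# as positions in z itself.
picked <- which(ind)
z[picked]
```

Under that assumption, position 2 selects z2, and position 3 lies past the end of z, which would account for the trailing `$<NA>` entry holding NULL.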

@DWin was barking up the right tree when he said there should be a recursive solution. I hacked together a first stab at a recursive function, but it still needs improving:

    fancyFilter <- function(f, x){
      if ( is.list( x[[1]] ) ) #only testing the first element... bad practice
        lapply( x, fancyFilter, f=f ) #recursion FTW!!
      else
        return( lapply(x, Filter, f=f ) )
    }

fancyFilter looks at the first element of the x passed to it, and if that element is a list, it recursively calls fancyFilter on every element of x. But what if element 2 is not a list? That is the kind of case you will need to test for, and decide whether it matters to you. Still, the result of fancyFilter looks close to what you are after:

    > fancyFilter(f, z)
    $z1
    $z1$a
    numeric(0)

    $z1$b
    [1] 2

    $z1$c
    [1] 3


    $z2
    $z2$a
    numeric(0)

    $z2$b
    numeric(0)

    $z2$c
    numeric(0)

You can add some logic to clean up the output so that the FALSE results do not show up as numeric(0). And, obviously, I worked the example using only your simple function, not the more complex function from your second example.
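One way that cleanup might look is sketched below. The helper filterNested is hypothetical (it is not part of base R or of the answer above); it assumes every leaf is a length-one value, checks each element individually instead of only the first, and drops both failing leaves and sub-lists that end up empty.

```r
# Hypothetical helper: apply predicate f to the leaves of a nested list.
filterNested <- function(f, x) {
  # Recurse into every element that is itself a list.
  out <- lapply(x, function(el) {
    if (is.list(el)) filterNested(f, el) else el
  })
  # Keep non-empty sub-lists, and leaves for which f is exactly TRUE.
  keep <- vapply(out, function(el) {
    if (is.list(el)) length(el) > 0 else isTRUE(f(el))
  }, logical(1))
  out[keep]
}

z <- list(z1 = list(a = 1, b = 2, c = 3),
          z2 = list(a = 1, b = 1, c = 1))
res <- filterNested(function(i) i > 1, z)
str(res)
```

With this input, z2 disappears entirely because none of its leaves pass, and z1 keeps only b and c.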

42- · Answer 3 · 2011-08-02T02:28:56+0000

No claims to beauty here, and it does not do a deep search:

    z2 <- lapply(z, function(x){ if( "b" %in% names(x) && x[["b"]] >1 ) x else {} } )
    z2[unlist(lapply(z2, is.null))] <- NULL
    > z2
    $z1
    $z1$a
    [1] 1

    $z1$b
    [1] 2

    $z1$c
    [1] 3

EDIT: this code will traverse a list and gather the nodes that contain a "b" greater than 1. Some more work would be needed to label the nodes properly. First, a list with deeper nesting:

    z <- list(z1=list(a=1,b=2,c=3), z2=list(a=1,b=1,c=1), z3=list(),
              z4 = list(z5=list(a=5,b=6,c=7), z6=list(a=7,b=8,c=9)))
    checkbGT1 <- function(ll){
      root <- list()
      for(i in seq_along(ll) ) {
        if ("b" %in% names(ll[[i]]) && ll[[i]]$b >1) {
          root <- c(root, ll[[i]])
        }else{
          if( length(ll[[i]]) && is.list(ll[[i]]) ) {
            root <- c(root, list(checkbGT1( ll[[i]] )))
          }
        }
      }
      return(root)
    }
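The node labeling mentioned above could be handled roughly as follows. This is only a sketch under stated assumptions: collectBGT1 is a hypothetical name, and the function assumes that every list in the structure is fully named. It stores each matching node under its original name and drops branches that contain no match.

```r
# Hypothetical name-preserving variant of the traversal above.
collectBGT1 <- function(ll) {
  out <- list()
  for (nm in names(ll)) {
    node <- ll[[nm]]
    # Skip leaves and empty lists: they cannot contain a "b" node.
    if (!is.list(node) || length(node) == 0) next
    if ("b" %in% names(node) && isTRUE(node[["b"]] > 1)) {
      out[[nm]] <- node              # keep the matching node under its name
    } else {
      sub <- collectBGT1(node)       # otherwise search one level deeper
      if (length(sub) > 0) out[[nm]] <- sub
    }
  }
  out
}

z <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(a = 1, b = 1, c = 1),
          z3 = list(),
          z4 = list(z5 = list(a = 5, b = 6, c = 7),
                    z6 = list(a = 7, b = 8, c = 9)))
res <- collectBGT1(z)
str(res)
```

On this input the result keeps z1 and z4 (with z5 and z6 inside it), while z2 and z3 are dropped.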
