In R, how to filter lists of lists?

According to the manual, Filter works on vectors, and it also seems to work on lists. For example:

    z <- list(a=1, b=2, c=3)
    Filter(function(i){ z[[i]] > 1 }, z)
    $b
    [1] 2

    $c
    [1] 3

However, it does not work on lists of lists. For example:

    z <- list(z1=list(a=1,b=2,c=3), z2=list(a=1,b=1,c=1), z3=list())
    Filter(function(i){
      if(length(z[[i]])>0){
        if(z[[i]]$b > 1) TRUE else FALSE
      } else FALSE
    }, z)
    Error in z[[i]] : invalid subscript type 'list'

What is the best way to filter lists of lists without using nested loops? They could also be lists of lists of lists ...

(I tried using nested apply calls instead, but could not get that to work.)

Edit: in the second example, here is what I want to get:

 list(z1=list(a=1,b=2,c=3)) 

that is, without z$z2, since z$z2$b is not greater than 1, and without z$z3, because it is empty.

+4
3 answers

I think you should use:

 Filter(function(x){length(x)>0 && x[["b"]] > 1},z) 

The predicate (the function you use to filter z) is applied to the elements of z, not to their indices.
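
To make that concrete, here is a small sketch using the list from the question. The name hasLargeB is only an illustrative label, and the predicate assumes every non-empty sub-list has an element called b:

```r
z <- list(z1 = list(a = 1, b = 2, c = 3),
          z2 = list(a = 1, b = 1, c = 1),
          z3 = list())

# Filter hands each sub-list itself to the predicate (as x),
# never its position or its name.
# Assumption: every non-empty sub-list contains an element named "b".
hasLargeB <- function(x) length(x) > 0 && x[["b"]] > 1

result <- Filter(hasLargeB, z)
str(result)  # shows which sub-lists were kept
```

The length check comes first so that the empty z3 is rejected before x[["b"]] is ever evaluated.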

+3

I had never used Filter before your question, so this was a good exercise first thing in the morning :)

There are at least a couple of things tripping you up (I think).

Let's start with the first simple anonymous function, but let me pull it out into a standalone function so that it is easier to read:

 f <- function(i){ z[[i]] > 1 } 

It should jump out at you that this function takes a single argument i, yet its body refers to z. That is not very good "functional" programming :)
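
As a side note, the first example in the question probably only appeared to work because the values 1, 2 and 3 happen to be valid positions in z. A quick sketch with different, made-up values (10, 20, 30) suggests the same function then fails:

```r
# Same shape as the question's first list, but the values are no longer
# valid positions in a 3-element list (these values are made up).
z <- list(a = 10, b = 20, c = 30)
f <- function(i) { z[[i]] > 1 }   # i is an element (10, 20, 30), not an index

# z[[10]] does not exist, so the call errors; capture the message instead
# of stopping the script.
outcome <- tryCatch(Filter(f, z), error = function(e) conditionMessage(e))
print(outcome)
```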

So, start by changing this function to:

 f <- function(i){ i > 1 } 

And you will see that Filter will actually run against the list of lists:

    z <- list(z1=list(a=1,b=2,c=3), z2=list(a=1,b=1,c=1))
    Filter( f, z)

but it returns:

    > Filter( f, z)
    $z2
    $z2$a
    [1] 1

    $z2$b
    [1] 1

    $z2$c
    [1] 1


    $<NA>
    NULL

which is not exactly what you want. Honestly, I can’t understand why it returns this result, maybe someone can explain it to me.
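
One possible explanation, offered as an assumption about how Filter behaves internally rather than something taken from this thread: the predicate returns a logical vector of length 3 for each sub-list, and if those vectors are flattened before indexing, the TRUE values land at positions 2 and 3 of a list that has only two elements. The sketch below mimics that step by step (per_element, flat and picked are illustrative names):

```r
f <- function(i) { i > 1 }
z <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(a = 1, b = 1, c = 1))

per_element <- lapply(z, f)    # one logical vector of length 3 per sub-list
flat <- unlist(per_element)    # flattened to six logical values
picked <- which(unname(flat))  # positions of the TRUE values

print(picked)
# Indexing a 2-element list with these positions selects z2 plus a
# non-existent third element, which would show up as $<NA> NULL.
print(names(z[picked]))
```

If that reading is right, the fix is simply to make the predicate return a single TRUE or FALSE per element.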

@DWin was barking up the right tree when he said there should be a recursive solution. I took a first crack at a recursive function, but it needs improvement:

    fancyFilter <- function(f, x){
      if ( is.list( x[[1]] ) ) # only testing the first element... bad practice
        lapply( x, fancyFilter, f=f ) # recursion FTW!!
      else
        return( lapply(x, Filter, f=f ) )
    }

fancyFilter looks at the first element of the x passed to it, and if that element is a list, it recursively calls fancyFilter on each element of x. But what if element number 2 is not a list? That is something you would need to check for, and decide whether it matters to you. Still, the result of fancyFilter looks similar to what you need:

    > fancyFilter(f, z)
    $z1
    $z1$a
    numeric(0)

    $z1$b
    [1] 2

    $z1$c
    [1] 3


    $z2
    $z2$a
    numeric(0)

    $z2$b
    numeric(0)

    $z2$c
    numeric(0)

You could add some logic to clean up the output so that the FALSE results do not end up as numeric(0). And, obviously, I built this example using only your simple function, not the more complex one from your second example.
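
A minimal sketch of what such clean-up logic might look like. dropEmpty is a hypothetical helper, not part of the code above, and the input is typed in by hand to mirror the fancyFilter output shown earlier:

```r
# Hypothetical helper: recursively remove zero-length entries
# (such as numeric(0) or sub-lists that end up empty).
dropEmpty <- function(x) {
  if (!is.list(x)) return(x)       # leave atomic values untouched
  x <- lapply(x, dropEmpty)        # clean the children first
  Filter(function(el) length(el) > 0, x)
}

# Hand-written copy of the structure fancyFilter returned above.
filtered <- list(
  z1 = list(a = numeric(0), b = 2, c = 3),
  z2 = list(a = numeric(0), b = numeric(0), c = numeric(0))
)

cleaned <- dropEmpty(filtered)
str(cleaned)
```

Note that this keeps the surviving values of z1 only; it does not restore the whole sub-list the way the question asks.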

+1

No claims to beauty here, and it does not search below the first level:

    z2 <- lapply(z, function(x){ if( "b" %in% names(x) && x[["b"]] >1 ) x else {} } )
    z2[unlist(lapply(z2, is.null))] <- NULL
    > z2
    $z1
    $z1$a
    [1] 1

    $z1$b
    [1] 2

    $z1$c
    [1] 3

EDIT: This code walks the list and collects the nodes that have an element named "b" with a value greater than 1. Labeling the nodes correctly still needs some work. First, a list with deeper nesting:

    z <- list(z1=list(a=1,b=2,c=3), z2=list(a=1,b=1,c=1), z3=list(),
              z4 = list(z5=list(a=5,b=6,c=7), z6=list(a=7,b=8,c=9)))

    checkbGT1 <- function(ll){
      root <- list()
      for(i in seq_along(ll) ) {
        if ("b" %in% names(ll[[i]]) && ll[[i]]$b >1) {
          root <- c(root, ll[[i]])
        } else {
          if( length(ll[[i]]) && is.list(ll[[i]]) ) {
            root <- c(root, list(checkbGT1( ll[[i]] )))
          }
        }
      }
      return(root)
    }
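
For the labeling, one possible variant is sketched below. keepBGT1 is a hypothetical name, and the sketch assumes that every level of the list is named, that names are unique within a level, and that each "b" holds a single number:

```r
z <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(a = 1, b = 1, c = 1),
          z3 = list(),
          z4 = list(z5 = list(a = 5, b = 6, c = 7), z6 = list(a = 7, b = 8, c = 9)))

# Hypothetical variant: keep matching nodes under their original names.
keepBGT1 <- function(ll) {
  out <- list()
  for (nm in names(ll)) {
    node <- ll[[nm]]
    if (!is.list(node) || length(node) == 0) next   # skip leaves and empty lists
    if ("b" %in% names(node) && node[["b"]] > 1) {
      out[[nm]] <- node                             # matching node: keep it whole
    } else {
      sub <- keepBGT1(node)                         # otherwise look one level deeper
      if (length(sub) > 0) out[[nm]] <- sub
    }
  }
  out
}

res <- keepBGT1(z)
str(res)
```

Assigning by name with out[[nm]] instead of concatenating with c() is what keeps the z1, z4, z5 and z6 labels attached to their nodes.
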
0