How to take the element-wise union across a nested list in R

I have a nested list, say lst (all elements are of class integer). I do not know the length of lst in advance; however, I know that each element of lst is a list of length, say, k:

    length(lst[[i]])  # this equals k and is known in advance;
                      # this is true for i = 1 ... length(lst)

How can I take the union of the first elements, of the second elements, ..., and of the k-th elements across all elements of lst?

In particular, if the length of lst is n, I want the following (this is not valid R code):

    # I know that union can only be taken for 2 elements,
    # following is for illustration purposes
    listUnion1 <- union(lst[[1, 1]], lst[[2, 1]], ..., lst[[n, 1]])
    listUnion2 <- union(lst[[1, 2]], lst[[2, 2]], ..., lst[[n, 2]])
    .
    .
    .
    listUnionk <- union(lst[[1, k]], lst[[2, k]], ..., lst[[n, k]])

Any help or pointers are appreciated.

Here is a dataset that can be used, with n = 3 and k = 2:

    list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")),
         structure(list(a = 6:11, b = 1:5), .Names = c("a", "b")),
         structure(list(a = 12, b = 12), .Names = c("a", "b")))
5 answers

Here is a general solution, similar in spirit to @Ramnath's answer, but avoiding union(), which is a binary function. The trick is to note that union() is implemented as:

 unique(c(as.vector(x), as.vector(y))) 

and the bit inside unique() can be achieved by unlisting the n-th component of each list.
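As a small check of that idea, the sketch below compares chaining union() by hand with a single unique(c(...)) call. The vectors x, y and z are made up for illustration and are not part of the question's data.

```r
# Three small integer vectors; purely illustrative data.
x <- 1:5
y <- 4:8
z <- c(8L, 1L, 12L)

# Pairwise union, chained by hand because union() takes two arguments.
chained <- union(union(x, y), z)

# One-shot version: concatenate everything, then drop duplicates.
one_shot <- unique(c(x, y, z))

# Both approaches keep values in order of first appearance.
identical(chained, one_shot)
```

Because unique() keeps the first occurrence of each value, the two results agree in order as well as in content.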

The full solution is then:

    unionFun <- function(n, obj) {
        unique(unlist(lapply(obj, `[[`, n)))
    }
    lapply(seq_along(lst[[1]]), FUN = unionFun, obj = lst)

which gives:

    [[1]]
     [1]  1  2  3  4  5  6  7  8  9 10 11 12

    [[2]]
     [1]  6  7  8  9 10 11  1  2  3  4  5 12

on the data you showed.

A few points to note:

  • we use `[[` to subset obj in unionFun. This is similar to function(x) x$a in @Ramnath's answer. However, we do not need an anonymous function (we use `[[` instead). The equivalent of @Ramnath's answer is: lapply(lst, `[[`, 1)
  • to generalize the above, we replace the 1 with n in unionFun(), and allow our list to be passed in as the argument obj.

Now that we have a function that returns the union of the n-th elements of a given list, we can lapply() over the indices 1, ..., k, applying unionFun() to the sub-elements of lst, using the fact that length(lst[[1]]) equals length(lst[[i]]) for all i.

If it helps to have the names of the n-th elements in the returned object, we can do:

    > unions <- lapply(seq_along(lst[[1]]), FUN = unionFun, obj = lst)
    > names(unions) <- names(lst[[1]])
    > unions
    $a
     [1]  1  2  3  4  5  6  7  8  9 10 11 12

    $b
     [1]  6  7  8  9 10 11  1  2  3  4  5 12
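A possible variant, sketched here under the assumption that every sublist carries the same element names, is to iterate over the names rather than the positions, so the result comes back already named. The helper name union_by_name is hypothetical and not part of the answer above.

```r
# The question's data, written without structure() for brevity.
lst <- list(list(a = 1:5,  b = 6:11),
            list(a = 6:11, b = 1:5),
            list(a = 12,   b = 12))

# Hypothetical helper: union of the element called `nm` across all sublists.
# Assumes every sublist has an element with that name.
union_by_name <- function(nm, obj) {
    unique(unlist(lapply(obj, `[[`, nm)))
}

# sapply() over a character vector uses that vector as names;
# simplify = FALSE keeps the result as a list.
unions <- sapply(names(lst[[1]]), union_by_name, obj = lst, simplify = FALSE)
unions
```

Indexing by name rather than position also tolerates sublists whose elements appear in a different order, which the positional version does not.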

Here is one solution

    # generate dummy data
    x1 = sample(letters[1:5], 20, replace = TRUE)
    x2 = sample(letters[1:5], 20, replace = TRUE)
    df = data.frame(x1, x2, stringsAsFactors = FALSE)

    # find unique elements in each column
    union_df = apply(df, 2, unique)

Let me know if this works.

EDIT: here is a solution for lists, using the data you provided:

    mylist = list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")),
                  structure(list(a = 6:11, b = 1:5), .Names = c("a", "b")),
                  structure(list(a = 12, b = 12), .Names = c("a", "b")))

    list_a = lapply(mylist, function(x) x$a)
    list_b = lapply(mylist, function(x) x$b)

    union_a = Reduce(union, list_a)
    union_b = Reduce(union, list_b)

If your sublists have more than two elements, this code can be generalized.
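One way such a generalization might look is sketched below: the same Reduce(union, ...) step is wrapped in a function and applied to every position of the sublists. The helper name union_at is hypothetical, and the sketch assumes all sublists have the same length.

```r
mylist <- list(list(a = 1:5,  b = 6:11),
               list(a = 6:11, b = 1:5),
               list(a = 12,   b = 12))

# Hypothetical helper: fold union() over the j-th element of every sublist.
union_at <- function(j, obj) {
    Reduce(union, lapply(obj, `[[`, j))
}

# Apply it to each position 1..k, where k is the length of the first sublist.
unions <- lapply(seq_along(mylist[[1]]), union_at, obj = mylist)
names(unions) <- names(mylist[[1]])
unions
```

Reduce() applies the binary union() pairwise from left to right, so no per-element variables such as list_a and list_b are needed.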

Here's another way: use do.call/rbind to line up the lists by "name" in a matrix (of mode list), and then apply unique/do.call to each column of that matrix. (I changed your data a bit so that the unions for "a" and "b" have different lengths, to make sure it works correctly.)

    lst <- list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")),
                structure(list(a = 6:10, b = 1:5), .Names = c("a", "b")),
                structure(list(a = 12, b = 12), .Names = c("a", "b")))

    > apply(do.call(rbind, lst), 2, function(x) unique(do.call(c, x)))
    $a
     [1]  1  2  3  4  5  6  7  8  9 10 12

    $b
     [1]  6  7  8  9 10 11  1  2  3  4  5 12
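To see why this works, it may help to inspect the intermediate object. The sketch below, using the modified data from this answer, shows that do.call(rbind, lst) yields a matrix whose cells are themselves vectors, with one row per sublist and one column per name. The variable names m and union_a are illustrative only.

```r
lst <- list(list(a = 1:5,  b = 6:11),
            list(a = 6:10, b = 1:5),
            list(a = 12,   b = 12))

# rbind() on lists builds a matrix of mode list: each cell holds a vector.
m <- do.call(rbind, lst)

dim(m)       # one row per sublist, one column per element name
is.list(m)   # the matrix is stored as a list with a dim attribute

# Extracting a column gives a plain list of that column's vectors,
# which can then be flattened and de-duplicated.
union_a <- unique(unlist(m[, "a"]))
union_a
```

Since the columns hold vectors of different lengths, apply() cannot simplify its result and returns a list, which is the shape shown in the output above.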

Your data:

    df <- list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")),
               structure(list(a = 6:11, b = 1:5), .Names = c("a", "b")),
               structure(list(a = 12, b = 12), .Names = c("a", "b")))

This gives you the unique values within each nested list:

    library(plyr)
    df.l <- llply(df, function(x) unlist(unique(x)))

    R> df.l
    [[1]]
     [1]  1  2  3  4  5  6  7  8  9 10 11

    [[2]]
     [1]  6  7  8  9 10 11  1  2  3  4  5

    [[3]]
    [1] 12

EDIT

Thanks to Ramnath, I changed the code a bit and hope this answer now matches the needs of your question. For illustration, I have kept the previous answer as well. The slightly modified data now has an extra element, c, in the third sublist.

    df <- list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")),
               structure(list(a = 6:11, b = 1:5), .Names = c("a", "b")),
               structure(list(a = 12, b = 12, c = 10:14), .Names = c("a", "b", "c")))

    fx <- function(x.list) {
        x.names <- names(x.list)
        i <- combn(x.names, 2)
        l <- apply(i, 2, function(y) x.list[y])
        llply(l, unlist)
    }

Now you can apply this function to your data.

    all.l <- llply(df, fx)
    llply(all.l, function(x) llply(x, unique))

    [[1]]
    [[1]][[1]]
     [1]  1  2  3  4  5  6  7  8  9 10 11

    [[2]]
    [[2]][[1]]
     [1]  6  7  8  9 10 11  1  2  3  4  5

    [[3]]
    [[3]][[1]]
    [1] 12

    [[3]][[2]]
    [1] 12 10 11 13 14

    [[3]][[3]]
    [1] 12 10 11 13 14

However, the nested structure is not very user friendly. It may be slightly modified ...


According to the documentation, unlist() is recursive, so regardless of the nesting level of the lists provided, you can get all of their elements by passing them to unlist(). You can get the union within each sublist as follows.

    lst <- list(structure(list(a = 1:5, b = 6:11), .Names = c("a", "b")),
                structure(list(a = 6:11, b = 1:5), .Names = c("a", "b")),
                structure(list(a = 12, b = 12), .Names = c("a", "b")))

    lapply(lst, function(sublst) unique(unlist(sublst)))
    [[1]]
     [1]  1  2  3  4  5  6  7  8  9 10 11

    [[2]]
     [1]  6  7  8  9 10 11  1  2  3  4  5

    [[3]]
    [1] 12