Aligning the lengths of all lists in a list?

Question


I have a list of lists, and I want all of the sub-lists to be the same length,

i.e. to pad them with NA, if necessary, so that they reach the length of the longest list.

Example

    list1 <- list(1, 2, 3)
    list2 <- list(1, 2, 3, 4, 5)
    list3 <- list(1, 2, 3, 4, 5, 6)
    list_lists <- list(list1, list2, list3)

My best attempt so far

    max_length <- max(unlist(lapply (list_lists, FUN = length)))
    # returns the length of the longest list

    list_lists <- lapply (list_lists, function (x) length (x) <- max_length)

The problem is that this replaces every sublist with a single integer equal to max_length ...

    list_lists[[1]]
    > [1] 6

Can anyone help?

+8

list r

francoiskroll Apr 14 '17 at 16:34


5 answers

Try this (where ls is your list):

 lapply(lapply(sapply(ls, unlist), "length<-", max(lengths(ls))), as.list)
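The one-liner above may be easier to follow when unpacked into steps. This is only a sketch, assuming the example data from the question and sublists whose elements are all length-one atomic values; the name `pad_with_na` is hypothetical, and `lapply` is swapped in for `sapply` so that the intermediate result stays a list even when every sublist already has the same length:

```r
# Hypothetical helper: pad every sublist with NA up to the longest length.
pad_with_na <- function(ls) {
  target <- max(lengths(ls))                  # length of the longest sublist
  flat   <- lapply(ls, unlist)                # each sublist -> atomic vector
  padded <- lapply(flat, `length<-`, target)  # extending a vector fills with NA
  lapply(padded, as.list)                     # back to a list of lists
}

list_lists <- list(list(1, 2, 3), list(1, 2, 3, 4, 5), list(1, 2, 3, 4, 5, 6))
result <- pad_with_na(list_lists)
lengths(result)
```

Going through an atomic vector is what produces NA padding; extending a list directly with `length<-` pads with NULL instead.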

+5

989 Apr 14 '17 at 17:08


In lists, NULL looks more appropriate than NA, and it can be added using vector:

    list_lists <- list(list(1, 2, 3),
                       list(1, 2, 3, 4, 5),
                       list(1, 2, 3, 4, 5, 6))

    list_lists2 <- Map(function(x, y){c(x, vector('list', length = y))},
                       list_lists,
                       max(lengths(list_lists)) - lengths(list_lists))

    str(list_lists2)
    #> List of 3
    #>  $ :List of 6
    #>   ..$ : num 1
    #>   ..$ : num 2
    #>   ..$ : num 3
    #>   ..$ : NULL
    #>   ..$ : NULL
    #>   ..$ : NULL
    #>  $ :List of 6
    #>   ..$ : num 1
    #>   ..$ : num 2
    #>   ..$ : num 3
    #>   ..$ : num 4
    #>   ..$ : num 5
    #>   ..$ : NULL
    #>  $ :List of 6
    #>   ..$ : num 1
    #>   ..$ : num 2
    #>   ..$ : num 3
    #>   ..$ : num 4
    #>   ..$ : num 5
    #>   ..$ : num 6

If you really want NAs, just change vector to rep:

    list_lists3 <- Map(function(x, y){c(x, rep(NA, y))},
                       list_lists,
                       max(lengths(list_lists)) - lengths(list_lists))

    str(list_lists3)
    #> List of 3
    #>  $ :List of 6
    #>   ..$ : num 1
    #>   ..$ : num 2
    #>   ..$ : num 3
    #>   ..$ : logi NA
    #>   ..$ : logi NA
    #>   ..$ : logi NA
    #>  $ :List of 6
    #>   ..$ : num 1
    #>   ..$ : num 2
    #>   ..$ : num 3
    #>   ..$ : num 4
    #>   ..$ : num 5
    #>   ..$ : logi NA
    #>  $ :List of 6
    #>   ..$ : num 1
    #>   ..$ : num 2
    #>   ..$ : num 3
    #>   ..$ : num 4
    #>   ..$ : num 5
    #>   ..$ : num 6

Note that in the latter the padding is logical while the data are numeric, so the types do not match unless you specify NA_real_ or coerce NA to match the type of x.
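To illustrate the point about types, here is a minimal sketch that pads with a typed NA. It assumes every element is a double, so NA_real_ is the matching type; the name `pad_typed` is hypothetical and not part of the answer above:

```r
# Hypothetical helper: pad with NA_real_ so the padding is numeric, not logical.
pad_typed <- function(ls) {
  deficit <- max(lengths(ls)) - lengths(ls)  # how many NAs each sublist needs
  Map(function(x, n) c(x, as.list(rep(NA_real_, n))), ls, deficit)
}

list_lists <- list(list(1, 2, 3), list(1, 2, 3, 4, 5), list(1, 2, 3, 4, 5, 6))
padded <- pad_typed(list_lists)
class(padded[[1]][[4]])  # "numeric"
```

For integer or character data the corresponding constants would be NA_integer_ and NA_character_.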

+3

alistaire Apr 14 '17 at 18:26


Try the following:

    funJoeOld <- function(ls) {
        list_length <- sapply(ls, length)
        max_length <- max(list_length)
        lapply(seq_along(ls), function(x) {
            if (list_length[x] < max_length) {
                c(ls[[x]], lapply(1:(max_length - list_length[x]), function(y) NA))
            } else {
                ls[[x]]
            }
        })
    }

    funJoeOld(list_lists)[[1]]
    [[1]]
    [1] 1

    [[2]]
    [1] 2

    [[3]]
    [1] 3

    [[4]]
    [1] NA

    [[5]]
    [1] NA

    [[6]]
    [1] NA

Edit

I just wanted to highlight how important it is to use the right tools in R. Although my solution gives the correct results, it is very inefficient. By replacing sapply(ls, length) with lengths, and lapply(1:z, function(y) NA) with as.list(rep(NA, z)), we get an almost 15-fold speedup. Observe:

    funJoeNew <- function(ls) {
        list_length <- lengths(ls)
        max_length <- max(list_length)
        lapply(seq_along(ls), function(x) {
            if (list_length[x] < max_length) {
                c(ls[[x]], as.list(rep(NA, max_length - list_length[x])))
            } else {
                ls[[x]]
            }
        })
    }

    funAlistaire <- function(ls) {
        Map(function(x, y){c(x, rep(NA, y))}, ls, max(lengths(ls)) - lengths(ls))
    }

    fun989 <- function(ls) {
        lapply(lapply(sapply(ls, unlist), "length<-", max(lengths(ls))), as.list)
    }

Compare Equality

    set.seed(123)
    samp_list <- lapply(sample(1000, replace = TRUE), function(x) {lapply(1:x, identity)})

    ## have to unlist as the NAs in 989 are of the integer
    ## variety and the NAs in Joe/Alistaire are logical
    identical(sapply(fun989(samp_list), unlist), sapply(funJoeNew(samp_list), unlist))
    [1] TRUE

    identical(funJoeNew(samp_list), funAlistaire(samp_list))
    [1] TRUE

Benchmarks

    library(microbenchmark)

    microbenchmark(funJoeOld(samp_list),
                   funJoeNew(samp_list),
                   fun989(samp_list),
                   funAlistaire(samp_list),
                   times = 30, unit = "relative")
    Unit: relative
                        expr       min        lq      mean    median        uq       max neval cld
        funJoeOld(samp_list) 21.825878 23.269846 17.434447 20.803035 18.851403 4.8056784    30   c
        funJoeNew(samp_list)  1.827741  1.841071  2.253294  1.667047  1.780324 2.4659653    30 ab
           fun989(samp_list)  3.108230  3.563780  3.170320  3.790048  3.888632 0.9890681    30  b
     funAlistaire(samp_list)  1.000000  1.000000  1.000000  1.000000  1.000000 1.0000000    30 a

There are two takeaways:

A good understanding of the apply family of functions makes for concise and efficient code (as can be seen from the @alistaire and @989 solutions).
Understanding the nuances of base R as a whole can have a significant effect on performance.
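The two substitutions described in the edit can be checked for equivalence on a small input. This is only a sketch with made-up data (`x` and `z` are illustrative names); it says nothing about timing, which the benchmark above covers:

```r
x <- list(list(1, 2, 3), list(1, 2, 3, 4, 5))

# Both give the per-sublist lengths; lengths() does it in one vectorised call.
same_lengths <- identical(lengths(x), sapply(x, length))

# Both build a list of z NAs; the second avoids calling an R function z times.
z <- 3
same_padding <- identical(lapply(1:z, function(y) NA), as.list(rep(NA, z)))

c(same_lengths, same_padding)
```

One further difference: `rep(NA, 0)` yields an empty vector, whereas `1:0` counts down through 1 and 0, so the `lapply(1:z, ...)` form needs a guard for the case where no padding is required.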

+2

Joseph Wood Apr 14 '17 at 16:45


Not sure if this is what you are looking for, but you can use the lengths function for lists:

    list_lists <- list(unlist(list1), unlist(list2), unlist(list3))
    list_lists1 <- lapply(list_lists, `length<-`, max(lengths(list_lists)))

    > list_lists1
    [[1]]
    [1]  1  2  3 NA NA NA

    [[2]]
    [1]  1  2  3  4  5 NA

    [[3]]
    [1] 1 2 3 4 5 6

Or, to get a list of lists, you can take one further step:

    list_lists2 <- lapply(list_lists1, as.list)

    > list_lists2
    [[1]]
    [[1]][[1]]
    [1] 1

    [[1]][[2]]
    [1] 2

    [[1]][[3]]
    [1] 3

    [[1]][[4]]
    [1] NA

    [[1]][[5]]
    [1] NA

    [[1]][[6]]
    [1] NA


    [[2]]
    [[2]][[1]]
    [1] 1

    [[2]][[2]]
    [1] 2

    [[2]][[3]]
    [1] 3

    [[2]][[4]]
    [1] 4

    [[2]][[5]]
    [1] 5

    [[2]][[6]]
    [1] NA


    [[3]]
    [[3]][[1]]
    [1] 1

    [[3]][[2]]
    [1] 2

    [[3]][[3]]
    [1] 3

    [[3]][[4]]
    [1] 4

    [[3]][[5]]
    [1] 5

    [[3]][[6]]
    [1] 6

+1

PKumar Apr 14 '17 at 16:45


Andrey Shabalin · Accepted Answer · 2017-04-14T16:58:00+0000

Here is your code fixed. The function should return x, not the value of the assignment length(x) <- max_length (which is max_length itself). In addition, I used vectors, not lists, for clarity.

    list1 <- c(1, 2, 3)
    list2 <- c(1, 2, 3, 4, 5)
    list3 <- c(1, 2, 3, 4, 5, 6)
    list_lists <- list(list1, list2, list3)

    max_length <- max(unlist(lapply (list_lists, FUN = length)))

    list_lists <- lapply (list_lists, function (x) {length (x) <- max_length;x})

    # [[1]]
    # [1]  1  2  3 NA NA NA
    #
    # [[2]]
    # [1]  1  2  3  4  5 NA
    #
    # [[3]]
    # [1] 1 2 3 4 5 6

For the original input (lists rather than vectors), the padding is NULL instead of NA, and the result is:

    # [[1]]
    # [[1]][[1]]
    # [1] 1
    #
    # [[1]][[2]]
    # [1] 2
    #
    # [[1]][[3]]
    # [1] 3
    #
    # [[1]][[4]]
    # NULL
    #
    # [[1]][[5]]
    # NULL
    #
    # [[1]][[6]]
    # NULL
    #
    #
    # [[2]]
    # [[2]][[1]]
    # [1] 1
    #
    # [[2]][[2]]
    # [1] 2
    #
    # [[2]][[3]]
    # [1] 3
    #
    # [[2]][[4]]
    # [1] 4
    #
    # [[2]][[5]]
    # [1] 5
    #
    # [[2]][[6]]
    # NULL
    #
    #
    # [[3]]
    # [[3]][[1]]
    # [1] 1
    #
    # [[3]][[2]]
    # [1] 2
    #
    # [[3]][[3]]
    # [1] 3
    #
    # [[3]][[4]]
    # [1] 4
    #
    # [[3]][[5]]
    # [1] 5
    #
    # [[3]][[6]]
    # [1] 6
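The difference between the question's attempt and the fix can be seen side by side. A minimal sketch on the question's nested lists; the names `broken` and `fixed` are only illustrative:

```r
list_lists <- list(list(1, 2, 3), list(1, 2, 3, 4, 5), list(1, 2, 3, 4, 5, 6))
max_length <- max(lengths(list_lists))

# The assignment is the last expression, so its value (max_length) is returned.
broken <- lapply(list_lists, function(x) length(x) <- max_length)

# Returning x explicitly gives back the extended sublist instead.
fixed <- lapply(list_lists, function(x) {
  length(x) <- max_length
  x
})

broken[[1]]      # the integer 6, as in the question
fixed[[1]][[4]]  # NULL: extending a list pads with NULL, not NA
```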
