I am continuing to explore the data.table package. I am working with the following data.table:
    library(data.table)

    demo <- data.table(id   = c(1, 2, 3, 4, 5, 6),
                       sex  = c(1, 2, 1, 2, 2, 2),
                       agef = c(43, 53, 63, 73, 83, 103))

    demo:
    id sex agef
     1   1   43
     2   2   53
     3   1   63
     4   2   73
     5   2   83
     6   2  103
I am trying to create new age/gender band columns such as "F0_34", "F35_44", "F45_54", "F55_59", ..., "F95_GT" and "M0_34", "M35_44", "M45_54", "M55_59", ..., "M95_GT". Both the column names and their 0/1 values should be generated from the sex and agef columns. I can do this the simple way, one column at a time:
    demo <- demo[, F0_34 := ifelse((sex == 2) & (agef >= 0) & (agef <= 34), 1, 0)]
But I was looking for a more elegant solution, so I tried passing the age bands as a list to lapply, as shown below:
    library(stringr)  # provides str_split()

    i <- list("0_34", "35_44", "45_54", "55_59", "60_64", "65_69",
              "70_74", "75_79", "80_84", "85_89", "90_94", "95_GT")

    # Female bands (sex == 2)
    demo[, paste0("F", i) := lapply(i, function(i) lapply(.SD, function(x) {
        l1 <- unlist(str_split(i, "_"))
        if (l1[2] == "GT") l1[2] <- 1000
        l1 <- as.numeric(l1)
        score <- ifelse((sex == 2) & (agef >= l1[1]) & (agef <= l1[2]), 1, 0)
        return(score)
    })), .SDcols = c("sex", "agef"), by = id]

    # Male bands (sex == 1)
    demo[, paste0("M", i) := lapply(i, function(i) lapply(.SD, function(x) {
        l1 <- unlist(str_split(i, "_"))
        if (l1[2] == "GT") l1[2] <- 1000
        l1 <- as.numeric(l1)
        score <- ifelse((sex == 1) & (agef >= l1[1]) & (agef <= l1[2]), 1, 0)
        return(score)
    })), .SDcols = c("sex", "agef"), by = id]
I get the desired result:
    id sex agef F0_34 F35_44 F45_54 F55_59 F60_64 F65_69 F70_74 F75_79 F80_84 F85_89 F90_94 F95_GT M0_34 M35_44 M45_54 M55_59 M60_64 M65_69 M70_74 M75_79 M80_84 M85_89 M90_94 M95_GT
     1   1   43     0      0      0      0      0      0      0      0      0      0      0      0     0      1      0      0      0      0      0      0      0      0      0      0
     2   2   53     0      0      1      0      0      0      0      0      0      0      0      0     0      0      0      0      0      0      0      0      0      0      0      0
     3   1   63     0      0      0      0      0      0      0      0      0      0      0      0     0      0      0      0      1      0      0      0      0      0      0      0
     4   2   73     0      0      0      0      0      0      1      0      0      0      0      0     0      0      0      0      0      0      0      0      0      0      0      0
     5   2   83     0      0      0      0      0      0      0      0      1      0      0      0     0      0      0      0      0      0      0      0      0      0      0      0
     6   2  103     0      0      0      0      0      0      0      0      0      0      0      1     0      0      0      0      0      0      0      0      0      0      0      0
but with some warnings:
    Warning messages:
    1: In `[.data.table`(demographic1, , `:=`(paste0("F", i), ... :
      RHS 1 is length 2 (greater than the size (1) of group 1). The last 1 element(s) will be discarded.
I do not understand this warning. Can someone point out what I am doing wrong?
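One possible reading of the warning: the inner lapply(.SD, ...) runs once per column in .SDcols (sex and agef), so each right-hand-side item would be a list of length 2, while each by = id group has only one row. The sketch below is only an illustration under that assumption, not a confirmed fix. It drops the inner lapply and the grouping and computes each band as a vectorised comparison. The names bands, lo, hi and band_flags are hypothetical helpers, and the open-ended "GT" bound is treated as Inf rather than 1000.

```r
library(data.table)

demo <- data.table(id   = c(1, 2, 3, 4, 5, 6),
                   sex  = c(1, 2, 1, 2, 2, 2),
                   agef = c(43, 53, 63, 73, 83, 103))

# Band labels as in the question ("lower_upper"; "GT" marks an open upper bound)
bands <- c("0_34", "35_44", "45_54", "55_59", "60_64", "65_69",
           "70_74", "75_79", "80_84", "85_89", "90_94", "95_GT")

# Split the labels into lower and upper bounds (assumption: "GT" means no upper limit)
parts <- tstrsplit(bands, "_", fixed = TRUE)
lo <- as.numeric(parts[[1]])
hi <- parts[[2]]
hi[hi == "GT"] <- "Inf"
hi <- as.numeric(hi)

# Hypothetical helper: one 0/1 vector per band for the given sex code
band_flags <- function(sex, agef, sex_code) {
  lapply(seq_along(bands), function(k) {
    as.numeric(sex == sex_code & agef >= lo[k] & agef <= hi[k])
  })
}

f_cols <- paste0("F", bands)
m_cols <- paste0("M", bands)

# Each RHS list element has one value per row, so no grouping is needed
demo[, (f_cols) := band_flags(sex, agef, 2)]
demo[, (m_cols) := band_flags(sex, agef, 1)]
```

Because every element of the right-hand side now has the same length as the table, there is nothing for data.table to discard, and the F and M columns are added in the same order as in the output above.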