Convert data frame to treeNetwork compatible list

Consider the following data frame:

       Country     Provinces          City Zone
    1   Canada   Newfondland      St Johns    A
    2   Canada           PEI Charlottetown    B
    3   Canada   Nova Scotia       Halifax    C
    4   Canada New Brunswick   Fredericton    D
    5   Canada        Quebec            NA   NA
    6   Canada        Quebec   Quebec City   NA
    7   Canada       Ontario       Toronto    A
    8   Canada       Ontario        Ottawa    B
    9   Canada      Manitoba      Winnipeg    C
    10  Canada  Saskatchewan        Regina    D

Is there a reasonable way to convert it to a list compatible with treeNetwork (from the networkD3 package) in the form:

    CanadaPC <- list(name = "Canada", children = list(
      list(name = "Newfoundland", children = list(
        list(name = "St. John's", children = list(list(name = "A"))))),
      list(name = "PEI", children = list(
        list(name = "Charlottetown", children = list(list(name = "B"))))),
      list(name = "Nova Scotia", children = list(
        list(name = "Halifax", children = list(list(name = "C"))))),
      list(name = "New Brunswick", children = list(
        list(name = "Fredericton", children = list(list(name = "D"))))),
      list(name = "Quebec", children = list(
        list(name = "Quebec City"))),
      list(name = "Ontario", children = list(
        list(name = "Toronto", children = list(list(name = "A"))),
        list(name = "Ottawa", children = list(list(name = "B"))))),
      list(name = "Manitoba", children = list(
        list(name = "Winnipeg", children = list(list(name = "C"))))),
      list(name = "Saskatchewan", children = list(
        list(name = "Regina", children = list(list(name = "D")))))))

The goal is to build a Reingold-Tilford tree with an arbitrary number of levels.

I tried several suboptimal procedures, including a messy combination of for loops, but I can't get it in the right format.

Ideally, the function would generalize to any number of columns: the first column is the root (the starting point), and each remaining column is a successive level of children.


Edit

A similar question was asked on the same topic, and @MrFlick provided an interesting recursive function. The original data frame had a fixed set of levels. I introduced NA to add another level of complexity (an arbitrary set of levels) that is not addressed in @MrFlick's initial solution.


Data

    structure(list(
      Country = c("Canada", "Canada", "Canada", "Canada", "Canada",
                  "Canada", "Canada", "Canada", "Canada", "Canada"),
      Provinces = c("Newfondland", "PEI", "Nova Scotia", "New Brunswick",
                    "Quebec", "Quebec", "Ontario", "Ontario", "Manitoba",
                    "Saskatchewan"),
      City = c("St Johns", "Charlottetown", "Halifax", "Fredericton", NA,
               "Quebec City", "Toronto", "Ottawa", "Winnipeg", "Regina"),
      Zone = c("A", "B", "C", "D", NA, NA, "A", "B", "C", "D")),
      class = "data.frame", row.names = c(NA, -10L),
      .Names = c("Country", "Provinces", "City", "Zone"))
r networkd3
1 answer

A better strategy for this scenario might be a recursive split(). Here's an implementation. First, the sample data:

    dd <- structure(list(
      Country = c("Canada", "Canada", "Canada", "Canada", "Canada",
                  "Canada", "Canada", "Canada", "Canada", "Canada"),
      Provinces = c("Newfondland", "PEI", "Nova Scotia", "New Brunswick",
                    "Quebec", "Quebec", "Ontario", "Ontario", "Manitoba",
                    "Saskatchewan"),
      City = c("St Johns", "Charlottetown", "Halifax", "Fredericton", NA,
               "Quebec City", "Toronto", "Ottawa", "Winnipeg", "Regina"),
      Zone = c("A", "B", "C", "D", NA, NA, "A", "B", "C", "D")),
      class = "data.frame", row.names = c(NA, -10L),
      .Names = c("Country", "Provinces", "City", "Zone"))

Note that I replaced the strings "NA" with true NA values. Now the function:

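    rsplit <- function(x) {
      # drop rows whose current level is missing
      x <- x[!is.na(x[, 1]), , drop = FALSE]
      if (nrow(x) == 0) return(NULL)
      # last column: every value becomes a leaf node
      if (ncol(x) == 1) return(lapply(x[, 1], function(v) list(name = v)))
      # group the remaining columns by the current level, then recurse
      s <- split(x[, -1, drop = FALSE], x[, 1])
      unname(mapply(function(v, n) {
        if (!is.null(v)) list(name = n, children = v) else list(name = n)
      }, lapply(s, rsplit), names(s), SIMPLIFY = FALSE))
    }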

Then we can run

 rsplit(dd) 

It seems to work with the test data. The only difference from the desired output is the order of the child nodes.
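The ordering difference comes from split(), which groups by a factor whose levels are sorted alphabetically by default. If the original row order matters, one possible variant (a minimal sketch, not part of the original answer) builds the factor with levels in order of first appearance. The names `rsplit_ordered`, `dd_small`, and `tree` are made up for illustration, and the small data frame is a cut-down stand-in for `dd`. Note also that the function returns a list of root nodes, so the single root is taken with `[[1]]`.

```r
# Hypothetical variant of rsplit() that keeps children in order of appearance.
rsplit_ordered <- function(x) {
  # drop rows whose current level is missing
  x <- x[!is.na(x[, 1]), , drop = FALSE]
  if (nrow(x) == 0) return(NULL)
  # last column: every value becomes a leaf node
  if (ncol(x) == 1) return(lapply(x[, 1], function(v) list(name = v)))
  # factor levels follow first appearance instead of alphabetical order
  f <- factor(x[, 1], levels = unique(x[, 1]))
  s <- split(x[, -1, drop = FALSE], f)
  unname(mapply(function(v, n) {
    if (!is.null(v)) list(name = n, children = v) else list(name = n)
  }, lapply(s, rsplit_ordered), names(s), SIMPLIFY = FALSE))
}

# Illustrative subset of the question's data (assumed, not the full dd)
dd_small <- data.frame(
  Country   = c("Canada", "Canada", "Canada", "Canada"),
  Provinces = c("Quebec", "Quebec", "Ontario", "Ontario"),
  City      = c(NA, "Quebec City", "Toronto", "Ottawa"),
  Zone      = c(NA, NA, "A", "B"),
  stringsAsFactors = FALSE
)

# The result is a list of root nodes; there is one root here, so take it
tree <- rsplit_ordered(dd_small)[[1]]
str(tree, max.level = 3)
```

Assuming networkD3 is installed, `tree` should then have the nested name/children shape that the package's tree functions expect as their `List` argument, such as the `treeNetwork` call mentioned in the question.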
