How to create a dendrogram from a directory tree?

Given the absolute path of a root directory, how can I create a dendrogram object of everything below it, so that I can visualize the directory tree with R?

Suppose the following call returns these leaf nodes:

list.files(path, full.names = TRUE, recursive = TRUE )

    root/a/some/file.R
    root/a/another/file.R
    root/a/another/cool/file.R
    root/b/some/data.csv
    root/b/more/data.csv

I would like to produce a plot in R that resembles the output of the Unix tree program:

    root
    ├── a
    │   ├── another
    │   │   ├── cool
    │   │   │   └── file.R
    │   │   └── file.R
    │   └── some
    │       └── file.R
    └── b
        ├── more
        │   └── data.csv
        └── some
            └── data.csv

It would be especially useful if the solution decomposed the file system tree into two data.frames:

  • a node table (to which I could attach attributes such as date modified)
  • and an edge table (also with attributes)

and then built the dendrogram object from these two data.frames.

3 answers

Here's a possible approach to get what you originally asked for: a tree like the one the system tree program prints. It gives you a data.tree object, which is fairly flexible and can probably be made to plot the way you want, although it's not entirely clear to me what you are after:

    path <- c(
        "root/a/some/file.R",
        "root/a/another/file.R",
        "root/a/another/cool/file.R",
        "root/b/some/data.csv",
        "root/b/more/data.csv"
    )

    library(data.tree)
    library(plyr)

    # One row per path, one column per path component (padded with NA)
    x <- lapply(strsplit(path, "/"), function(z) as.data.frame(t(z)))
    x <- rbind.fill(x)
    # Rebuild the full path string that as.Node() expects
    x$pathString <- apply(x, 1, function(x) paste(trimws(na.omit(x)), collapse="/"))
    (mytree <- data.tree::as.Node(x))

    ## 1  root
    ## 2   ¦--a
    ## 3   ¦   ¦--some
    ## 4   ¦   ¦   °--file.R
    ## 5   ¦   °--another
    ## 6   ¦       ¦--file.R
    ## 7   ¦       °--cool
    ## 8   ¦           °--file.R
    ## 9   °--b
    ## 10      ¦--some
    ## 11      ¦   °--data.csv
    ## 12      °--more
    ## 13          °--data.csv

    plot(mytree)

You can extract the parts you need (I think), but it will require some work on your side to figure out how data.tree converts between data types; see the section on tree conversion in the vignette: https://cran.r-project.org/web/packages/data.tree/vignettes/data.tree.html
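The node and edge tables requested in the question do not strictly need data.tree. As a minimal base R sketch, assuming relative, "/"-separated paths like the ones above (the names prefixes, nodes and edges are illustrative and do not come from any package), every path prefix becomes a node and every non-root node gets one edge from its parent:

```r
paths <- c(
  "root/a/some/file.R",
  "root/a/another/file.R",
  "root/a/another/cool/file.R",
  "root/b/some/data.csv",
  "root/b/more/data.csv"
)

# Hypothetical helper: all prefixes of one path,
# e.g. "root", "root/a", "root/a/some", "root/a/some/file.R"
prefixes <- function(p) {
  parts <- strsplit(p, "/", fixed = TRUE)[[1]]
  vapply(seq_along(parts),
         function(i) paste(parts[seq_len(i)], collapse = "/"),
         character(1))
}

# Node table: one row per distinct prefix (directories and files)
node_ids <- unique(unlist(lapply(paths, prefixes)))
nodes <- data.frame(
  id    = node_ids,
  name  = basename(node_ids),
  depth = lengths(strsplit(node_ids, "/", fixed = TRUE)),
  stringsAsFactors = FALSE
)

# Edge table: each non-root node is linked to its parent directory
non_root <- nodes$id[nodes$depth > 1]
edges <- data.frame(
  from = dirname(non_root),
  to   = non_root,
  stringsAsFactors = FALSE
)
```

Extra attribute columns can then be added to either table by matching on the id, from or to columns.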

I use this approach in the tree function of my pathr package when use.data.tree = TRUE: https://github.com/trinker/pathr#tree

EDIT: Per @Luke's comment below, data.tree::as.Node can take the paths directly:

    (mytree <- data.tree::as.Node(data.frame(pathString = path)))

    ##                 levelName
    ## 1  root
    ## 2   ¦--a
    ## 3   ¦   ¦--some
    ## 4   ¦   ¦   °--file.R
    ## 5   ¦   °--another
    ## 6   ¦       ¦--file.R
    ## 7   ¦       °--cool
    ## 8   ¦           °--file.R
    ## 9   °--b
    ## 10      ¦--some
    ## 11      ¦   °--data.csv
    ## 12      °--more
    ## 13          °--data.csv
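If a stats dendrogram object is needed rather than a data.tree one, it can be assembled by hand, since a dendrogram is a nested list whose nodes carry members, height and label attributes. The following is only a sketch under assumptions: build_dend is a hypothetical helper, all paths share the same first component, each node's height is simply one more than its tallest child, and every leaf holds a dummy value of 1L.

```r
paths <- c(
  "root/a/some/file.R",
  "root/a/another/file.R",
  "root/a/another/cool/file.R",
  "root/b/some/data.csv",
  "root/b/more/data.csv"
)

# Hypothetical helper: `parts_list` holds the path components that remain
# below the node called `label`
build_dend <- function(parts_list, label) {
  parts_list <- parts_list[lengths(parts_list) > 0]

  # Nothing left below this node: it is a leaf (a file)
  if (length(parts_list) == 0) {
    return(structure(1L, label = label, members = 1L, height = 0,
                     leaf = TRUE, class = "dendrogram"))
  }

  # Group the remaining paths by their next component and recurse
  heads <- vapply(parts_list, function(p) p[1], character(1))
  node <- lapply(unique(heads), function(h) {
    build_dend(lapply(parts_list[heads == h], function(p) p[-1]), h)
  })

  child_members <- vapply(node, function(ch) attr(ch, "members"), integer(1))
  child_heights <- vapply(node, function(ch) attr(ch, "height"), numeric(1))

  attr(node, "label")   <- label
  attr(node, "members") <- sum(child_members)       # number of leaves below
  attr(node, "height")  <- 1 + max(child_heights)   # assumption: unit steps
  class(node) <- "dendrogram"
  node
}

parts <- strsplit(paths, "/", fixed = TRUE)
# Assumption: every path starts with the same root component
dend <- build_dend(lapply(parts, function(p) p[-1]), parts[[1]][1])
```

Calling plot(dend, center = TRUE) should then draw the tree; center = TRUE is suggested because this sketch does not set the midpoint attributes that the default layout relies on.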

If you are on Windows, you can use my dir2json package by installing it as follows:

    drat::addRepo("stlarepo")
    install.packages("dir2json")

It can also be used on Linux, but there the shared library is linked against the dynamic GHC libraries, which must be installed on the system (on Windows the DLL is standalone).

    > library(dir2json)
    > cat(dir2tree("src"))
    src
    |
    `- contrib
       |
       +- PACKAGES.gz
       |
       +- PACKAGES
       |
       +- jsonAccess_0.1.1.tar.gz
       |
       +- expansions_1.2.tar.gz
       |
       `- dir2json_2.1.0.tar.gz
    > cat(dir2tree("src", vertical=TRUE))
                                              src
                                               |
                                            contrib
                                               |
          ---------------------------------------------------------------------------
         /          |                 |                       |                      \
    PACKAGES.gz  PACKAGES  jsonAccess_0.1.1.tar.gz  expansions_1.2.tar.gz  dir2json_2.1.0.tar.gz

The package also contains a Shiny application that generates an interactive Reingold-Tilford view of the directory tree:

 > dir2json::shinyDirTree(".") 

[Image: Reingold-Tilford view of a folder]


It is worth adding that the excellent fs package offers the dir_tree function, which brings this functionality to R in a very convenient way.

    tmp_dir <- tempdir()
    # Create some directories
    for (i in 1:10) {
        dir.create(path = file.path(tmp_dir,
                                    basename(tempfile(pattern = "dir")),
                                    basename(tempfile(pattern = "sub_dir"))),
                   recursive = TRUE)
    }
    # Create directory tree
    fs::dir_tree(path = tmp_dir, recurse = TRUE)

Result:

    /tmp/RtmpEhB0ne
    ├── dir15213121dd5903
    │   └── sub_dir1521315a5425ba
    ├── dir152131227b086f
    │   └── sub_dir1521314255d96b
    ├── dir152131353e6603
    │   └── sub_dir1521315b52aeed
    ├── dir15213136870535
    │   └── sub_dir15213127b34f64
    ├── dir1521313bbf738b
    │   └── sub_dir152131473939ea
    ├── dir152131403f4fd5
    │   └── sub_dir152131115296e7
    ├── dir152131503d0d55
    │   └── sub_dir15213114368572
    ├── dir1521316f0bb0c3
    │   └── sub_dir1521314aea266b
    ├── dir1521317fe305e9
    │   └── sub_dir152131bcfe8a
    └── dir1521319800dfb
        └── sub_dir15213129defd4a

In addition to printing the directory tree, the discovered paths can be assigned to an object.

    # Divert the printed tree to a log file so only the return value is kept
    sink(file = tempfile(fileext = ".log"))
    res_fs_tree <- fs::dir_tree(path = tmp_dir, recurse = TRUE)
    sink()
    res_fs_tree[[1]]
    # [1] "/tmp/RtmpEhB0ne/dir15213121dd5903/sub_dir1521315a5425ba"
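A vector of paths like this can be turned into a table of per-node attributes, such as the modification date mentioned in the question. The sketch below is an illustration only: it uses base R's list.files and file.info rather than fs so that it runs without extra packages, and the scratch directory tree_demo and the table name node_attrs are made up for the example.

```r
# Hypothetical scratch directory so the example is self-contained
root <- file.path(tempdir(), "tree_demo")
dir.create(file.path(root, "a", "some"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "a", "some", "file.R"))

# Any character vector of existing paths would work here
found <- list.files(root, full.names = TRUE, recursive = TRUE,
                    include.dirs = TRUE)

# file.info() returns one row of metadata per path
info <- file.info(found)
node_attrs <- data.frame(
  id       = found,
  is_dir   = info$isdir,
  size     = info$size,
  modified = info$mtime,
  stringsAsFactors = FALSE
)
```

The id column holds full paths, so the table can be joined to any node table that is keyed the same way.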
