Is it possible to use spread on multiple columns in tidyr like dcast?

Question


I have the following dummy data:

    library(dplyr)
    library(tidyr)
    library(reshape2)

    dt <- expand.grid(Year = 1990:2014,
                      Product = LETTERS[1:8],
                      Country = paste0(LETTERS, "I")) %>%
      select(Product, Country, Year)
    dt$value <- rnorm(nrow(dt))

I choose two product combinations:

    sdt <- dt %>%
      filter((Product == "A" & Country == "AI") |
             (Product == "B" & Country == "EI"))

and I want to see the values side by side for each combination. I can do this with dcast:

 sdt %>% dcast(Year ~ Product + Country)

Is it possible to do this using spread from the tidyr package?

+61

r tidyr reshape2

mpiktas Jul 24 '14 at 9:27


2 answers

With the new pivot_wider() function introduced in tidyr version 1.0.0, this can be done with a single function call.

pivot_wider() (counterpart: pivot_longer()) works similarly to spread(). However, it offers additional features, such as using multiple key/name columns (and/or multiple value columns). For this purpose, the names_from argument, which indicates the column(s) from which the names of the new variables are taken, can take more than one column name (here Product and Country).

    library("tidyr")

    sdt %>%
      pivot_wider(id_cols = Year, names_from = c(Product, Country)) %>%
      head(2)
    #> # A tibble: 2 x 3
    #>    Year  A_AI    B_EI
    #>   <int> <dbl>   <dbl>
    #> 1  1990 -2.08 -0.113
    #> 2  1991 -1.02 -0.0546
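As a minimal, self-contained sketch of the same idea, the snippet below rebuilds the dummy data from the question and passes two columns to names_from. The set.seed() call, the explicit values_from = value, and the names_sep = "." separator are assumptions added for illustration; they are not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Assumption: a seed is set only to make this sketch reproducible.
set.seed(1)

# Dummy data as in the question: one value per Year/Product/Country.
dt <- expand.grid(Year = 1990:2014,
                  Product = LETTERS[1:8],
                  Country = paste0(LETTERS, "I"))
dt$value <- rnorm(nrow(dt))

# Keep the two Product/Country combinations of interest.
sdt <- dt %>%
  filter((Product == "A" & Country == "AI") |
         (Product == "B" & Country == "EI"))

# Build the new column names from both Product and Country.
# names_sep controls how the two parts are joined (the default is "_").
wide <- sdt %>%
  pivot_wider(id_cols = Year,
              names_from = c(Product, Country),
              values_from = value,
              names_sep = ".")

names(wide)
```

Stating values_from explicitly makes the call independent of whatever other columns the data frame happens to contain.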

See also: https://tidyr.tidyverse.org/articles/pivot.html

+6

hplieninger May 16 '19 at 9:12


akrun · Accepted Answer · 2014-07-24 09:38

One option is to create a new column, Prod_Count, by joining the Product and Country columns with paste, remove the original columns with select, and reshape from 'long' to 'wide' using spread from tidyr.

    library(dplyr)
    library(tidyr)

    sdt %>%
      mutate(Prod_Count = paste(Product, Country, sep = "_")) %>%
      select(-Product, -Country) %>%
      spread(Prod_Count, value) %>%
      head(2)
    #   Year      A_AI       B_EI
    # 1 1990 0.7878674  0.2486044
    # 2 1991 0.2343285 -1.1694878

Alternatively, we can skip a couple of steps by using unite from tidyr (as suggested in @beetroot's comment) and then reshape as before.

    sdt %>%
      unite(Prod_Count, Product, Country) %>%
      spread(Prod_Count, value) %>%
      head(2)
    #   Year      A_AI       B_EI
    # 1 1990 0.7878674  0.2486044
    # 2 1991 0.2343285 -1.1694878
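For readers who want to check that the unite() plus spread() route gives the same wide table as a single pivot_wider() call, here is a small sketch under stated assumptions: the data are rebuilt from the question with an arbitrary seed, and the object names wide_spread and wide_pivot are hypothetical names chosen for illustration.

```r
library(dplyr)
library(tidyr)

# Assumption: a seed is set only so the comparison is reproducible.
set.seed(42)

dt <- expand.grid(Year = 1990:2014,
                  Product = LETTERS[1:8],
                  Country = paste0(LETTERS, "I"))
dt$value <- rnorm(nrow(dt))

sdt <- dt %>%
  filter((Product == "A" & Country == "AI") |
         (Product == "B" & Country == "EI"))

# Route 1: merge the two key columns, then spread the merged key.
wide_spread <- sdt %>%
  unite(Prod_Count, Product, Country) %>%
  spread(Prod_Count, value) %>%
  arrange(Year)

# Route 2: let pivot_wider() combine the key columns itself.
wide_pivot <- sdt %>%
  pivot_wider(id_cols = Year,
              names_from = c(Product, Country),
              values_from = value) %>%
  arrange(Year)

# Compare the value columns of the two results.
all.equal(wide_spread$A_AI, wide_pivot$A_AI)
all.equal(wide_spread$B_EI, wide_pivot$B_EI)
```

Both routes join the key columns with "_" by default, which is why the resulting column names line up.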
