Using spread to create two value columns with tidyr

Question

I have a data frame that looks something like this (see link). I would like to take the output produced below and go one step further by spreading the tone variable across both the n and the avg variables. It seems like this topic might bear on it, but I can't get it to work: Is it possible to use spread on multiple columns in tidyr, similar to dcast?

I would like the final table to have the source variable in one column, followed by the tone-n and tone-avg variables in their own columns. So the column headers should be "source" - "For-n" - "Against-n" - "For-Avg" - "Against-Avg". This is for publication, not for further calculation, so it is about presenting the data. It seems more intuitive to me to present the data this way. Thank you.

#variable1
Politician.For<-sample(seq(0,4,1),50, replace=TRUE)
#variable2
Politician.Against<-sample(seq(0,4,1),50, replace=TRUE)
#Variable3
Activist.For<-sample(seq(0,4,1),50,replace=TRUE)
#variable4
Activist.Against<-sample(seq(0,4,1),50,replace=TRUE)
#dataframe
df<-data.frame(Politician.For, Politician.Against, Activist.For,Activist.Against)

#tidyr
library(dplyr)
library(tidyr)
df %>%
 #Gather all columns; the key column is named 'df'
 gather(df) %>%
 #separate by the period character
 #(default separation character is any non-alphanumeric character)
 separate(col=df, into=c('source', 'tone')) %>%
 #group by both source and tone
 group_by(source,tone) %>%
 #summarise to create totals (n) and averages (avg)
 summarise(n=sum(value), avg=mean(value)) %>%
 #try to spread (this is the step that does not work)
 spread(tone, c('n', 'value'))

r tidyr spread

spindoctor May 11 '15 at 18:46

2 answers

Using data.table syntax (thanks @akrun):

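library(data.table)
dcast(
  # convert to a data.table first so that data.table's own melt method is used
  melt(as.data.table(df), measure.vars = names(df))[, c('source', 'tone') :=
      tstrsplit(variable, '[.]')
    ][, list(
      N   = sum(value),
      avg = mean(value))
    , by = .(source, tone)],
  source ~ tone,
  value.var = c('N', 'avg'))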

Frank May 11, '15 at 19:12

user295691 · Accepted Answer · 2015-05-11T19:12:48+0000

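I think what you want is another gather to break out the sum and the mean as separate observations: the gather(type, val, -source, -tone) step below. After that, unite combines tone and type into a single key, which spread can then turn into columns.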

gather(df, who, value) %>%
    separate(who, into=c('source', 'tone')) %>%
    group_by(source, tone) %>%
    summarise(n=sum(value), avg=mean(value)) %>%
    gather(type, val, -source, -tone) %>%
    unite(stat, c(tone, type)) %>%
    spread(stat, val)

Output:

Source: local data frame [2 x 5]

      source Against_avg Against_n For_avg For_n
1   Activist        1.82        91    1.84    92
2 Politician        1.94        97    1.70    85
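For readers on newer package versions, the same gather / unite / spread idea can be sketched with pivot_longer and pivot_wider. This is only an illustration, assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later; the set.seed call, the result name, and the names_glue pattern are my own choices and are not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Assumption: a fixed seed, only so the illustration is reproducible
set.seed(1)
df <- data.frame(
  Politician.For     = sample(0:4, 50, replace = TRUE),
  Politician.Against = sample(0:4, 50, replace = TRUE),
  Activist.For       = sample(0:4, 50, replace = TRUE),
  Activist.Against   = sample(0:4, 50, replace = TRUE)
)

result <- df %>%
  # split each column name at the period into 'source' and 'tone'
  pivot_longer(everything(),
               names_to = c("source", "tone"),
               names_sep = "\\.") %>%
  group_by(source, tone) %>%
  # n is the sum of the values, avg is their mean (as in the question)
  summarise(n = sum(value), avg = mean(value), .groups = "drop") %>%
  # spread both summary columns at once; names become e.g. For_n, Against_avg
  pivot_wider(names_from = tone,
              values_from = c(n, avg),
              names_glue = "{tone}_{.value}")

print(result)
```

Here pivot_wider accepts several value columns directly, so the intermediate gather and unite steps are not needed.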
