How do I create a new column in a data frame using a for loop to calculate its values in R?

I have two data frames df1 and df2:

    group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3")
    year = c("2000", "2000", "2000", "2015", "2015", "2015")
    items = c("12", "10", "15", "5", "10", "7")
    df1 = data.frame(group, year, items)

    year = c("2000", "2015")
    items = c("37", "22")
    df2 = data.frame(year, items)

df1 contains the number of items per year, broken down by group, and df2 contains the total number of items per year.

I am trying to write a for loop that calculates, for each group, its proportion of that year's total. I am trying to do something like:

    df1$Prop = ""  # create empty column called Prop in df1
    for (i in 1:nrow(df1)) {
      df1$Prop[i] = df1$items / df2$items[df2$year == df1$year[i]]
    }

The loop should compute the proportion for each row (the value from df1 divided by the matching yearly total in df2) and store it in a new column, but this code does not work.

2 answers

You really don't need df2. Here is a simple solution using data.table and only df1. (I am assuming items is a numeric column; if it is not, you will need to convert it first with setDT(df1)[, items := as.numeric(as.character(items))].)

    library(data.table)
    setDT(df1)[, Prop := items/sum(items), by = year]
    df1
    #      group year items      Prop
    # 1: Group 1 2000    12 0.3243243
    # 2: Group 2 2000    10 0.2702703
    # 3:  Group3 2000    15 0.4054054
    # 4: Group 1 2015     5 0.2272727
    # 5: Group 2 2015    10 0.4545455
    # 6:  Group3 2015     7 0.3181818
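A note on the as.numeric(as.character(items)) conversion mentioned above: going through as.character() matters when items is stored as a factor, because as.numeric() on a factor returns the internal level codes rather than the printed values. The sketch below uses a hypothetical vector, items_factor, built as a factor on purpose to show the difference; in recent R versions data.frame() keeps strings as character by default, in which case the extra as.character() step is harmless.

```r
# Hypothetical vector mimicking `items` stored as a factor
# (strings became factors by default in data.frame() before R 4.0.0).
items_factor <- factor(c("12", "10", "15", "5", "10", "7"))

# as.numeric() on a factor returns the internal level codes, not the labels.
codes <- as.numeric(items_factor)

# Converting to character first recovers the actual numbers.
values <- as.numeric(as.character(items_factor))

print(codes)
print(values)
```

Only values holds the original counts, so proportions computed from codes would be wrong without raising any error.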

Another way: if you already have df2, you can join the two and calculate Prop in the same step (again, I assume that items is numeric in the real data):

 setkey(setDT(df1), year)[df2, Prop := items/i.items] 
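For comparison, the same lookup-and-divide idea can be sketched in base R with match(), which is not part of the answer above but needs no extra package. The sketch assumes items has already been converted to numeric in both data frames, and it rebuilds the question's data so that it runs on its own.

```r
# Data from the question, with items stored as numbers (an assumption).
df1 <- data.frame(
  group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3"),
  year  = c("2000", "2000", "2000", "2015", "2015", "2015"),
  items = c(12, 10, 15, 5, 10, 7),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  year  = c("2000", "2015"),
  items = c(37, 22),
  stringsAsFactors = FALSE
)

# match() gives, for each row of df1, the row of df2 with the same year.
idx <- match(df1$year, df2$year)

# Divide each group's count by the matching yearly total.
df1$Prop <- df1$items / df2$items[idx]
print(df1)
```

Because the division is vectorized over all rows at once, no loop is needed.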

Base R alternative

    with(df1, ave(items, year, FUN = function(x) x/sum(x)))
    ## [1] 0.3243243 0.2702703 0.4054054 0.2272727 0.4545455 0.3181818
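If a loop is still wanted, here is a minimal sketch of how the loop from the question could be repaired. Two things stand out in the original: df1$items is not indexed by i, so the whole column is divided on every iteration, and items is stored as text, so it cannot be divided at all. The sketch assumes items is numeric in both data frames and initializes Prop with NA_real_ instead of an empty string; the helper variable total is introduced here only for readability.

```r
# Data from the question, with items stored as numbers (an assumption).
df1 <- data.frame(
  group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3"),
  year  = c("2000", "2000", "2000", "2015", "2015", "2015"),
  items = c(12, 10, 15, 5, 10, 7),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  year  = c("2000", "2015"),
  items = c(37, 22),
  stringsAsFactors = FALSE
)

# Start with a numeric column of missing values rather than "".
df1$Prop <- NA_real_

for (i in seq_len(nrow(df1))) {
  # Yearly total for the year of row i.
  total <- df2$items[df2$year == df1$year[i]]
  # Index items with i so that only this row's count is divided.
  df1$Prop[i] <- df1$items[i] / total
}
print(df1)
```

The vectorized solutions give the same result with less code, so the loop is mainly useful for seeing what went wrong.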

dplyr equivalent of David's data.table solution

    library(dplyr)
    df1$items = as.integer(as.vector(df1$items))
    df1 %>% group_by(year) %>% mutate(Prop = items / sum(items))
    #Source: local data frame [6 x 4]
    #Groups: year
    #    group year items      Prop
    #1 Group 1 2000    12 0.3243243
    #2 Group 2 2000    10 0.2702703
    #3  Group3 2000    15 0.4054054
    #4 Group 1 2015     5 0.2272727
    #5 Group 2 2015    10 0.4545455
    #6  Group3 2015     7 0.3181818

plyr alternative

    library(plyr)
    ddply(df1, .(year), mutate, prop = items/sum(items))

lapply alternative

    do.call(rbind, lapply(split(df1, df1$year), function(x) {
      x$prop = x$items / sum(x$items)  # use the full column name rather than relying on partial matching of x$item
      x
    }))