I have a data frame that I received from
ID <- c("A","A","A","A","B","B","B","B") Type <- c(45,45,46,46,45,45,46,46) Point_A <- c(10,NA,30,40,NA,80,NA,100) Point_B <- c(NA,32,43,NA,65,11,NA,53) df <- data.frame(ID,Type,Point_A,Point_B) ID Type Point_A Point_B 1 A 45 10 NA 2 A 45 NA 32 3 A 46 30 43 4 A 46 40 NA 5 B 45 NA 65 6 B 45 80 11 7 B 46 NA NA 8 B 46 100 53
As long as I found out from this post, I could collapse the data with an identifier and one column.
I am currently using sqldf to sum rows and groups by id and type. Although this does the job for me, it is very slow on a larger dataset.
df1 <- sqldf("SELECT ID, Type, Sum(Point_A) as Point_A, Sum(Point_A) as Point_A FROM df GROUP BY ID, Type")
Please suggest using any other methods that will help solve this problem. I started to study the dplyr and plyr packages, and I find it very interesting, but not knowing how to apply it here.
Desired output
ID Type Point_A Point_B 1 A 45 10 32 2 A 46 70 43 3 B 45 80 76 4 B 46 100 53