Merge two data frames with all combinations

I had a difficult problem: I have two data frames that have one combination no and a specification (color, part) for four different machines. There are a total of 5 specifications for a car (in fact, even more machines and specifications). Here are two examples of data:

df1 <- data.frame( nr=c("000", "000", "000", "001", "002", "002", "003", "004", "004", "004", "005"), Color=c("Red", "Cyan", "Brown", "Blue", "Red", "Green", "DeepBlue", "Orange", "Cyan", "Grey", "Magenta"), mach1=c(1, NA, NA, 1, NA, 1, 1, NA, 1, NA, 1), mach2=c(1, NA, NA, 1, NA, 1, 1, 1, NA, NA, 1), mach3=c(NA, 1, NA, 1, 1, NA, 1, NA, NA, 1, 1), mach4=c(NA, NA, 1, 1, NA, 1, 1, NA, NA, 1, 1)) df2 <- data.frame( nr=c("000", "000", "001", "002", "002", "003", "003", "004", "005", "005"), Part=c("Car", "Tree", "Flag", "Tree", "Road", "Road", "House", "Plane", "House", "Car"), mach1=c(NA, 1, 1, NA, 1, NA, 1, 1, NA, 1), mach2=c(1, NA, 1, NA, 1, NA, 1, 1, 1, NA), mach3=c(NA, 1, 1, 1, NA, 1, NA, 1, 1, NA), mach4=c(NA, 1, 1, 1, NA, 1, NA, 1, 1, NA)) 

So, I have these outputs:

 > df1 nr Color mach1 mach2 mach3 mach4 1 000 Red 1 1 NA NA 2 000 Cyan NA NA 1 NA 3 000 Brown NA NA NA 1 4 001 Blue 1 1 1 1 5 002 Red NA NA 1 NA 6 002 Green 1 1 NA 1 7 003 DeepBlue 1 1 1 1 8 004 Orange NA 1 NA NA 9 004 Cyan 1 NA NA NA 10 004 Grey NA NA 1 1 11 005 Magenta 1 1 1 1 > df2 nr Part mach1 mach2 mach3 mach4 1 000 Car NA 1 NA NA 2 000 Tree 1 NA 1 1 3 001 Flag 1 1 1 1 4 002 Tree NA NA 1 1 5 002 Road 1 1 NA NA 6 003 Road NA NA 1 1 7 003 House 1 1 NA NA 8 004 Plane 1 1 1 1 9 005 House NA 1 1 1 10 005 Car 1 NA NA NA 

Now I would like to combine these two data frames into one new df3 data frame, which displays all the Color and Part combinations for specific machines, such as:

 > df3 nr Color Part mach1 mach2 mach3 mach4 1 000 Red Tree 1 NA NA NA 2 000 Red Car NA 1 NA NA 3 000 Cyan Tree NA NA 1 NA 4 000 Brown Tree NA NA NA 1 5 001 Blue Flag 1 1 1 1 6 002 Green Road 1 1 NA NA 7 002 Red Tree NA NA 1 NA 8 002 Green Tree NA NA NA 1 9 003 Deepblue House 1 1 NA NA 10 003 Deepblue Road NA NA 1 1 11 004 Cyan Plane 1 NA NA NA 12 004 Orange Plane NA 1 NA NA 13 004 Grey Plane NA NA 1 1 14 005 Magenta Car 1 NA NA NA 15 005 Magenta House NA 1 1 1 > 

Any suggestions?

+6
source share
2 answers

You can first change two data frames (using melt from reshape2 ), then merge them (using full_join from dplyr ) and change it back to the original format:

 library(dplyr) library(reshape2) df1.b <- melt(df1) df2.b <- melt(df2) df3 <- full_join(df1.b, df2.b) df3 <- na.omit(df3) df3.b <- dcast(df3, nr + Color + Part ~ variable) nr Color Part mach1 mach2 mach3 mach4 1 000 Brown Tree NA NA NA 1 2 000 Cyan Tree NA NA 1 NA 3 000 Red Car NA 1 NA NA 4 000 Red Tree 1 NA NA NA 5 001 Blue Flag 1 1 1 1 6 002 Green Road 1 1 NA NA 7 002 Green Tree NA NA NA 1 8 002 Red Tree NA NA 1 NA 9 003 DeepBlue House 1 1 NA NA 10 003 DeepBlue Road NA NA 1 1 11 004 Cyan Plane 1 NA NA NA 12 004 Grey Plane NA NA 1 1 13 004 Orange Plane NA 1 NA NA 14 005 Magenta Car 1 NA NA NA 15 005 Magenta House NA 1 1 1 
+6
source

The same idea as @beetroot is implemented using the R reshape() and merge() base:

 reshape(na.omit(merge( reshape(df1,dir='l',idvar=c('nr','Color'),varying=grep('^mach',names(df1)),v.names='mach'), reshape(df2,dir='l',idvar=c('nr','Part'),varying=grep('^mach',names(df2)),v.names='mach') )),dir='w',idvar=c('nr','Color','Part'),sep=''); ## nr Color Part mach1 mach2 mach3 mach4 ## 1 000 Red Tree 1 NA NA NA ## 4 000 Red Car NA 1 NA NA ## 7 000 Cyan Tree NA NA 1 NA ## 10 000 Brown Tree NA NA NA 1 ## 13 001 Blue Flag 1 1 1 1 ## 17 002 Green Road 1 1 NA NA ## 21 002 Red Tree NA NA 1 NA ## 23 002 Green Tree NA NA NA 1 ## 25 003 DeepBlue House 1 1 NA NA ## 27 003 DeepBlue Road NA NA 1 1 ## 29 004 Cyan Plane 1 NA NA NA ## 30 004 Orange Plane NA 1 NA NA ## 31 004 Grey Plane NA NA 1 1 ## 33 005 Magenta Car 1 NA NA NA ## 34 005 Magenta House NA 1 1 1 

(After that, the string names can be corrected if necessary.)

+4
source

All Articles