I'm seeing some unexpected (or at least not completely intuitive) merge behavior. But maybe I just don't understand how it is supposed to work.
First, let's create some dummy data:
    x <- structure(list(
      A = c(2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L),
      B = c(2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L),
      C = c(2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L),
      D = c(2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L),
      E = c(2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L),
      F = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L),
      G = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L),
      H = c(1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L),
      I = c(1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L),
      J = c(2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L),
      K = c(3, 3, 1, 3, 1, 3, 1, 2, 2, 2, 1, 3, 2, 2, 2, 1, NA, 1, 2, 1)),
      .Names = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K"),
      row.names = c(NA, 20L), class = "data.frame")

    # Generate Listing of All Possible Combinations
    y <- list(1:2)
    y <- expand.grid(rep(y, 10))
    colnames(y) <- LETTERS[1:10]
    y <- rbind(y, y, y)
    y$K <- rep(1:3, each = 1024)
    y$mergekey <- sample(1:6, 3072, replace = TRUE)
My expectation is that merging the two data sets with sort=FALSE and all.x=TRUE would give me every row of x with its mergekey attached.
Try the following:
    merge(x,y,all.x=TRUE,sort=FALSE)
       A B C D E F G H I J  K mergekey
    1  2 2 2 2 2 1 2 1 1 2  3        5
    2  2 2 1 1 1 1 2 2 1 1  3        3
    3  2 1 2 2 1 1 2 1 2 2  1        3
    4  2 2 1 2 2 1 2 2 2 2  3        2
    5  1 1 2 2 2 2 2 1 2 2  1        4
    6  2 1 1 1 2 2 2 2 1 2  3        6
    7  1 1 1 1 2 2 2 2 1 2  1        5
    8  2 1 2 2 1 1 2 2 1 1  2        4
    9  2 2 2 1 1 1 2 1 2 2  2        4
    10 2 1 2 2 1 1 2 1 1 1  2        2
    11 2 1 2 1 1 1 2 1 2 2  1        4
    12 2 2 1 2 1 2 2 1 2 1  3        5
    13 2 1 2 1 1 1 2 1 2 2  2        3
    14 2 1 2 1 1 1 2 1 2 2  2        3
    15 2 2 2 1 2 1 2 1 2 2  2        1
    16 2 1 1 2 1 1 2 2 2 2  2        1
    17 2 1 1 1 1 1 2 1 1 2  1        2
    18 1 2 1 1 1 2 2 1 1 1  1        5
    19 2 1 2 1 1 1 2 1 1 1  1        4
    20 2 2 1 2 1 1 1 2 1 2 NA       NA
Now it seems that most of x is indeed left unsorted, but the rows without a match in y (here the row with K = NA, which is row 17 of x) are moved to the end rather than keeping their position.
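A smaller sketch that shows the same effect. Note that `x_small`, `y_small` and the `id` column are made-up toy data for illustration, not the `x` and `y` from above:

```r
# Toy data (hypothetical): the second row of x_small has no partner in y_small
x_small <- data.frame(id = 1:4, key = c(3L, NA, 1L, 2L))
y_small <- data.frame(key = 1:3, mergekey = c(10L, 20L, 30L))

res <- merge(x_small, y_small, all.x = TRUE, sort = FALSE)
print(res)

# id of the last row in the result: this is the unmatched row (id 2),
# even though it was the second row of x_small
res$id[nrow(res)]
```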
So my question is: how do I get the unmatched rows to stay put?
PS: Doesn't it seem a little unintuitive to push the unmatched rows to the end when merge was told not to sort? That does not seem consistent to me either.
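For what it's worth, the only workaround I can think of is to carry the original row position through the merge and reorder afterwards. This is a minimal sketch under my own assumptions: `row_id` is a helper column I made up, and `x_small` / `y_small` are toy data, not the `x` and `y` from above. It should not depend on whatever order merge happens to return:

```r
# Toy data (hypothetical): the second row of x_small has no partner in y_small
x_small <- data.frame(key = c(3L, NA, 1L, 2L),
                      a = c("r1", "r2", "r3", "r4"))
y_small <- data.frame(key = 1:3, mergekey = c(10L, 20L, 30L))

# Remember the original row position of x in a helper column (made-up name)
x_small$row_id <- seq_len(nrow(x_small))

res <- merge(x_small, y_small, all.x = TRUE, sort = FALSE)

# Restore the original order of x, then drop the helper column
res <- res[order(res$row_id), ]
res$row_id <- NULL
rownames(res) <- NULL

print(res)
```

With the data above, that would mean adding `x$row_id <- seq_len(nrow(x))` before the merge. It works, but it feels like something sort=FALSE should make unnecessary.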