Take this simple data frame of related identifiers:
    test <- data.frame(id1=c(10,10,1,1,24,8),id2=c(1,36,24,45,300,11))
    > test
      id1 id2
    1  10   1
    2  10  36
    3   1  24
    4   1  45
    5  24 300
    6   8  11
Now I want to group together all the identifiers that are related. By "related", I mean connected by following the chain of links, so that every identifier reachable along the same chain ends up in the same group. Think of it as a branching structure, i.e.:
    Group 1
    10 --> 1,  1 --> (24,45)
                   24 --> 300
                          300 --> NULL
                   45 --> NULL
    10 --> 36, 36 --> NULL
    Final group members: 10,1,24,36,45,300

    Group 2
    8 --> 11
          11 --> NULL
    Final group members: 8,11
I roughly know the logic I want, but I do not know how to implement it elegantly. I am thinking of using match or %in% recursively to go down each branch, but I am really stumped this time.
The end result I am after is:
    result <- data.frame(group=c(1,1,1,1,1,1,2,2),id=c(10,1,24,36,45,300,8,11))
    > result
      group  id
    1     1  10
    2     1   1
    3     1  24
    4     1  36
    5     1  45
    6     1 300
    7     2   8
    8     2  11
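For what it is worth, the grouping described above looks like finding the connected components of a graph whose edges are the id1/id2 pairs. Below is a rough base R sketch of one way to do that without recursion, assuming the links can be treated as undirected. The function name group_linked_ids is made up for illustration, and the approach (repeatedly passing the smallest label along each link until nothing changes) is only one possible technique, not necessarily the most efficient one.

```r
test <- data.frame(id1 = c(10, 10, 1, 1, 24, 8),
                   id2 = c(1, 36, 24, 45, 300, 11))

# Hypothetical helper: assigns a group number to every identifier,
# treating each row of `edges` as an undirected link between id1 and id2.
group_linked_ids <- function(edges) {
  ids  <- unique(c(edges$id1, edges$id2))
  from <- match(edges$id1, ids)   # position of each id1 in `ids`
  to   <- match(edges$id2, ids)   # position of each id2 in `ids`

  # Start with every identifier in its own group.
  label <- seq_along(ids)

  repeat {
    new_label <- label
    # Smallest label currently found at either end of each link.
    low <- pmin(label[from], label[to])
    # Give both ends of every link that smaller label.
    for (k in seq_along(from)) {
      new_label[from[k]] <- min(new_label[from[k]], low[k])
      new_label[to[k]]   <- min(new_label[to[k]],   low[k])
    }
    # Stop once a full pass changes nothing.
    if (identical(new_label, label)) break
    label <- new_label
  }

  # Renumber the groups 1, 2, ... in order of first appearance,
  # then sort the rows by group.
  out <- data.frame(group = match(label, unique(label)), id = ids)
  out <- out[order(out$group), ]
  rownames(out) <- NULL
  out
}

res <- group_linked_ids(test)
print(res)
```

On the test data this should put 10, 1, 24, 36, 45 and 300 in group 1 and 8 and 11 in group 2. The loop over links is written for readability rather than speed, so for large data a dedicated graph package would likely be a better fit.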