I have a network table saved as a CSV file (read into a data frame) that looks like this:
a b 1
b a 3
a c 2
a d 2
c a 2
I want to keep only the pairs of values that are duplicated when order is ignored, for example
a b 1
b a 3
Each such pair should be saved once, without the third column, so the output is:
a b
a c
Other values should be omitted. How can I achieve this in R? Thanks in advance!
Update: My file is also very large (about 100 MB, maybe 70 thousand lines), so I need a solution that runs quickly. I first tried sorting each row and then checking for duplicates, but that is too slow.
Here is my code:
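ud <- function(df) {
  # Sort the first two columns within each row so that "b a" becomes "a b"
  df[1:2] <- t(apply(df[1:2], 1, sort))
  # Keep the rows whose (sorted) pair has already appeared earlier
  out <- df[duplicated(df[1:2]), ]
  # Drop the third column
  out[3] <- NULL
  write.table(out, file = "D:/out.txt", sep = " ", row.names = FALSE, col.names = FALSE)
}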
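The slow step in the function above is most likely the row-by-row `apply(..., 1, sort)`. A minimal sketch of a vectorized variant, assuming the first two columns hold the node labels (character or factor), could use `pmin()`/`pmax()` to put each pair in a fixed order. The name `ud_fast` and the column names `V1`, `V2`, `V3` are only placeholders for illustration, and the `unique()` call is an extra that the original function does not have:

```r
# Sketch only: vectorized ordering of each pair, no per-row apply/sort.
# Assumes columns 1 and 2 hold the node labels (character or factor).
ud_fast <- function(df) {
  a <- as.character(df[[1]])
  b <- as.character(df[[2]])
  # Canonical order for every pair at once: smaller label first
  pairs <- data.frame(V1 = pmin(a, b), V2 = pmax(a, b),
                      stringsAsFactors = FALSE)
  # Keep pairs that were already seen; unique() collapses a pair
  # that occurs more than twice into a single row
  unique(pairs[duplicated(pairs), , drop = FALSE])
}

# Example data from the question
df <- data.frame(V1 = c("a", "b", "a", "a", "c"),
                 V2 = c("b", "a", "c", "d", "a"),
                 V3 = c(1, 3, 2, 2, 2),
                 stringsAsFactors = FALSE)
out <- ud_fast(df)
print(out)  # keeps the pairs a b and a c
```

The result can then be written with the same `write.table()` call as in my function. Note that, like the original, this also treats a pair repeated in the same direction (e.g. `a b` twice) as a duplicate.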