I am having trouble getting my data into the right shape for analysis. Suppose I have data in this form:
df1:
V1  V2df1
a   H
b   Y
c   Y

df2:
V1  V2df2
a   Y
j   H
b   Y
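For reproducibility, the two example inputs could be built in R like this:

df1 <- data.frame(V1 = c("a", "b", "c"), V2df1 = c("H", "Y", "Y"), stringsAsFactors = FALSE)
df2 <- data.frame(V1 = c("a", "j", "b"), V2df2 = c("Y", "H", "Y"), stringsAsFactors = FALSE)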
and three more (five data sets in total, of various lengths). I am trying to do the following. First, I need to find all the elements common to the first column (V1) of every data set - in this case a and b. Then, based on these common elements, I want to build a unified data set in which each common V1 value appears once and the values from the other columns of all five data sets are collected into the same row. To continue the example, my result should look something like this:
V1  V2df1  V2df2
a   H      Y
b   Y      Y
I managed to write some code, but unfortunately the results are incorrect. What I did: I read the first column of each file into its own variable (for example a <- df1[,1], etc.) and found the values common to all of them:
red<-Reduce(intersect, list(a,b,c,d,e))
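For context, a minimal self-contained version of this step might look like the following (df3, df4 and df5 are placeholder names for the remaining data sets, and the V1 columns are assumed to be character vectors rather than factors):

# extract the key column (V1) from each of the five data sets
a <- df1$V1
b <- df2$V1
c <- df3$V1
d <- df4$V1
e <- df5$V1
# values of V1 that occur in every data set
red <- Reduce(intersect, list(a, b, c, d, e))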
Then I filtered each data set down to those common values, for example:
df1 <- unique(filter(df1, V1 %in% red))
Next, I ordered each data set by the first column:
df1<-data.frame(df1[with(df1, order(V1)),])
and removed rows with duplicate values in the first column:
df1<- df1[unique(df1$V1),]
Then I created a new dataset:
newdata<-data.frame(V1common=df1[,1], V2df1=df1[,2],V2df2=df2[,2]...)
The ... stands for the remaining data sets, so all five V2 columns are included. I do end up with the same number of rows in every column (a good sign, since each data set should have the same number of rows after the intersection), and the sorted columns are appended, but something in the result still doesn't add up. Thanks for any advice. (I have omitted library() calls and similar; the code is only for illustration.)
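For reference, here is a minimal, self-contained sketch in base R of the steps described above (df3, df4 and df5 are placeholder names for the remaining data sets; the second column of each data set is taken by position, as in the original code):

# keep only rows whose V1 occurs in every data set, drop duplicate keys,
# and sort by V1 so the rows line up across data sets
prep <- function(df) {
  df <- df[df$V1 %in% red, ]        # base-R equivalent of filter(df, V1 %in% red)
  df <- df[!duplicated(df$V1), ]    # one row per key; unlike df[unique(df$V1), ], which indexes rows by row name
  df[order(df$V1), ]
}

df1 <- prep(df1); df2 <- prep(df2); df3 <- prep(df3)
df4 <- prep(df4); df5 <- prep(df5)

newdata <- data.frame(V1common = df1$V1,
                      V2df1 = df1[, 2], V2df2 = df2[, 2], V2df3 = df3[, 2],
                      V2df4 = df4[, 2], V2df5 = df5[, 2])

With the two example data sets above (plus three more), newdata would then contain one row per common V1 value with the corresponding V2 values side by side.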