R merges with itself

Question

Is it possible to merge data such as

    name,#797,"Stachy, Poland"
    at_rank,#797,1
    to_center,#797,4.70
    predicted,#797,4.70

into the following, with one row per value of the second column and the column names taken from the first column?

           name             at_rank to_center predicted
    #797   "Stachy, Poland" 1       4.70      4.70

As requested, the entire data set: http://sprunge.us/cYSJ

+4

r data-manipulation

Reactormonk Dec 05 '12 at 18:02

3 answers

Check out the reshape package from Hadley. If I understand correctly, you simply want to reshape your data from long to wide.

+1

Btibert3 Dec 05 '12 at 18:20

I think that in this case, all you really need to do is transpose, wrap the result in data.frame, set the column names from the first row, and then delete the first row. Perhaps you can skip the last step with some combination of arguments to data.frame, but I don't know offhand what they are.
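A minimal sketch of the transpose approach, assuming only the four rows shown in the question (a single id, #797). The variable names `csv_text`, `long`, `m` and `wide` are made up for illustration. With several ids in the file, a plain transpose is not enough, since each id needs its own row.

```r
# Inline copy of the four rows from the question (one id only).
csv_text <- 'name,#797,"Stachy, Poland"
at_rank,#797,1
to_center,#797,4.70
predicted,#797,4.70'

long <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)

# Transpose the key and value columns. The result is a character matrix
# whose first row holds the future column names.
m <- t(long[, c("V1", "V3")])

# Keep the value row, then use the first row as column names.
wide <- as.data.frame(m[-1, , drop = FALSE], stringsAsFactors = FALSE)
names(wide) <- unname(m[1, ])
rownames(wide) <- long$V2[1]

# Everything is still character at this point; let R guess the column types.
wide[] <- lapply(wide, type.convert, as.is = TRUE)
wide
```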

0

frankc Dec 05 '12 at 18:27

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2012-12-05T18:35:01+0000

The first problem, reading the data in, should not be an issue if the strings that contain commas are quoted (which they seem to be). Using read.csv with the argument header=FALSE does the trick with the data you provided. (Of course, if the data file has a header row, remove this argument.)
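As a quick check of the point about quoting, here is a small sketch that reads two rows from the question out of an inline string instead of the URL. The names `sample_text` and `sampleDF` are hypothetical. Note that read.csv does not treat `#` as a comment character by default, so the ids survive intact.

```r
# Two rows copied from the question; the quoted field contains a comma.
sample_text <- 'name,#797,"Stachy, Poland"
at_rank,#797,1'

sampleDF <- read.csv(text = sample_text, header = FALSE,
                     stringsAsFactors = FALSE)

# Three columns (V1, V2, V3); the quoted place name stays in a single field.
str(sampleDF)
```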

From there you have several options. Here are two.

reshape (base R) is great for this:

    myDF <- read.csv("http://sprunge.us/cYSJ", header=FALSE)
    myDF2 <- reshape(myDF, direction="wide", idvar="V2", timevar="V1")
    head(myDF2)
    #    V2                V3.name V3.at_rank V3.to_center V3.predicted
    # 1  #1                Kitoman          1         2.41         2.41
    # 5  #2                Hosaena          2         4.23         9.25
    # 9  #3 Vinzelles, Puy-de-Dôme          1         5.20         5.20
    # 13 #4     Whitelee Wind Farm          6         3.29         8.07
    # 17 #5    Steveville, Alberta          1         9.59         9.59
    # 21 #6        Rocher, Ardèche          1         0.13         0.13
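If the `V3.` prefixes are unwanted, they can be stripped afterwards. The following is a sketch on a small inline sample (rows for #797 from the question and #1 from the output above), not the full data set. The names `sample_text`, `sampleDF`, `wide` and the column name `id` are assumptions made for illustration.

```r
# Small inline sample: two ids, four measurements each.
sample_text <- 'name,#797,"Stachy, Poland"
at_rank,#797,1
to_center,#797,4.70
predicted,#797,4.70
name,#1,Kitoman
at_rank,#1,1
to_center,#1,2.41
predicted,#1,2.41'

sampleDF <- read.csv(text = sample_text, header = FALSE,
                     stringsAsFactors = FALSE)

wide <- reshape(sampleDF, direction = "wide", idvar = "V2", timevar = "V1")

# Drop the "V3." prefix that reshape() adds and name the id column.
names(wide) <- sub("^V3\\.", "", names(wide))
names(wide)[names(wide) == "V2"] <- "id"

# V3 was read as character, so convert each column to its natural type.
wide[] <- lapply(wide, type.convert, as.is = TRUE)
rownames(wide) <- NULL
wide
```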

The reshape2 package is also useful in these cases. It has a simpler syntax, and the output is also a bit cleaner (at least in terms of variable names).

    library(reshape2)
    myDFw_2 <- dcast(myDF, V2 ~ V1)
    # Using V3 as value column: use value.var to override.
    head(myDFw_2)
    #       V2 at_rank                                       name predicted to_center
    # 1     #1       1                                    Kitoman      2.41      2.41
    # 2    #10       4                            Icaraí de Minas      6.07      8.19
    # 3   #100       2       Scranton High School (Pennsylvania)      5.78      7.63
    # 4  #1000       1                  Bat & Ball Inn, Clanfield      2.17      2.17
    # 5 #10000       3                                     Tăuteu      1.87      5.87
    # 6 #10001       1 Oak Grove, Northumberland County, Virginia      5.84      5.84
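The "Using V3 as value column" message goes away if the value column is named explicitly through dcast's `value.var` argument. A small sketch on inline sample rows, assuming reshape2 is installed (the block does nothing otherwise); `sample_text`, `sampleDF` and `wide2` are hypothetical names.

```r
# Inline sample: two ids taken from the question and the output above.
sample_text <- 'name,#797,"Stachy, Poland"
at_rank,#797,1
to_center,#797,4.70
predicted,#797,4.70
name,#1,Kitoman
at_rank,#1,1
to_center,#1,2.41
predicted,#1,2.41'

sampleDF <- read.csv(text = sample_text, header = FALSE,
                     stringsAsFactors = FALSE)

wide2 <- NULL
if (requireNamespace("reshape2", quietly = TRUE)) {
  # One row per V2, one column per distinct V1, cells filled from V3.
  wide2 <- reshape2::dcast(sampleDF, V2 ~ V1, value.var = "V3")
  # The cells are still character; convert each column to its natural type.
  wide2[] <- lapply(wide2, type.convert, as.is = TRUE)
  print(wide2)
}
```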
