Reorder data - R

I have a dataframe that looks like this:

         a     b      c     d
 ab      0     0      1     0
 cd -0.415 1.415      0     0
 ef      0     0 0.0811 0.918

Is there an easy way to convert this table to:

         a     b      c     d
 ab      0     0      1     0
 cd -0.415     0      0     0
 cd      0 1.415      0     0
 ef      0     0 0.0811     0
 ef      0     0      0 0.918

If a row of the source table contains two or more non-zero numbers, I want to split it into that many rows, each keeping a single non-zero value in its original column. I do not know how to do this, so any help would be appreciated.

5 answers

Borrowing the data from @AnandaMahto's answer, here is a melt/dcast version, as you requested. The idea: in the dcast formula, every unique combination you want to expand into its own row goes on the left-hand side, and the variable whose values should become columns goes on the right. In this case, the variable names become the columns.

 library(reshape2)

 mydf <- structure(list(a = c(0, -0.415, 0), b = c(0, 1.415, 0),
                        c = c(1, 0, 0.0811), d = c(0, 0, 0.918)),
                   .Names = c("a", "b", "c", "d"), class = "data.frame",
                   row.names = c("ab", "cd", "ef"))

 mydf$rows <- rownames(mydf)
 # Long format: one row per (rows, variable) pair
 m1 <- melt(mydf, id.vars = "rows")
 # One output row per unique (rows, value) combination
 m2 <- dcast(m1, rows + value ~ variable, value.var = "value", fill = 0)
 # Drop the all-zero rows, then the helper column
 m2 <- m2[m2$value != 0, ]
 m2$value <- NULL
 m2
 #   rows      a     b      c     d
 # 2   ab  0.000 0.000 1.0000 0.000
 # 3   cd -0.415 0.000 0.0000 0.000
 # 5   cd  0.000 1.415 0.0000 0.000
 # 7   ef  0.000 0.000 0.0811 0.000
 # 8   ef  0.000 0.000 0.0000 0.918

Here is one way using matrix indexing. (The data is coerced to a matrix, so this works best if all columns are of the same type, as in your example.)

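 reformat.dat <- function(dat) {
   tdat <- t(dat)                 # transpose so non-zeros are visited row by row
   nz <- tdat != 0                # logical mask of the non-zero cells
   i <- col(tdat)[nz]             # original row index of each non-zero value
   j <- row(tdat)[nz]             # original column index of each non-zero value
   out <- matrix(0, sum(nz), ncol(dat))
   # Place the k-th non-zero value in row k, in its original column
   out[cbind(seq_len(sum(nz)), j)] <- tdat[nz]
   rownames(out) <- rownames(dat)[i]
   colnames(out) <- colnames(dat)
   out
 }

 reformat.dat(dat)
 #         a     b      c     d
 # ab  0.000 0.000 1.0000 0.000
 # cd -0.415 0.000 0.0000 0.000
 # cd  0.000 1.415 0.0000 0.000
 # ef  0.000 0.000 0.0811 0.000
 # ef  0.000 0.000 0.0000 0.918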
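
The key step in the function above is assigning through a two-column index matrix, where each row of the index is a (row, column) pair. A minimal sketch of that idea on a toy matrix (the names `out`, `idx` and the values are illustrative only, not taken from the question):

```r
# Toy illustration of indexing a matrix with a two-column (row, col) matrix.
out <- matrix(0, nrow = 3, ncol = 4)

# Each row of idx addresses one cell: (1,2), (2,4), (3,1)
idx <- cbind(1:3, c(2, 4, 1))

# One value is assigned per (row, col) pair; all other cells stay 0
out[idx] <- c(10, 20, 30)
out
```

Because the first index column is simply 1, 2, 3, ..., every value lands in its own row, which is how the answer spreads the non-zero entries over separate rows.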

Here's a direct solution using diag:

 o <- apply(df, 1, function(x) {
   m <- diag(x)
   colnames(m) <- names(x)
   m <- m[rowSums(m == 0) != length(x), , drop = FALSE]
   m
 })
 ids <- rep(names(o), sapply(o, nrow))
 o <- do.call(rbind, o)
 row.names(o) <- ids
 o
 #         a     b      c     d
 # ab  0.000 0.000 1.0000 0.000
 # cd -0.415 0.000 0.0000 0.000
 # cd  0.000 1.415 0.0000 0.000
 # ef  0.000 0.000 0.0811 0.000
 # ef  0.000 0.000 0.0000 0.918

The result is a matrix. Use as.data.frame(.) if a data.frame is required.
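
To see what diag does for a single row, here is a small sketch using the "cd" row from the question (the variable names `x`, `m` and `res` are only for illustration). Note that this relies on x having more than one element: diag() of a single number n returns an n x n identity matrix instead.

```r
# The "cd" row from the question as a named vector
x <- c(a = -0.415, b = 1.415, c = 0, d = 0)

# 4 x 4 matrix with the elements of x on the diagonal and zeros elsewhere
m <- diag(x)
colnames(m) <- names(x)

# Keep only the rows that actually hold a non-zero value
res <- m[rowSums(m != 0) > 0, , drop = FALSE]
res
```

Each surviving row carries exactly one of the original non-zero values in its original column, which is the layout the question asks for.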


Here is one approach, but you will need to follow up with some cosmetic changes to fix the row names.

Your data in reproducible form:

 mydf <- structure(list(a = c(0, -0.415, 0), b = c(0, 1.415, 0),
                        c = c(1, 0, 0.0811), d = c(0, 0, 0.918)),
                   .Names = c("a", "b", "c", "d"), class = "data.frame",
                   row.names = c("ab", "cd", "ef"))

Replace zeros with NAs:

 mydf[mydf == 0] <- NA 

Stack your data.frame to make it a "long" data.frame:

 mydf1 <- data.frame(Rows = rownames(mydf), stack(mydf)) 

Make the values in the Rows column unique:

 mydf1$Rows <- make.unique(as.character(mydf1$Rows))
 # Let's see what we have so far...
 mydf1
 #    Rows  values ind
 # 1    ab      NA   a
 # 2    cd -0.4150   a
 # 3    ef      NA   a
 # 4  ab.1      NA   b
 # 5  cd.1  1.4150   b
 # 6  ef.1      NA   b
 # 7  ab.2  1.0000   c
 # 8  cd.2      NA   c
 # 9  ef.2  0.0811   c
 # 10 ab.3      NA   d
 # 11 cd.3      NA   d
 # 12 ef.3  0.9180   d

Now just use xtabs to get the result you are looking for. Wrap it in as.data.frame.matrix if you want a data.frame, and clean up the row names if you need to.

 as.data.frame.matrix(xtabs(values ~ Rows + ind, mydf1))
 #           a     b      c     d
 # ab.2  0.000 0.000 1.0000 0.000
 # cd   -0.415 0.000 0.0000 0.000
 # cd.1  0.000 1.415 0.0000 0.000
 # ef.2  0.000 0.000 0.0811 0.000
 # ef.3  0.000 0.000 0.0000 0.918
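
One possible way to do the row-name cleanup is sketched below. It assumes the original row names never end in a dot followed by digits, and, since a data.frame cannot have duplicate row names, it moves the cleaned labels into an ordinary column. The small `res` data.frame and the column name `id` are made up for illustration; they only mimic the shape of the xtabs result.

```r
# Stand-in for part of the xtabs result, with make.unique() suffixes
res <- data.frame(a = c(0, -0.415, 0), b = c(0, 0, 1.415),
                  row.names = c("ab.2", "cd", "cd.1"))

# Strip a trailing ".<digits>" suffix (assumes real names never end that way)
id <- sub("\\.[0-9]+$", "", rownames(res))

# Duplicates are fine in a regular column, so store the labels there
out <- cbind(id = id, res)
rownames(out) <- NULL
out
```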

I don't think there is an elegant way to get exactly what you ask for, but maybe you can use melt from reshape2 instead? It gives you one row per row/column pair:

 > library(reshape2)
 > # add row names as column
 > df <- cbind(df, names = rownames(df))
 > df <- melt(df, id.var = "names")
 Using as id variables
 > df[df$value != 0, ]
    names variable   value
 2     cd        a -0.4150
 5     cd        b  1.4150
 7     ab        c  1.0000
 9     ef        c  0.0811
 12    ef        d  0.9180