How do I find columns with the same name, add their values, and replace them with a single column holding the sum, using R?

I have a data frame where some consecutive columns have the same name. I need to find them, add their values for each row, drop one column and replace the other with their sum. I need to do this without knowing in advance which names are duplicated, possibly by comparing each column name with the next to see if there is a match.

Can anyone help?

Thanks in advance.

+5
4 answers
> dfrm <- data.frame(a = 1:10, b= 1:10, cc= 1:10, dd=1:10, ee=1:10)
> names(dfrm) <- c("a", "a", "b", "b", "b")
> sapply(unique(names(dfrm)[duplicated(names(dfrm))]), 
      function(x) Reduce("+", dfrm[ , grep(x, names(dfrm))]) )
       a  b
 [1,]  2  3
 [2,]  4  6
 [3,]  6  9
 [4,]  8 12
 [5,] 10 15
 [6,] 12 18
 [7,] 14 21
 [8,] 16 24
 [9,] 18 27
[10,] 20 30

EDIT 2: Using rowSums lets you simplify the first argument to sapply to just unique(names(dfrm)), at the expense of having to remember to include drop = FALSE in "[":

sapply(unique(names(dfrm)), 
       function(x) rowSums( dfrm[ , grep(x, names(dfrm)), drop=FALSE]) )
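
One caveat with grep here: it does pattern matching, so a name such as "a" would also pick up a column called "ab". A minimal sketch with exact matching instead, assuming a made-up data frame dfrm2 (not from the question):

```r
# Hypothetical data: "a" is a substring of "ab", so grep("a", ...) would match both.
dfrm2 <- data.frame(1:3, 4:6, 7:9)
names(dfrm2) <- c("a", "a", "ab")

# Compare names with == so only identical names are grouped together.
sums <- sapply(unique(names(dfrm2)),
               function(x) rowSums(dfrm2[, names(dfrm2) == x, drop = FALSE]))
```

With == the "ab" column stays in its own group, whereas grep would add it into the "a" total as well.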

To handle NA values:

sapply(unique(names(dfrm)),
       function(x) apply(dfrm[grep(x, names(dfrm))], 1,
                         function(y) if (all(is.na(y))) NA else sum(y, na.rm = TRUE)))
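
The reason for the explicit all(is.na(y)) test is that rowSums with na.rm = TRUE reports 0 for a row that is entirely NA. A small sketch of the difference, using a made-up matrix m:

```r
# Hypothetical matrix: the third row is entirely NA.
m <- cbind(c(1, NA, NA), c(2, 5, NA))

# rowSums drops the NAs, so the all-NA row sums to 0.
plain <- rowSums(m, na.rm = TRUE)

# The NA-aware version keeps NA for rows with no observed values.
na_aware <- apply(m, 1, function(y) if (all(is.na(y))) NA else sum(y, na.rm = TRUE))
```

Whether 0 or NA is the right answer for an all-NA row depends on the data, so treat this as one possible convention.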

Edit note: as Tommy's counter-example showed, unique(.) needs to wrap names(.)[.]. The earlier, erroneous code was:

sapply(names(dfrm)[unique(duplicated(names(dfrm)))], 
     function(x) Reduce("+", dfrm[ , grep(x, names(dfrm))]) )
+7

# transpose data frame, sum by group = rowname, transpose back.
t(rowsum(t(dfrm), group = rownames(t(dfrm))))
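
Two things to keep in mind with this approach: the result is a matrix, and rowsum sorts the groups by default, so the column order can change. A sketch that keeps the original order and returns a data frame, assuming a made-up data frame d:

```r
# Hypothetical data with names that are not in alphabetical order.
d <- data.frame(1:3, 4:6, 7:9)
names(d) <- c("b", "b", "a")

# reorder = FALSE keeps groups in order of first appearance ("b", then "a").
res <- as.data.frame(t(rowsum(t(d), group = names(d), reorder = FALSE)))
```

Note that t() coerces the data frame to a matrix, so this only makes sense when all columns are numeric.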
+4

Some sample data:

dfr <- data.frame(
  foo = rnorm(20),
  bar = 1:20,
  bar = runif(20),
  check.names = FALSE
)

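Loop over the unique column names; for each name, pick out the columns with that name and, if there is more than one, add them up with rowSums. EDIT: this now uses sapply rather than lapply.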

unique_col_names <- unique(colnames(dfr))
new_dfr <- sapply(unique_col_names, function(name)
{
  subs <- dfr[, colnames(dfr) == name]
  if(is.data.frame(subs))
    rowSums(subs)
  else
    subs
})
+2

One way is to identify the duplicates using the (unsurprisingly named) function duplicated, and then loop over them to calculate the sums. Here is an example:

dat.dup <- data.frame(x=1:10, x=1:10, x=1:10, y=1:10, y=1:10, z=1:10, check.names=FALSE)
dups <- unique(names(dat.dup)[duplicated(names(dat.dup))])
for (i in dups) {
  dat.dup[[i]] <- rowSums(dat.dup[names(dat.dup) == i])
}
dat <- dat.dup[!duplicated(names(dat.dup))]
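
If only consecutive columns with the same name should be merged, as the question describes, the runs of equal names can define the groups. A rough sketch using rle, assuming a made-up data frame d in which "x" appears in two separate runs:

```r
# Hypothetical data: "x" occurs twice in a row, then again after "y".
d <- data.frame(1:3, 1:3, 4:6, 1:3)
names(d) <- c("x", "x", "y", "x")

# rle finds runs of identical neighbouring names; run_id labels each column's run.
runs <- rle(names(d))
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Sum the columns within each run, then restore the run names.
merged <- as.data.frame(lapply(split(seq_along(d), run_id),
                               function(idx) rowSums(d[idx])))
names(merged) <- runs$values
```

Here the first two "x" columns are summed, while the trailing "x" is left as its own column.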
+1