Calculate the row-wise maximum

I would like to create a new column that holds, for each row, the maximum value across all columns of that row.

Here is an example:

    library(data.table)
    data <- data.table(head(iris))
    data[, Species := NULL]
    data
       Sepal.Length Sepal.Width Petal.Length Petal.Width
    1:          5.1         3.5          1.4         0.2
    2:          4.9         3.0          1.4         0.2
    3:          4.7         3.2          1.3         0.2
    4:          4.6         3.1          1.5         0.2
    5:          5.0         3.6          1.4         0.2
    6:          5.4         3.9          1.7         0.4

I cannot use the max function directly, because it returns a single maximum over all columns and all rows. For example, data[, max_value := max(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)] fills every row with the same value. I want something like this:

       Sepal.Length Sepal.Width Petal.Length Petal.Width max_value
    1:          5.1         3.5          1.4         0.2       5.1
    2:          4.9         3.0          1.4         0.2       4.9
    3:          4.7         3.2          1.3         0.2       4.7
    4:          4.6         3.1          1.5         0.2       4.6
    5:          5.0         3.6          1.4         0.2       5.0
    6:          5.4         3.9          1.7         0.4       5.4
3 answers

I cannot vouch for its speed, but this at least avoids coercing the table to a matrix:

 data[,mymax:=do.call(pmax,.SD)] 
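
The same idea can be narrowed to chosen columns and made tolerant of missing values. The following is only a minimal sketch: the table dt, its columns a, b and id, and the vector num_cols are made up for illustration and are not from the question. It relies on .SDcols to pick the columns and on the na.rm argument of pmax.

```r
library(data.table)

# Hypothetical toy table with a missing value and a non-numeric column
dt <- data.table(a = c(1, 5, NA), b = c(4, 2, 3), id = c("x", "y", "z"))

# Columns to compare (assumed names, chosen for this example)
num_cols <- c("a", "b")

# .SDcols limits .SD to num_cols; na.rm = TRUE is appended to the argument
# list so that pmax ignores NA values instead of returning NA for that row
dt[, row_max := do.call(pmax, c(.SD, list(na.rm = TRUE))), .SDcols = num_cols]

print(dt)
```

Because pmax is vectorised, the whole column is computed in one call rather than once per row.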

Using dplyr , you can do:

    library(dplyr)
    setDF(data) %>%
      rowwise() %>%
      mutate(max = max(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width))
    #Source: local data frame [6 x 5]
    #Groups: <by row>
    #
    #  Sepal.Length Sepal.Width Petal.Length Petal.Width max
    #1          5.1         3.5          1.4         0.2 5.1
    #2          4.9         3.0          1.4         0.2 4.9
    #3          4.7         3.2          1.3         0.2 4.7
    #4          4.6         3.1          1.5         0.2 4.6
    #5          5.0         3.6          1.4         0.2 5.0
    #6          5.4         3.9          1.7         0.4 5.4

Or as @akrun suggested:

    setDF(data) %>%
      mutate(max = pmax(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width))

Both pmax() versions are much faster than the rowwise() approach; in the benchmark below, rowwise() is roughly 17 times slower:

    library(data.table)
    library(dplyr)
    library(microbenchmark)

    n <- 10e6
    nc <- 4
    DT <- data.table(replicate(nc, rnorm(n)))

    mbm <- microbenchmark(
      steven = DT %>% rowwise() %>% mutate(V5 = max(V1, V2, V3, V4)),
      frank  = DT[, c(.SD, list(do.call(pmax, .SD)))],
      akrun  = DT %>% mutate(V5 = pmax(V1, V2, V3, V4)),
      times = 25,
      unit = "relative"
    )


    #Unit: relative
    #   expr      min        lq      mean   median        uq       max neval cld
    # steven 17.93647 18.024734 17.535764 17.42948 17.484920 16.446384    25   b
    #  frank  1.00000  1.000000  1.000000  1.00000  1.000000  1.000000    25  a
    #  akrun  1.00220  1.002281  1.013604  1.00240  1.003089  1.001262    25  a

Use by = 1:nrow(data) to group by row, so that max() is evaluated once per row:

    data[, max_value := max(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width), by = 1:nrow(data)]
    data
       Sepal.Length Sepal.Width Petal.Length Petal.Width max_value
    1:          5.1         3.5          1.4         0.2       5.1
    2:          4.9         3.0          1.4         0.2       4.9
    3:          4.7         3.2          1.3         0.2       4.7
    4:          4.6         3.1          1.5         0.2       4.6
    5:          5.0         3.6          1.4         0.2       5.0
    6:          5.4         3.9          1.7         0.4       5.4
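
If typing every column name is impractical, the per-row grouping can probably be combined with .SD. This is a rough sketch under assumptions: the table dt and its columns a, b and c are hypothetical, and seq_len(nrow(dt)) is used here in place of 1:nrow(dt) only because it also behaves sensibly for an empty table.

```r
library(data.table)

# Hypothetical toy table (not the iris data from the question)
dt <- data.table(a = c(1, 5), b = c(4, 2), c = c(3, 9))

# One group per row: .SD is then a one-row table, which unlist() flattens
# into a plain numeric vector that max() can reduce to a single value
dt[, row_max := max(unlist(.SD)),
   by = seq_len(nrow(dt)),
   .SDcols = c("a", "b", "c")]

print(dt)
```

Naming the columns in .SDcols keeps a previously created result column out of the comparison if the line is run a second time.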
