How to check if a column contains only identical elements in R?

Question

How to check if a column contains only identical elements in R?

Sample data:

x <- matrix(c("Stack","Stack","Stack", "Overflow","Overflow","wolfrevO"), nrow=3,ncol=2)

How to check if x[,1] contains completely identical elements?

If x contains NAs, does the method still work?

thanks

+5

r

Unstack Aug 12 '15 at 14:49


6 answers

You can compare the first value of the vector with the rest of the vector.

 all(x[-1, 1] == x[1, 1]) # [1] TRUE

If NA values are present, this exact method does not work, because the comparison returns NA. It is easily fixed with na.omit(). For instance:

 ## create a vector with an NA value
 x2 <- c(x[, 1], NA)
 ## standard check returns NA
 all(x2 == x2[1])
 # [1] NA
 ## call na.omit() to remove, then compare
 all(na.omit(x2) == x2[1])
 # [1] TRUE

So, with your matrix x, this last line becomes

 all(na.omit(x[-1, 1]) == x[1, 1])
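
One caveat with the comparison above: if the first element is itself NA, the result is NA rather than TRUE or FALSE. A minimal sketch of a wrapper that drops NA values first and then compares against the first remaining value follows. The name all_same_omit_na is hypothetical and not part of the answer, and treating an empty or all-NA vector as "identical" is an assumption you may want to change.

```r
# Hypothetical helper (not from the original answer): drop NAs first,
# then compare every remaining element with the first remaining one.
all_same_omit_na <- function(v) {
  v <- v[!is.na(v)]
  # Assumption: an empty or length-one vector counts as "all identical".
  length(v) <= 1 || all(v == v[1])
}

x <- matrix(c("Stack", "Stack", "Stack", "Overflow", "Overflow", "wolfrevO"),
            nrow = 3, ncol = 2)

all_same_omit_na(x[, 1])                   # [1] TRUE
all_same_omit_na(x[, 2])                   # [1] FALSE
all_same_omit_na(c(NA, "Stack", "Stack"))  # [1] TRUE, even with a leading NA
```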

+3

Rich Scriven Aug 12 '15 at 15:00


You can use the duplicated function to do this:

If sum(!duplicated(x[,1]))==1 returns TRUE, the column contains only one distinct value.

 sum(!duplicated(x[,1]))==1
 # [1] TRUE
 sum(!duplicated(x[,2]))==1
 # [1] FALSE

If x contains NA, this method still works in the sense that all-NA columns return TRUE and columns mixing NA with other values return FALSE.

 x <- matrix(c(NA,NA,NA,"Overflow","Overflow",NA),nrow=3,ncol=2)
 sum(!duplicated(x[,2]))==1
 # [1] FALSE
 sum(!duplicated(x[,1]))==1
 # [1] TRUE

+2

bjoseph Aug 12 '15 at 14:58


You can count the unique elements of a column:

 length(unique(x[,1]))==1

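This works even if your data contain NA, because unique() treats NA as a value of its own: an all-NA column gives TRUE, and a column mixing NA with other values gives FALSE.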

To check every column at once:

 apply(x, 2, function(a) length(unique(a))==1)
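
If NA should be ignored instead of counted as a value, the same idea can be wrapped in a small function. This is only a sketch: the name n_distinct_is_one and its na.rm argument are hypothetical, chosen for illustration.

```r
# Hypothetical helper: with na.rm = TRUE, NA is dropped before counting
# the distinct values; with na.rm = FALSE, NA counts as a value of its own.
n_distinct_is_one <- function(a, na.rm = FALSE) {
  if (na.rm) a <- a[!is.na(a)]
  length(unique(a)) == 1
}

m <- matrix(c("Stack", "Stack", NA, "Overflow", "Overflow", "wolfrevO"),
            nrow = 3, ncol = 2)

apply(m, 2, n_distinct_is_one)                # [1] FALSE FALSE
apply(m, 2, n_distinct_is_one, na.rm = TRUE)  # [1]  TRUE FALSE
```

Note that with na.rm = TRUE an all-NA column has zero distinct values left, so this sketch returns FALSE for it.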

+2

Markusn Aug 12 '15 at 15:17


I agree with @Richard Scriven for characters, factors, etc. (all(x[-1, 1] == x[1, 1])).

However, a more robust approach may be useful for comparing numerical values:

 all.same <- function (x) {
   abs(max(x) - min(x)) < 8.881784e-16
   # the constant above is just .Machine$double.eps*4
 }
 apply(x, 2, all.same)
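
To see why a tolerance matters, here is a small sketch on a numeric matrix (the character matrix from the question cannot be passed to max() - min()). The name all_same_num and the tol argument are hypothetical; the default reuses the 4 * .Machine$double.eps constant from the answer, which is an absolute tolerance and may need adjusting for values far from 1.

```r
# Hypothetical variant of the tolerance check, with the tolerance exposed
# as an argument. diff(range(v)) is max(v) - min(v).
all_same_num <- function(v, tol = 4 * .Machine$double.eps) {
  diff(range(v)) < tol
}

# 0.1 + 0.2 is not exactly 0.3 in double precision.
m <- cbind(a = c(0.3, 0.1 + 0.2, 0.3),
           b = c(1, 2, 3))

all(m[, "a"] == m[1, "a"])   # [1] FALSE: exact comparison is too strict here
apply(m, 2, all_same_num)    # a: TRUE, b: FALSE
```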

0

rbatt Aug 12 '15 at 16:40


Comparison of the proposed methods:

 x <- rep(1, 1000)
 x[5] <- 0
 # note: all(duplicated(x)) is timed for speed only; it is always FALSE for
 # non-empty x, because the first element is never flagged as a duplicate
 microbenchmark::microbenchmark(
   all(duplicated(x)),
   length(unique(x)) == 1,
   dim(table(x)) == 1,
   all(x == x[1]),
   times = 1000)

 Unit: microseconds
                    expr      min       lq        mean   median       uq      max neval cld
      all(duplicated(x))   19.594   21.461   24.688356   22.861   24.727   74.646  1000  b
  length(unique(x)) == 1   21.461   23.793   26.972993   25.193   26.127  156.755  1000  b
      dim(table(x)) == 1 1067.422 1090.282 1144.309131 1123.872 1154.197 2072.795  1000   c
          all(x == x[1])    3.267    4.199    4.629929    4.200    4.666   22.394  1000 a

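Here x is a single column or row. For a matrix, data.frame, or similar object X, equality within rows or columns can be checked with apply. For example, the following tests whether every row of X is constant (use MARGIN = 2 for columns):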

 all(apply(X, 1, function(x){all(x == x[1])}))
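
A short sketch of both directions on a toy matrix, plus a per-column check for a data.frame. The objects X and df below are made-up examples; vapply is used for the data.frame because apply() first converts a data.frame to a matrix, which coerces mixed column types to a common type.

```r
# Toy matrix, filled column-wise: rows are (1, 1, 1) and (2, 2, 2).
X <- matrix(c(1, 2, 1, 2, 1, 2), nrow = 2)

# MARGIN = 1: is every row made of identical elements?
rows_constant <- all(apply(X, 1, function(r) all(r == r[1])))
rows_constant   # [1] TRUE

# MARGIN = 2: is every column made of identical elements?
cols_constant <- all(apply(X, 2, function(cl) all(cl == cl[1])))
cols_constant   # [1] FALSE

# For a data.frame, check each column separately, keeping its own type.
df <- data.frame(a = c(1, 1), b = c("u", "v"))
col_flags <- vapply(df, function(col) all(col == col[1]), logical(1))
col_flags       # a: TRUE, b: FALSE
```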

0

Davor Josipovic Jun 06 '17 at 16:51


Bryan · Accepted Answer · 2015-08-12T15:01:45+0000

If you want to see which elements are duplicated and how many times, you can use table.

 table(x[,1])
 # Stack
 #     3
 table(x[,2])
 # Overflow wolfrevO
 #        2        1

To find out whether there is only one unique value in a column, use dim.

 dim(table(x[,1])) == 1 # [1] TRUE
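
Regarding the NA part of the question: table() drops NA by default, so the dim check ignores missing values unless told otherwise. A minimal sketch, using a made-up vector v, of how the useNA argument changes the result:

```r
# Made-up example vector with one missing value.
v <- c("Stack", "Stack", NA)

# Default: NA is not tabulated, so only "Stack" is counted.
dim(table(v)) == 1                    # [1] TRUE

# useNA = "ifany": NA gets its own cell, so there are two distinct values.
dim(table(v, useNA = "ifany")) == 1   # [1] FALSE
```

Which behavior is right depends on whether NA should count as a distinct element in your data.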
