This is a bit slower than what was suggested above, but I think it handles the case where adjacent groups have equal values:
    findBreaks <- function(x) cumsum(rle(x)$lengths)

    constantGroups <- function(d, groupColIndex=1) {
      d <- d[order(d[, groupColIndex]), ]
      breaks <- lapply(d, findBreaks)
      groupBreaks <- breaks[[groupColIndex]]
      numBreaks <- length(groupBreaks)
      isSubset <- function(x) length(x) <= numBreaks && length(setdiff(x, groupBreaks)) == 0
      unlist(lapply(breaks[-groupColIndex], isSubset))
    }
The intuition is that if a column is constant within each group, then the breaks in the column's values (once the data is sorted by group value) will be a subset of the breaks in the group value.
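To make that concrete, here is a minimal sketch on a made-up data frame. `toy`, `runEnds`, and the column names are hypothetical and are not the `df` from the question. It computes the run ends for each column and checks the subset condition by hand.

```r
# Hypothetical toy data, already sorted by the group column g.
toy <- data.frame(
  g      = c(1, 1, 2, 2, 3),
  const  = c("x", "x", "y", "y", "y"),  # constant within every group
  varies = c(1, 2, 3, 3, 4),            # changes inside group 1
  stringsAsFactors = FALSE
)

# Positions at which each run of equal values ends.
runEnds <- function(x) cumsum(rle(x)$lengths)

gBreaks      <- runEnds(toy$g)       # 2 4 5
constBreaks  <- runEnds(toy$const)   # 2 5
variesBreaks <- runEnds(toy$varies)  # 1 2 4 5

# A column is constant within groups when its breaks are a subset of g's breaks.
constOk  <- length(setdiff(constBreaks, gBreaks)) == 0   # TRUE
variesOk <- length(setdiff(variesBreaks, gBreaks)) == 0  # FALSE
```

Note that `const` has the same value in groups 2 and 3, so it has fewer breaks than `g`, yet it still passes the subset check.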
Now compare it to Hadley's version (slightly modified to determine n):
    # df defined as in the question
    n <- nrow(df)
    changed <- function(x) c(TRUE, x[-1] != x[-n])

    constant_cols2 <- function(df, grp) {
      df <- df[order(df[, grp]), ]
      changes <- lapply(df, changed)
      vapply(changes[-1], identical, changes[[1]], FUN.VALUE = logical(1))
    }

    > system.time(constant_cols2(df, 1))
       user  system elapsed
      1.779   0.075   1.869
    > system.time(constantGroups(df))
       user  system elapsed
      2.503   0.126   2.614

    > df$f <- 1
    > constant_cols2(df, 1)
        a     b     c     d     f
     TRUE  TRUE FALSE FALSE FALSE
    > constantGroups(df)
        a     b     c     d     f
     TRUE  TRUE FALSE FALSE  TRUE
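The difference in the `f` column comes down to exact matching versus subset matching of change points. The sketch below uses made-up vectors; `changedAt` is a hypothetical variant of `changed` that takes the length from its argument instead of a global `n`.

```r
# Hypothetical vectors, already sorted by group.
g     <- c(1, 1, 2, 2, 3)
const <- c("x", "x", "y", "y", "y")  # constant within groups; groups 2 and 3 share "y"

# TRUE at every position where the value differs from the previous one.
changedAt <- function(x) c(TRUE, x[-1] != x[-length(x)])

gChanges     <- changedAt(g)      # TRUE FALSE TRUE FALSE TRUE
constChanges <- changedAt(const)  # TRUE FALSE TRUE FALSE FALSE

# Exact comparison rejects the column because it does not change at row 5.
sameByIdentical <- identical(constChanges, gChanges)                   # FALSE

# Subset comparison accepts it: every change in const is also a change in g.
sameBySubset <- all(which(constChanges) %in% which(gChanges))          # TRUE
```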
David f