Removing columns with the same value from a data frame

I have a data frame similar to this:

 1 1 1 K 1 K K
 2 1 2 K 1 K K
 3 8 3 K 1 K K
 4 8 2 K 1 K K
 1 1 1 K 1 K K
 2 1 2 K 1 K K

I want to delete all columns with the same value, for example, K, so my result would be like this:

 1 1 1 1
 2 1 2 1
 3 8 3 1
 4 8 2 1
 1 1 1 1
 2 1 2 1

I am trying to iterate through the columns, but I am not getting anywhere. Any ideas? Thanks in advance.

4 answers

How about keeping only the columns that have more than one distinct value, regardless of type?

 uniquelength <- sapply(d,function(x) length(unique(x)))
 d <- subset(d, select=uniquelength>1)

(Oh, Roman's question is a fair one: this would also knock out your column 5, which is constant.)

Maybe (edit: thanks to the comments!):

 isfac <- sapply(d,inherits,"factor")
 d <- subset(d,select=!isfac | uniquelength>1)

or

 d <- d[,!isfac | uniquelength>1] 
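
As a quick check of the two variants above, here is a minimal sketch run on the sample data from the question. It assumes the seven-column layout and reads strings as factors via stringsAsFactors = TRUE; that argument is my assumption, since from R 4.0.0 read.table() leaves strings as character by default, in which case isfac would be FALSE everywhere and nothing would be dropped. drop = FALSE is also my addition, so the result stays a data frame even if only one column survives.

```r
# Sample data from the question; strings are read as factors on purpose
d <- read.table(text = "1 1 1 K 1 K K
2 1 2 K 1 K K
3 8 3 K 1 K K
4 8 2 K 1 K K
1 1 1 K 1 K K
2 1 2 K 1 K K", stringsAsFactors = TRUE)

# Number of distinct values in each column
uniquelength <- sapply(d, function(x) length(unique(x)))

# Variant 1: drop every constant column (this also drops the all-1 column V5)
all_varying <- d[, uniquelength > 1, drop = FALSE]

# Variant 2: drop only constant factor columns, keep constant numeric ones
isfac <- sapply(d, is.factor)
no_const_factors <- d[, !isfac | uniquelength > 1, drop = FALSE]

names(all_varying)       # "V1" "V2" "V3"
names(no_const_factors)  # "V1" "V2" "V3" "V5"
```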

Here is a solution that removes any replicated columns (including, for example, pairs of replicated character, numeric, or factor columns). This is how I read the OP's question, and even if that is a misinterpretation, it seems like an interesting problem.

 df <- read.table(text=" 1 1 1 K 1 K K
                         2 1 2 K 1 K K
                         3 8 3 K 1 K K
                         4 8 2 K 1 K K
                         1 1 1 K 1 K K
                         2 1 2 K 1 K K")

 # Need to run duplicated() in 'both directions', since it considers
 # the first example to be **not** a duplicate.
 repdCols <- as.logical(duplicated(as.list(df), fromLast=FALSE) +
                        duplicated(as.list(df), fromLast=TRUE))
 # [1] FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE

 df[!repdCols]
 #   V1 V2 V3 V5
 # 1  1  1  1  1
 # 2  2  1  2  1
 # 3  3  8  3  1
 # 4  4  8  2  1
 # 5  1  1  1  1
 # 6  2  1  2  1
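
The reason for the two passes is easier to see on a small vector. The sketch below uses a made-up character vector x (not from the question) to show that duplicated() on its own leaves the first copy unmarked, and that combining both directions marks every member of a repeated group.

```r
x <- c("a", "b", "a", "c", "a")   # hypothetical example values

duplicated(x)                     # FALSE FALSE  TRUE FALSE  TRUE
duplicated(x, fromLast = TRUE)    #  TRUE FALSE  TRUE FALSE FALSE

# TRUE wherever the value occurs more than once, first copy included
repeated <- duplicated(x) | duplicated(x, fromLast = TRUE)
repeated                          #  TRUE FALSE  TRUE FALSE  TRUE
```

Combining the two logical vectors with | gives the same result as adding them and calling as.logical(), which is what the answer does.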

A one-liner solution:

 df2 <- df[sapply(df, function(x) !is.factor(x) | length(unique(x))>1 )] 
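
One caveat worth checking on your own setup: since R 4.0.0, data.frame() and read.table() default to stringsAsFactors = FALSE, so a column of "K" values may be character rather than factor, and is.factor() would not catch it. Here is a minimal sketch of a variant that treats character columns the same way, on a small made-up data frame. The names df_demo, df2_demo and is_constant_label are mine, not from the answer.

```r
# Hypothetical demo data: one varying numeric, one constant numeric,
# one constant character and one constant factor column
df_demo <- data.frame(
  a = c(1, 2, 3),
  b = c(1, 1, 1),
  c = c("K", "K", "K"),
  d = factor(c("K", "K", "K")),
  stringsAsFactors = FALSE
)

# TRUE for factor or character columns holding a single distinct value
is_constant_label <- function(x) {
  (is.factor(x) || is.character(x)) && length(unique(x)) <= 1
}

# Keep everything except the constant label columns
df2_demo <- df_demo[!sapply(df_demo, is_constant_label)]
names(df2_demo)   # "a" "b"
```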

Another way to do this is to use the higher-order function Filter. Here is the code:

 to_keep <- function(x) any(is.numeric(x), length(unique(x)) > 1)
 Filter(to_keep, d)
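
For reference, a small sketch of what this predicate keeps on the question's sample data, assuming the seven-column layout. It uses || in place of any(), which should be equivalent here because both conditions are single logical values; the name kept is mine.

```r
# Sample data from the question
d <- read.table(text = "1 1 1 K 1 K K
2 1 2 K 1 K K
3 8 3 K 1 K K
4 8 2 K 1 K K
1 1 1 K 1 K K
2 1 2 K 1 K K")

# Keep a column if it is numeric or has more than one distinct value
to_keep <- function(x) is.numeric(x) || length(unique(x)) > 1

# Filter() applies the predicate to each column and keeps those returning TRUE
kept <- Filter(to_keep, d)
names(kept)   # "V1" "V2" "V3" "V5"
```

Because the numeric test comes first, the constant column of 1s is kept and only the constant K columns are removed, whether they were read as character or as factor.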