Removing columns with the same value from a data frame

Question

I have a data frame similar to this:

    1 1 1 K 1 K K
    2 1 2 K 1 K K
    3 8 3 K 1 K K
    4 8 2 K 1 K K
    1 1 1 K 1 K K
    2 1 2 K 1 K K

I want to delete every column that holds a single repeated value (for example, the columns that contain only K), so that my result looks like this:

    1 1 1 1
    2 1 2 1
    3 8 3 1
    4 8 2 1
    1 1 1 1
    2 1 2 1

I tried iterating over the columns, but I am not getting anywhere. Any ideas? Thanks in advance.

+7

r dataframe unique-values

user976991 Dec 05 '11 at 16:26

4 answers

Here is a solution that removes any replicated columns (including, for example, pairs of replicated character, numeric, or factor columns). This is how I read the OP's question, and even if that is a misinterpretation, it seems like an interesting problem.

    df <- read.table(text="
    1 1 1 K 1 K K
    2 1 2 K 1 K K
    3 8 3 K 1 K K
    4 8 2 K 1 K K
    1 1 1 K 1 K K
    2 1 2 K 1 K K")

    # Need to run duplicated() in 'both directions', since it considers
    # the first example to be **not** a duplicate.
    repdCols <- as.logical(duplicated(as.list(df), fromLast=FALSE) +
                           duplicated(as.list(df), fromLast=TRUE))
    # [1] FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE

    df[!repdCols]
    #   V1 V2 V3 V5
    # 1  1  1  1  1
    # 2  2  1  2  1
    # 3  3  8  3  1
    # 4  4  8  2  1
    # 5  1  1  1  1
    # 6  2  1  2  1
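To see why duplicated() has to be run in both directions, here is a minimal sketch on a plain character vector. The vector x and the names fwd, bwd, and any_copy are made up for illustration and are not part of the answer:

```r
# Hypothetical toy vector: "K" appears three times, the rest once.
x <- c("a", "K", "b", "K", "K")

# Forward scan: the first "K" is not flagged, only the later copies are.
fwd <- duplicated(x)

# Backward scan: the last "K" is not flagged, only the earlier copies are.
bwd <- duplicated(x, fromLast = TRUE)

# Combining both scans marks every element that has a copy anywhere.
any_copy <- fwd | bwd
any_copy
# [1] FALSE  TRUE FALSE  TRUE  TRUE
```

Using `|` on the two logical vectors should give the same result as the as.logical(a + b) idiom in the answer; applying the same idea to as.list(df) flags whole columns instead of single values.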

+3

Josh O'Brien Dec 05 '11 at 18:31

One-liner solution:

 df2 <- df[sapply(df, function(x) !is.factor(x) | length(unique(x))>1 )]
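One caveat on the is.factor() test: since R 4.0.0, read.table() no longer converts strings to factors by default, so the K columns arrive as character vectors and the one-liner would keep them. Below is a minimal sketch that treats character columns like factors. The names is_label and n_unique are made up here, and the sample data assumes the question's K columns are three separate columns:

```r
# Sample data from the question (assumed layout: 7 columns, K in 4, 6, 7).
df <- read.table(text = "
1 1 1 K 1 K K
2 1 2 K 1 K K
3 8 3 K 1 K K
4 8 2 K 1 K K
1 1 1 K 1 K K
2 1 2 K 1 K K
", stringsAsFactors = FALSE)

# TRUE for columns that hold labels, whether stored as factor or character.
is_label <- sapply(df, function(x) is.factor(x) || is.character(x))

# Number of distinct values in each column.
n_unique <- sapply(df, function(x) length(unique(x)))

# Keep every non-label column, plus label columns with more than one value.
df2 <- df[!is_label | n_unique > 1]
names(df2)
# [1] "V1" "V2" "V3" "V5"
```

The constant numeric column V5 survives because only label columns are tested for constancy, which matches the result the question asks for.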

+2

Wojciech Sobala Dec 05 '11 at 21:48

Another way to do this is to use the higher-order function Filter. Here is the code:

    to_keep <- function(x) any(is.numeric(x), length(unique(x)) > 1)
    Filter(to_keep, d)
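For anyone who has not used Filter() on a data frame, here is a small sketch of what it does with the predicate from this answer. The toy data frame and its column names are made up for illustration:

```r
# Hypothetical toy data: a varying numeric column, a constant numeric
# column, and a constant label column.
d <- data.frame(id = c(1, 2, 3), const = c(1, 1, 1), label = c("K", "K", "K"))

# Keep a column if it is numeric or has more than one distinct value.
to_keep <- function(x) any(is.numeric(x), length(unique(x)) > 1)

# Filter() calls to_keep() on each column and keeps those returning TRUE;
# the result is still a data frame.
kept <- Filter(to_keep, d)
names(kept)
# [1] "id"    "const"
```

Because the predicate tests is.numeric() rather than is.factor(), it should behave the same whether the label column is stored as a factor or as a character vector.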

+1

Ramnath Dec 05 '11 at 18:18

Ben Bolker · Accepted Answer · 2011-12-05T16:33:19+0000

To select columns with more than one distinct value, regardless of type:

    uniquelength <- sapply(d, function(x) length(unique(x)))
    d <- subset(d, select=uniquelength>1)

?

(Oops, Roman's question is right: this would also knock out your column 5.)

Maybe (edit: thanks to the comments!):

    isfac <- sapply(d, inherits, "factor")
    d <- subset(d, select=!isfac | uniquelength>1)

or

 d <- d[,!isfac | uniquelength>1]
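To make the column 5 remark concrete, here is a sketch that runs both variants on the question's sample data. It assumes the K columns are three separate columns and passes stringsAsFactors = TRUE so they become factors, as they would have by default before R 4.0.0. The names by_count and by_type are made up for illustration:

```r
# Sample data from the question (assumed layout: 7 columns, K in 4, 6, 7).
d <- read.table(text = "
1 1 1 K 1 K K
2 1 2 K 1 K K
3 8 3 K 1 K K
4 8 2 K 1 K K
1 1 1 K 1 K K
2 1 2 K 1 K K
", stringsAsFactors = TRUE)

# Number of distinct values per column, and which columns are factors.
uniquelength <- sapply(d, function(x) length(unique(x)))
isfac <- sapply(d, inherits, "factor")

# First variant: drops every constant column, including the numeric V5.
by_count <- d[, uniquelength > 1]
names(by_count)
# [1] "V1" "V2" "V3"

# Second variant: drops only constant factor columns, so V5 is kept.
by_type <- d[, !isfac | uniquelength > 1]
names(by_type)
# [1] "V1" "V2" "V3" "V5"
```

Only the second variant reproduces the four-column result shown in the question.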
