Replacing column values in a data frame not included in the list

Question

I have a data.frame in R, like this:

 fruits
 X1  X2      X3
 aa  kiwi    15
 ba  orange  25
 cc  lemon   23
 ba  apple   17
 cc  lemon   19
 cc  orange  18
 cc  orange  21
 ba  banana  17

I would like to replace every value in column X2 except “orange” and “lemon” with “other”. How can I do this in R?

Sample data:

 fruits <- structure(list(
   X1 = structure(c(1L, 2L, 3L, 2L, 3L, 3L, 3L, 2L),
                  .Label = c("aa", "ba", "cc"), class = "factor"),
   X2 = structure(c(3L, 5L, 4L, 1L, 4L, 5L, 5L, 2L),
                  .Label = c("apple", "banana", "kiwi", "lemon", "orange"),
                  class = "factor"),
   X3 = c(15L, 25L, 23L, 17L, 19L, 18L, 21L, 17L)),
   .Names = c("X1", "X2", "X3"), class = "data.frame", row.names = c(NA, -8L))

Nikita Barsukov Oct 19 '11 at 9:02

3 answers

An easy way is to coerce the factor to character, determine which elements are not among the required values and replace them with "other", and finally convert the result back to a factor.

There are two variations on this theme. The first uses the replace() function:

 transform(fruits, X2 = factor(replace(as.character(X2), list = !X2 %in% c("orange","lemon"), values = "other")))

which gives:

 > transform(fruits, X2 = factor(replace(as.character(X2),
 +                                       list = !X2 %in% c("orange","lemon"),
 +                                       values = "other")))
   X1     X2 X3
 1 aa  other 15
 2 ba orange 25
 3 cc  lemon 23
 4 ba  other 17
 5 cc  lemon 19
 6 cc orange 18
 7 cc orange 21
 8 ba  other 17

Or you can do it manually:

 fruits <- transform(fruits, X2 = {x <- as.character(X2)
                                   x[!x %in% c("orange","lemon")] <- "other"
                                   factor(x)})
 > fruits
   X1     X2 X3
 1 aa  other 15
 2 ba orange 25
 3 cc  lemon 23
 4 ba  other 17
 5 cc  lemon 19
 6 cc orange 18
 7 cc orange 21
 8 ba  other 17

I use transform() here so that the manipulation happens inside an environment where X2 is visible, without having to write things like fruits$X2, which becomes tedious to type.
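For comparison, here is a minimal sketch of the same replacement written with and without transform(), so the difference in how X2 is referenced is visible side by side. The data frame is a cut-down stand-in for the sample data, and the names keep, with_transform and without_transform are only illustrative.

```r
# Small stand-in for the question's data (X2 is made a factor explicitly).
fruits <- data.frame(
  X1 = c("aa", "ba", "cc", "ba"),
  X2 = factor(c("kiwi", "orange", "lemon", "apple")),
  X3 = c(15L, 25L, 23L, 17L)
)

keep <- c("orange", "lemon")  # values that must stay unchanged

# With transform(), X2 is looked up inside the data frame.
with_transform <- transform(
  fruits,
  X2 = factor(replace(as.character(X2), !X2 %in% keep, "other"))
)

# Without it, the column has to be qualified as fruits$X2 each time.
without_transform <- fruits
without_transform$X2 <- factor(replace(
  as.character(fruits$X2),
  !fruits$X2 %in% keep,
  "other"
))

# Both routes should produce the same recoded column.
print(identical(with_transform$X2, without_transform$X2))
```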

Gavin Simpson Oct 19 '11 at 9:26

What about:

 R> fruits = data.frame(X1 = 1:3, X2 = c("kiwi", "orange", "lemon"))
 R> fruits$X2 = as.character(fruits$X2)
 R> fruits[!(fruits$X2 %in% c("lemon", "orange")),]$X2 = "Other"
 R> fruits
   X1     X2
 1  1  Other
 2  2 orange
 3  3  lemon

In the solution above, I converted the factor to character. You do not have to do this; you can also:

When you create the data frame, use the argument stringsAsFactors = FALSE.
If you are using read.csv, use stringsAsFactors = FALSE there as well.
Or work directly with factors:

 R> fruits$X2 = factor(fruits$X2, levels = unique(c(as.character(fruits$X2), "Other")))
 R> fruits[!(fruits$X2 %in% c("lemon", "orange")),]$X2 = "Other"
 R> fruits
   X1     X2
 1  1  Other
 2  2 orange
 3  3  lemon

Notice that the first line extends the levels of the factor to include "Other"; assigning a value that is not an existing level would otherwise produce NA.
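Another way to stay with factors, sketched here as an alternative rather than as part of the answer above, is to rename the unwanted levels in place. Assigning the same label to several levels merges them, so no conversion to character is needed. The vector x2 is a stand-in for the X2 column of the sample data.

```r
# Stand-in for the X2 column of the sample data, with repeated values.
x2 <- factor(c("kiwi", "orange", "lemon", "apple",
               "lemon", "orange", "orange", "banana"))

keep <- c("orange", "lemon")  # levels that must stay unchanged

# Give every level outside `keep` the same label; levels() merges duplicates.
levels(x2)[!levels(x2) %in% keep] <- "Other"

print(as.character(x2))
print(levels(x2))
```

This works on the levels rather than on the individual values, so it touches at most one entry per distinct fruit regardless of how many rows the data frame has.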

csgillespie Oct 19 '11 at 9:14

Nick Sabbe · Accepted Answer · 2011-10-19T09:14:47+0000

First, create a variable indicating the rows you want to change. For example:

 shouldBecomeOther<-!(fruits$X2 %in% c("orange", "lemon"))

Then use this index:

 fruits$X2[shouldBecomeOther]<- "other"

Note that if the column is a factor (very likely), some more work will be required, for example:

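 tmp <- as.character(fruits$X2)        # work on a character copy of the column
 tmp[shouldBecomeOther] <- "other"     # replace the flagged rows
 fruits$X2 <- factor(tmp)              # convert back to a factor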
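The steps above can be wrapped in a small helper that handles both character and factor columns. This is only a sketch: the function name replace_unlisted and its arguments are hypothetical, and leaving NA values untouched is an assumption made here, not something the question specifies.

```r
# Hypothetical helper: replace every value not in `keep` with `other`.
# Factors come back as factors, character vectors as character vectors.
replace_unlisted <- function(x, keep, other = "other") {
  was_factor <- is.factor(x)
  out <- as.character(x)
  # Assumption: missing values are left as NA rather than recoded.
  out[!is.na(out) & !out %in% keep] <- other
  if (was_factor) factor(out) else out
}

# Stand-in for the sample data from the question.
fruits <- data.frame(
  X1 = c("aa", "ba", "cc", "ba", "cc", "cc", "cc", "ba"),
  X2 = factor(c("kiwi", "orange", "lemon", "apple",
                "lemon", "orange", "orange", "banana")),
  X3 = c(15L, 25L, 23L, 17L, 19L, 18L, 21L, 17L)
)

fruits$X2 <- replace_unlisted(fruits$X2, keep = c("orange", "lemon"))
print(fruits)
```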
