Selecting specific columns when using mutate_each from dplyr

I have a data frame with the first column as a categorical identifier, the second column as a frequency value, and the remaining columns as raw count data. I want to multiply every raw count column by the frequency column, leaving the first two columns unchanged.

All raw count columns start with a capital letter followed by a full stop, such as "L.abd", "T.xyz", etc.

For example, if I use the code:

    require(dplyr)
    ID <- c(1,2,3,4,5,6)
    Freq <- c(0.1,0.2,0.3,0.5,0.1,0.3)
    L.abc <- c(1,1,1,3,1,0)
    L.ABC <- c(0,3,2,4,1,1)
    T.xyz <- c(1,1,1,1,0,1)
    F.ABC <- c(4,5,6,5,3,1)
    df <- as.data.frame(cbind(ID, Freq, L.abc, L.ABC, T.xyz, F.ABC))
    df_new <- df %>% mutate_each(funs(.*Freq), starts_with("L."))

I can create a new data frame in which the columns starting with "L." have been multiplied by the corresponding frequency value, while the other columns are left as they were.

Is there a way to change the starts_with command to select all columns starting with a capital letter followed by a full stop? My attempts to use modifications such as "[A-Z]." were unsuccessful.

Thank you in advance

2 answers

For these cases, matches would be more appropriate.

  df %>% mutate_each(funs(.*Freq), matches("^[A-Z]\\.", ignore.case=FALSE))

Here, I assume that you only want to select column names that start with a capital letter ( ^[A-Z] ) followed by a . . We need to escape the . ( \\. ), otherwise it would match any single character.
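To see what the escaping changes, the same pattern can be tried on the column names with base R's grepl. This is only a small sketch using the names from the question; the variable names cols, escaped and unescaped are made up for illustration.

```r
# Column names from the example data frame in the question
cols <- c("ID", "Freq", "L.abc", "L.ABC", "T.xyz", "F.ABC")

# Escaped dot: one capital letter at the start, then a literal "."
escaped <- grepl("^[A-Z]\\.", cols)

# Unescaped dot: "." stands for any single character,
# so names such as "ID" and "Freq" are matched as well
unescaped <- grepl("^[A-Z].", cols)

# Show which names each pattern picks up
print(cols[escaped])
print(cols[unescaped])
```

With the escaped pattern only the four raw count columns should be selected; without the escape, the first two columns would be caught too.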

I did not change anything except the starts_with part. In mutate_each, if we need to pass a function, it can be passed inside a funs call. In the above code, we multiply each of the columns ( . ) selected by matches by the "Freq" column.

According to ?select

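'matches(x, ignore.case = TRUE)': selects all variables whose name matches the regular expression 'x'

Because ignore.case defaults to TRUE, it has to be set to FALSE here so that only capital letters are matched.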

EDIT: added comment by @docendodiscimus


I just answered a related question from another user: mutate_each will be deprecated in favor of mutate_at.

In your case, the equivalent code is:

df %>% mutate_at(.cols=vars(matches("^[A-Z]\\.", ignore.case=FALSE)), .funs=funs(.*Freq))

  ID Freq L.abc L.ABC T.xyz F.ABC
1  1  0.1   0.1   0.0   0.1   0.4
2  2  0.2   0.2   0.6   0.2   1.0
3  3  0.3   0.3   0.6   0.3   1.8
4  4  0.5   1.5   2.0   0.5   2.5
5  5  0.1   0.1   0.1   0.0   0.3
6  6  0.3   0.0   0.3   0.3   0.3
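For readers on a newer dplyr, here is a hedged sketch of the same operation written with across(), which supersedes the scoped verbs such as mutate_at. It assumes dplyr 1.0.0 or later is installed, and it is not part of the original answers; the data frame is rebuilt from the values in the question.

```r
library(dplyr)

# Example data from the question, built directly as a data frame
df <- data.frame(
  ID    = c(1, 2, 3, 4, 5, 6),
  Freq  = c(0.1, 0.2, 0.3, 0.5, 0.1, 0.3),
  L.abc = c(1, 1, 1, 3, 1, 0),
  L.ABC = c(0, 3, 2, 4, 1, 1),
  T.xyz = c(1, 1, 1, 1, 0, 1),
  F.ABC = c(4, 5, 6, 5, 3, 1)
)

# Multiply every column whose name starts with a capital letter
# followed by a literal dot by Freq (assumes dplyr >= 1.0.0)
df_new <- df %>%
  mutate(across(matches("^[A-Z]\\.", ignore.case = FALSE), ~ .x * Freq))

print(df_new)
```

The selection helper and the regular expression are unchanged; only the wrapper around them differs.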
