Combine several categorical variables into one dummy variable

I have 3 categorical variables

agegroup{<20, 20-30, >30}, disease.level{0, 1, 2}, performance{<60, >=60}

and I would like to combine them into one dummy variable with 3x3x2 levels. Is there a quick way to do this? My source datasets contain about 10 variables with several levels in each.

Basically, I am asking for the exact opposite of this question: "Create new columns of the dummy variable from the categorical variable".

Thanks a lot EC

r
1 answer

I'm not sure whether you need 0/1 indicator ("dummy") variables (in which case you would have 18 of them) or one factor with 18 levels. It seems to be the latter. (In fact, paste will work just as well as interaction, although interaction is a bit more self-describing.)

    > ff <- expand.grid(agegroup=factor(c("<20","20-30",">30")),
    +                   disease.level=factor(0:2),
    +                   performance=factor(c("<60",">=60")))
    > combfac <- with(ff,interaction(agegroup,disease.level,performance))
    > combfac
     [1] <20.0.<60    20-30.0.<60  >30.0.<60    <20.1.<60    20-30.1.<60
     [6] >30.1.<60    <20.2.<60    20-30.2.<60  >30.2.<60    <20.0.>=60
    [11] 20-30.0.>=60 >30.0.>=60   <20.1.>=60   20-30.1.>=60 >30.1.>=60
    [16] <20.2.>=60   20-30.2.>=60 >30.2.>=60
    18 Levels: <20.0.<60 20-30.0.<60 >30.0.<60 <20.1.<60 20-30.1.<60 ... >30.2.>=60

If you want to use all the variables in the data frame to create an interaction, you can use do.call(interaction, ff).
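Since the question mentions about 10 variables, here is a rough sketch of the do.call approach restricted to a subset of columns, alongside the paste equivalent. The data frame dat, its score column, and the vars vector are made up for illustration; only the three variable names come from the question.

```r
# Hypothetical example data; only the three factor names come from the question.
set.seed(1)
dat <- data.frame(
  agegroup      = factor(sample(c("<20", "20-30", ">30"), 20, replace = TRUE),
                         levels = c("<20", "20-30", ">30")),
  disease.level = factor(sample(0:2, 20, replace = TRUE), levels = 0:2),
  performance   = factor(sample(c("<60", ">=60"), 20, replace = TRUE),
                         levels = c("<60", ">=60")),
  score         = rnorm(20)  # a non-categorical column to leave out
)

# Names of the columns to combine (an assumption for this sketch)
vars <- c("agegroup", "disease.level", "performance")

# dat[vars] is a list of factors, so it can be passed straight to interaction()
comb1 <- do.call(interaction, dat[vars])

# paste() builds the same labels, but returns a character vector, not a factor
comb2 <- do.call(paste, c(dat[vars], sep = "."))

nlevels(comb1)                     # 18: every 3 x 3 x 2 combination, observed or not
all(as.character(comb1) == comb2)  # TRUE
```

With real data, some combinations may never occur. interaction keeps those empty levels by default; passing drop = TRUE removes them.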

If you need dummy variables, you would do model.matrix(~combfac-1) to get them.
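As a minimal sketch of that last step, the following rebuilds the same 18-row grid as above and expands the combined factor into indicator columns. The name dummies is just chosen for this example.

```r
# Same grid as in the answer: one row per combination of the three factors
ff <- expand.grid(agegroup = factor(c("<20", "20-30", ">30")),
                  disease.level = factor(0:2),
                  performance = factor(c("<60", ">=60")))

# One factor with 18 levels
combfac <- do.call(interaction, ff)

# "- 1" drops the intercept, so every level gets its own 0/1 column
dummies <- model.matrix(~ combfac - 1)

dim(dummies)      # 18 rows, 18 columns
rowSums(dummies)  # each row contains exactly one 1
```

Without the - 1, model.matrix would return an intercept column plus 17 contrast columns instead of 18 indicators.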
