How to get R to use a specified factor level as the reference in a regression?

How can I tell R to use a certain level as a reference if I use binary explanatory variables in regression?

It just uses some level by default.

lm(x ~ y + as.factor(b)) 

with b in {0, 1, 2, 3, 4}. Let's say I want to use 3 as the reference instead of 0, which R uses by default.

+68
r regression linear-regression categorical-data dummy-variable
Oct 06 2018-10-06
5 answers

See the relevel() function. Here is an example:

    set.seed(123)
    x <- rnorm(100)
    DF <- data.frame(x = x,
                     y = 4 + (1.5*x) + rnorm(100, sd = 2),
                     b = gl(5, 20))
    head(DF)
    str(DF)

    m1 <- lm(y ~ x + b, data = DF)
    summary(m1)

Now alter the factor b in DF using the relevel() function:

    DF <- within(DF, b <- relevel(b, ref = 3))
    m2 <- lm(y ~ x + b, data = DF)
    summary(m2)

The two models were estimated with different reference levels:

    > coef(m1)
    (Intercept)           x          b2          b3          b4          b5
      3.2903239   1.4358520   0.6296896   0.3698343   1.0357633   0.4666219
    > coef(m2)
    (Intercept)           x          b1          b2          b4          b5
     3.66015826  1.43585196 -0.36983433  0.25985529  0.66592898  0.09678759
+99
Oct 06 '10 at 12:05

Others have mentioned the relevel command, which is the best solution if you want to change the baseline for all analyses of your data (or are willing to live with changing the data).

If you do not want to change the data (this is a one-time change, and in future you want the default behavior again), you can use the C function (note the uppercase letter) to set contrasts, combined with contr.treatment and its base argument to choose the level you want as the baseline. For example:

 lm( Sepal.Width ~ C(Species,contr.treatment(3, base=2)), data=iris ) 
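As a rough check of what base=2 means here, the sketch below compares the C() fit with a fit that relevels inside the formula. It assumes the built-in iris data, where the second level of Species is "versicolor"; the names fit_c and fit_r are made up for this illustration.

```r
# Fit with contrasts set inline; base = 2 picks the second level of Species.
fit_c <- lm(Sepal.Width ~ C(Species, contr.treatment(3, base = 2)), data = iris)

# Equivalent fit that relevels inside the formula instead.
fit_r <- lm(Sepal.Width ~ relevel(Species, ref = "versicolor"), data = iris)

# The coefficient names differ, so compare the values only.
same_coefs <- isTRUE(all.equal(unname(coef(fit_c)), unname(coef(fit_r))))
print(same_coefs)

# The data itself is untouched: the first level is still the original one.
print(levels(iris$Species))
```

Neither call modifies iris, so later models fall back to the default reference level.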
+27
Oct 06 2018-10-10

The relevel() command is a shorthand method for your question. What it does is reorder the factor so that the ref level comes first. Reordering the factor levels yourself therefore has the same effect, but gives you more control. Perhaps you wanted the levels in the order 3, 4, 0, 1, 2. In that case...

 bFactor <- factor(b, levels = c(3,4,0,1,2)) 

I prefer this method because it is easier for me to see in my code not only what the reference was, but also the position of the other values (instead of having to look at the results for that).

NOTE: DO NOT make this an ordered factor. A factor with a specified level order and an ordered factor are not the same thing. lm() may start to think that you want polynomial contrasts if you do.
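To see the difference the note is warning about, here is a small sketch. The vector b holds made-up toy values, and the names f_unordered and f_ordered are only for illustration; it also assumes R's default contrasts option (treatment contrasts for unordered factors, polynomial for ordered ones).

```r
# Toy data: hypothetical values, used only to inspect the contrast coding.
b <- c(0, 1, 2, 3, 4, 3, 0)

# Unordered factor with 3 placed first, so 3 becomes the reference level.
f_unordered <- factor(b, levels = c(3, 4, 0, 1, 2))

# Same level order, but declared as an ordered factor.
f_ordered <- factor(b, levels = c(3, 4, 0, 1, 2), ordered = TRUE)

# Treatment contrasts: one column per non-reference level.
print(colnames(contrasts(f_unordered)))

# Polynomial contrasts: linear, quadratic, cubic, ... terms instead.
print(colnames(contrasts(f_ordered)))
```

With the ordered version, the lm() coefficients would describe polynomial trends across the levels rather than differences from level 3.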

+23
06 Oct 2018-10-06

You can also manually tag the column with a contrasts attribute, which seems to be respected by the regression functions:

 contrasts(df$factorcol) <- contr.treatment(levels(df$factorcol), base=which(levels(df$factorcol) == 'RefLevel')) 
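A concrete version of the same idea, sketched on the built-in iris data. The copy dat and the choice of "versicolor" as the reference level are assumptions made for this illustration, standing in for df, factorcol and 'RefLevel' above.

```r
# Work on a copy so the built-in data set keeps its default contrasts.
dat <- iris

# Attach treatment contrasts with "versicolor" as the base level.
contrasts(dat$Species) <- contr.treatment(
  levels(dat$Species),
  base = which(levels(dat$Species) == "versicolor")
)

fit <- lm(Sepal.Width ~ Species, data = dat)

# The intercept is now the mean Sepal.Width of the reference level.
print(coef(fit))

# The level order itself is unchanged; only the contrasts attribute differs.
print(levels(dat$Species))
```

Because the attribute travels with the column, every later model fitted on dat picks up the same reference level without further changes to the formula.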
+11
Oct 06 2018-10-10

I know this is an old question, but I had a similar problem, and I found that:

 lm(x ~ y + relevel(b, ref = "3")) 

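does exactly what you requested, provided b is already a factor. relevel() raises an error on a plain numeric vector, so in that case wrap it first, e.g. relevel(factor(b), ref = "3").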

+1
Dec 14 '17 at 14:27