B-Spline confusion

I realise that there are posts on this board on the subject of B-splines, but they actually confused me, so I thought someone could help me.

I have simulated data for x values ranging from 0 to 1. I would like to fit a cubic spline (degree = 3) to my data, with knots at 0, 0.1, 0.2, ..., 0.9, 1. I would also like to use the B-spline basis and OLS to estimate the parameters (I'm not looking for penalised splines).

I think I need the bs function from the splines package, but I'm not quite sure, and I also don't know exactly what to feed it.

I would also like to plot the resulting polynomial spline.

Thanks!

2 answers
    ## simulate some data - from mgcv::magic
    set.seed(1)
    n <- 400
    x <- 0:(n-1)/(n-1)
    f <- 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
    y <- f + rnorm(n, 0, sd = 2)

    ## load the splines package - comes with R
    require(splines)

You use the bs() function in a formula passed to lm, as you want OLS estimates. bs provides the basis functions, as defined by the knots, the degree of the polynomial, etc.

    mod <- lm(y ~ bs(x, knots = seq(0.1, 0.9, by = 0.1)))

You can treat this just like any other linear model.

    > anova(mod)
    Analysis of Variance Table

    Response: y
                                            Df Sum Sq Mean Sq F value    Pr(>F)
    bs(x, knots = seq(0.1, 0.9, by = 0.1))  12 2997.5 249.792  65.477 < 2.2e-16 ***
    Residuals                              387 1476.4   3.815
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
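To make the "just a linear model" point concrete, here is a minimal sketch (not part of the original answer) that builds the B-spline basis matrix explicitly and solves the least-squares problem by hand. The names `knots`, `B`, `X` and `beta` are introduced only for illustration; the coefficients should agree with those from lm() up to numerical error.

```r
library(splines)

## same simulated data as above
set.seed(1)
n <- 400
x <- 0:(n-1)/(n-1)
f <- 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
y <- f + rnorm(n, 0, sd = 2)

knots <- seq(0.1, 0.9, by = 0.1)        # interior knots only
mod <- lm(y ~ bs(x, knots = knots))

## build the design matrix explicitly: intercept + B-spline basis columns
B <- bs(x, knots = knots)               # cubic by default (degree = 3)
X <- cbind(1, B)

## least-squares solution via a QR decomposition
beta <- qr.solve(X, y)

## compare with the coefficients estimated by lm()
all.equal(unname(coef(mod)), unname(beta))
```

The 12 degrees of freedom in the anova table correspond to the 12 columns of `B` (9 interior knots plus degree 3); the intercept is the 13th coefficient.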

Some pointers on knot placement. bs has an argument Boundary.knots, with default Boundary.knots = range(x) - hence when I specified the knots argument above, I did not include the boundary knots.

Read ?bs for more.
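As a quick check of the knot bookkeeping, the sketch below inspects the attributes that bs() stores on the basis matrix. The object names `B` and `B2` are illustrative only, and `x` is just an evenly spaced grid on [0, 1] standing in for the simulated covariate.

```r
library(splines)

x <- seq(0, 1, length.out = 400)

## interior knots only; boundary knots default to range(x)
B <- bs(x, knots = seq(0.1, 0.9, by = 0.1))

attr(B, "knots")            # the 9 interior knots supplied above
attr(B, "Boundary.knots")   # taken from range(x), here 0 and 1
dim(B)                      # 400 rows, 9 + 3 = 12 basis columns

## supplying the boundary knots explicitly gives the same basis
B2 <- bs(x, knots = seq(0.1, 0.9, by = 0.1), Boundary.knots = c(0, 1))
all.equal(B, B2, check.attributes = FALSE)
```

So the knots at 0 and 1 asked for in the question are already covered by the default boundary knots and should not be repeated in the knots argument.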

Plotting the fitted spline

In the comments I discussed how to draw the fitted spline. One option is to order the data in terms of the covariate. This is fine for a single covariate, but it need not work for 2 or more covariates. A further problem is that you can only evaluate the fitted spline at the observed values of x - this is fine if you have densely sampled the covariate, but if not, the spline may look odd, with long linear sections.

A more general solution is to use predict to generate predictions from the model for new values of the covariate or covariates. In the code below I show how to do this for the model above, predicting for 100 evenly spaced values over the range of x.
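For reference, the ordering approach can be sketched as follows. This is an illustration under the assumption that the covariate is not already sorted, so `x` is drawn with runif() here rather than taken from the evenly spaced grid used above; `ord` is a name introduced for the example.

```r
library(splines)

## simulated data with an unsorted covariate (assumption for illustration)
set.seed(1)
n <- 400
x <- runif(n)
f <- 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
y <- f + rnorm(n, 0, sd = 2)

mod <- lm(y ~ bs(x, knots = seq(0.1, 0.9, by = 0.1)))

## order the fitted values by the covariate before joining them with a line
ord <- order(x)
plot(y ~ x)
lines(x[ord], fitted(mod)[ord], lwd = 2, col = "red")
```

Without the call to order(), lines() would join the points in their original order and the curve would zig-zag across the plot.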

    pdat <- data.frame(x = seq(min(x), max(x), length.out = 100))
    ## predict for new `x`
    pdat <- transform(pdat, yhat = predict(mod, newdata = pdat))

    ## now plot
    ylim <- range(pdat$yhat, y) ## not needed, but may be if plotting CIs too
    plot(y ~ x)
    lines(yhat ~ x, data = pdat, lwd = 2, col = "red")

This produces:

[image: scatterplot of y against x with the fitted spline drawn in red]
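If you also want confidence intervals on the plot, one possible extension (a sketch, not part of the original answer) is to ask predict for them with interval = "confidence", which returns columns fit, lwr and upr. The data and model are recreated here so the block runs on its own.

```r
library(splines)

## same simulated data and model as above
set.seed(1)
n <- 400
x <- 0:(n-1)/(n-1)
f <- 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
y <- f + rnorm(n, 0, sd = 2)
mod <- lm(y ~ bs(x, knots = seq(0.1, 0.9, by = 0.1)))

## predict on a regular grid, with pointwise 95% confidence limits
pdat <- data.frame(x = seq(min(x), max(x), length.out = 100))
p <- predict(mod, newdata = pdat, interval = "confidence")
pdat <- cbind(pdat, p)                  # adds columns fit, lwr, upr

## make sure the y axis covers both the data and the interval
ylim <- range(pdat$lwr, pdat$upr, y)
plot(y ~ x, ylim = ylim)
lines(fit ~ x, data = pdat, lwd = 2, col = "red")
lines(lwr ~ x, data = pdat, lty = "dashed", col = "red")
lines(upr ~ x, data = pdat, lty = "dashed", col = "red")
```

These are pointwise intervals for the fitted mean, not prediction intervals for new observations; interval = "prediction" would give the latter.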


Using the example provided in the answer above, an easier way to plot the fitted spline would be to use the effects package.

    ## simulate some data - from mgcv::magic
    set.seed(1)
    n <- 400
    x <- 0:(n-1)/(n-1)
    f <- 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
    y <- f + rnorm(n, 0, sd = 2)

    ## load the splines package - comes with R
    require(splines)
    require(car)
    require(effects)

    ## estimate model
    mod <- lm(y ~ bs(x, knots = seq(0.1, 0.9, by = 0.1)))

Then you can use Anova from car :

    > Anova(mod)
    Anova Table (Type II tests)

    Response: y
                                           Sum Sq  Df F value    Pr(>F)
    bs(x, knots = seq(0.1, 0.9, by = 0.1)) 2997.5  12  65.477 < 2.2e-16 ***
    Residuals                              1476.4 387
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

And you can easily plot the fitted spline using the effects package.

    plot(allEffects(mod))

Which will output this:

[image: effect plot of the fitted spline produced by plot(allEffects(mod))]
