How to set the minimum value for a basic measurement in mgcv?

Question

How to set the minimum value for a basic measurement in mgcv?

Using the fined mgcv spline, I want to get the effective degrees of freedom (EDF) of 10 / year in the sample data (60 for the entire period).

library(mgcv) library(dlnm) df <- chicagoNMMAPS df1<-subset(df, as.Date(date) >= '1995-01-01') mod1 <-gam(resp ~ s(time,bs='cr',k=6*15, fx=F)+ s(temp,k=6, bs='cr') + as.factor(dow) ,family=quasipoisson,na.action=na.omit,data=df1)

In the example data, the base time measured by edf for time is 56.117, which is less than 10 per year.

 summary(mod1) Approximate significance of smooth terms: edf Ref.df F p-value s(time) 56.117 67.187 5.369 <2e-16 *** s(temp) 2.564 3.204 0.998 0.393 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 R-sq.(adj) = 0.277 Deviance explained = 28.2% GCV score = 1.1297 Scale est. = 1.0959 n = 2192

I will manually edit edf a by providing smoothing options as follows

 mod1$sp s(time) s(temp) 23.84809 17.23785

Then I plug the sp output into a new model and re-launch it. Basically, I will continue to change sp until I get edf around 60. I will only change the smoothing parameter for time.

I will start with a lower value and check edf:

 mod1a <-gam(resp ~ s(time,bs='cr',k=6*15, fx=F)+ s(temp,k=6, bs='cr') + as.factor(dow) ,family=quasipoisson,na.action=na.omit,data=df1, sp= c(12.84809, 17.23785 )) summary(mod1a) # edf 62.997

I need to increase the smoothing options for time to reduce edf to about 60.

 mod1b <-gam(resp ~ s(time,bs='cr',k=6*15, fx=F)+ s(temp,k=6, bs='cr') + as.factor(dow) ,family=quasipoisson,na.action=na.omit,data=df1, sp= c(14.84809, 17.23785 )) summary(mod1b) edf 61.393 ## EDF still large, thus I have to increase the sp` mod1c <-gam(resp ~ s(time,bs='cr',k=6*15, fx=F)+ s(temp,k=6, bs='cr') + as.factor(dow) ,family=quasipoisson,na.action=na.omit,data=df1, sp=c(16.8190989, 17.23785)) summary(mod1c) edf= 60.005 ## This is what I want to obtain as a final model.

How can this end result be achieved with efficient code?

+7

r gam

Meso Dec 03 '13 at 11:14

source share

2 answers

Why use a fined spline and then change its parameter smoothing to create a fixed regression spline? It makes no sense to me.

A fixed df cubic regression spline with 60 edf is set as follows:

 mod1 <-gam(resp ~ s(time,bs='cr',k=61,fx=TRUE)+ s(temp,k=6, bs='cr') + as.factor(dow) ,family=quasipoisson,na.action=na.omit,data=df1)

What gives the ideal:

 > summary(mod1) Family: quasipoisson Link function: log ... Approximate significance of smooth terms: edf Ref.df F p-value s(time) 60.000 60.000 6.511 <2e-16 *** s(temp) 2.505 3.165 0.930 0.427

If you need a fined spline, then use a fined spline and accept that the basic idea of punishment is that you do not have a fixed edf.

+3

Joris meys Dec 18 '13 at 16:31

source share

nograpes · Accepted Answer · 2013-12-15T15:14:06+0000

I don't understand the details of your model, but if you want to minimize (or maximize) edf for models with different optim , optim will do the job. First, create a function that returns only edf with different sp values.

 edf.by.sp<-function(sp) { model <-gam(resp ~ s(time,bs='cr',k=6*15, fx=F)+ s(temp,k=6, bs='cr') + as.factor(dow), family=quasipoisson, na.action=na.omit, data=df1, sp= c(sp, 17.23785) # Not sure if this quite right. ) abs(summary(model)$s.table['s(time)','edf']-60) # Subtract 60 and flip sign so 60 is lowest. }

Now you can simply run optim to minimize edf :

 # You could pick any reasonable starting sp value. # Many optimization methods are available, but in your case # they work equally well. best<-optim(12,edf.by.sp,method='BFGS')$par best # 16.82708

and, picking back, you get almost 0 (exactly 60 before the conversion) when connecting the function:

 edf.by.sp(best) # 2.229869e-06

How to set the minimum value for a basic measurement in mgcv?

More articles: