How to set the minimum value for a basic measurement in mgcv?

Using the fined mgcv spline, I want to get the effective degrees of freedom (EDF) of 10 / year in the sample data (60 for the entire period).

library(mgcv) library(dlnm) df <- chicagoNMMAPS df1<-subset(df, as.Date(date) >= '1995-01-01') mod1 <-gam(resp ~ s(time,bs='cr',k=6*15, fx=F)+ s(temp,k=6, bs='cr') + as.factor(dow) ,family=quasipoisson,na.action=na.omit,data=df1) 

In the example data, the base time measured by edf for time is 56.117, which is less than 10 per year.

 summary(mod1) Approximate significance of smooth terms: edf Ref.df F p-value s(time) 56.117 67.187 5.369 <2e-16 *** s(temp) 2.564 3.204 0.998 0.393 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 R-sq.(adj) = 0.277 Deviance explained = 28.2% GCV score = 1.1297 Scale est. = 1.0959 n = 2192 

I will manually edit edf a by providing smoothing options as follows

 mod1$sp s(time) s(temp) 23.84809 17.23785 

Then I plug the sp output into a new model and re-launch it. Basically, I will continue to change sp until I get edf around 60. I will only change the smoothing parameter for time.

I will start with a lower value and check edf:

 mod1a <-gam(resp ~ s(time,bs='cr',k=6*15, fx=F)+ s(temp,k=6, bs='cr') + as.factor(dow) ,family=quasipoisson,na.action=na.omit,data=df1, sp= c(12.84809, 17.23785 )) summary(mod1a) # edf 62.997 

I need to increase the smoothing options for time to reduce edf to about 60.

 mod1b <-gam(resp ~ s(time,bs='cr',k=6*15, fx=F)+ s(temp,k=6, bs='cr') + as.factor(dow) ,family=quasipoisson,na.action=na.omit,data=df1, sp= c(14.84809, 17.23785 )) summary(mod1b) edf 61.393 ## EDF still large, thus I have to increase the sp` mod1c <-gam(resp ~ s(time,bs='cr',k=6*15, fx=F)+ s(temp,k=6, bs='cr') + as.factor(dow) ,family=quasipoisson,na.action=na.omit,data=df1, sp=c(16.8190989, 17.23785)) summary(mod1c) edf= 60.005 ## This is what I want to obtain as a final model. 

How can this end result be achieved with efficient code?

+7
r gam
source share
2 answers

I don't understand the details of your model, but if you want to minimize (or maximize) edf for models with different optim , optim will do the job. First, create a function that returns only edf with different sp values.

 edf.by.sp<-function(sp) { model <-gam(resp ~ s(time,bs='cr',k=6*15, fx=F)+ s(temp,k=6, bs='cr') + as.factor(dow), family=quasipoisson, na.action=na.omit, data=df1, sp= c(sp, 17.23785) # Not sure if this quite right. ) abs(summary(model)$s.table['s(time)','edf']-60) # Subtract 60 and flip sign so 60 is lowest. } 

Now you can simply run optim to minimize edf :

 # You could pick any reasonable starting sp value. # Many optimization methods are available, but in your case # they work equally well. best<-optim(12,edf.by.sp,method='BFGS')$par best # 16.82708 

and, picking back, you get almost 0 (exactly 60 before the conversion) when connecting the function:

 edf.by.sp(best) # 2.229869e-06 
+5
source share

Why use a fined spline and then change its parameter smoothing to create a fixed regression spline? It makes no sense to me.

A fixed df cubic regression spline with 60 edf is set as follows:

 mod1 <-gam(resp ~ s(time,bs='cr',k=61,fx=TRUE)+ s(temp,k=6, bs='cr') + as.factor(dow) ,family=quasipoisson,na.action=na.omit,data=df1) 

What gives the ideal:

 > summary(mod1) Family: quasipoisson Link function: log ... Approximate significance of smooth terms: edf Ref.df F p-value s(time) 60.000 60.000 6.511 <2e-16 *** s(temp) 2.505 3.165 0.930 0.427 

If you need a fined spline, then use a fined spline and accept that the basic idea of ​​punishment is that you do not have a fixed edf.

+3
source share

All Articles