Sine curve fit using lm and nls in R

I'm new to curve fitting, and several posts on Stack Overflow have really helped me.

I tried to fit a sine curve to my data using lm and nls, but both methods give a strange fit, as shown in the plot below. Can anyone point out where I went wrong? I suspect it has something to do with time, but I could not work out what. My data can be accessed here.

[plot]

    data <- read.table(file="900days.txt", header=TRUE, sep="")
    time<-data$time
    temperature<-data$temperature

    #lm fitting
    xc<-cos(2*pi*time/366)
    xs<-sin(2*pi*time/366)
    fit.lm<-lm(temperature~xc+xs)
    summary(fit.lm)
    plot(temp~time, data=data, xlim=c(1, 900))
    par(new=TRUE)
    plot(fit.lm$fitted, type="l", col="red", xlim=c(1, 900), pch=19, ann=FALSE, xaxt="n", yaxt="n")

    #nls fitting
    fit.nls<-nls(temp~C+alpha*sin(W*time+phi), start=list(C=27.63415, alpha=27.886, W=0.0652, phi=14.9286))
    summary(fit.nls)
    plot(fit.nls$fitted, type="l", col="red", xlim=c(1, 900), pch=19, ann=FALSE, xaxt="n", axt="n")
4 answers

This happens because NA values are removed from the data before fitting (and your data has many of them). As a result, fit.lm$fitted is shorter than your time vector, and when you plot it on its own, the plot method uses the index of that series as the "x" values.

Try this [note that I changed the variable names to avoid conflicts with the time and data functions (read this post)]:

    Data <- read.table(file="900days.txt", header=TRUE, sep="")
    Time <- Data$time
    temperature <- Data$temperature

    xc<-cos(2*pi*Time/366)
    xs<-sin(2*pi*Time/366)
    fit.lm <- lm(temperature~xc+xs)

    # access the fitted series (for plotting)
    fit <- fitted(fit.lm)

    # find predictions for original time series
    pred <- predict(fit.lm, newdata=data.frame(Time=Time))

    plot(temperature ~ Time, data= Data, xlim=c(1, 900))
    lines(fit, col="red")
    lines(Time, pred, col="blue")

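This gives me:

[Figure: temperature plotted against Time, with the fitted series drawn in red and the predictions drawn in blue]

The blue line is most likely what you were hoping for.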
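The length mismatch described in this answer can be reproduced on synthetic data. The sketch below is only an illustration: the names `day`, `temp_obs`, `fit_omit` and `fit_excl` and the simulated values are assumptions, not the asker's data. It also shows one possible alternative, `na.action = na.exclude`, which makes `fitted()` pad the result with NA so that it lines up with the original time vector.

```r
set.seed(1)

# Synthetic stand-in for the real data (assumed values, for illustration only)
day <- 1:900
temp_obs <- 27 + 2 * sin(2 * pi * day / 366) + rnorm(900, sd = 0.5)
temp_obs[sample(900, 300)] <- NA   # knock out 300 observations

xc <- cos(2 * pi * day / 366)
xs <- sin(2 * pi * day / 366)

# na.omit drops the NA rows, so the fitted series is shorter than 'day'
fit_omit <- lm(temp_obs ~ xc + xs, na.action = na.omit)
length(fitted(fit_omit))   # 600

# na.exclude also drops them for fitting, but fitted() pads with NA
fit_excl <- lm(temp_obs ~ xc + xs, na.action = na.exclude)
length(fitted(fit_excl))   # 900

plot(day, temp_obs, xlim = c(1, 900))
# Correct x values for the shorter series: the days that were actually used
lines(day[!is.na(temp_obs)], fitted(fit_omit), col = "red")
# The padded series can be plotted against 'day' directly (gaps where NA)
lines(day, fitted(fit_excl), col = "blue", lty = 2)
```

With the padded version the line is broken wherever an observation is missing, so `predict()` on the full time vector, as in the answer above, is usually the nicer choice for a continuous curve.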


How about supplying both x and y when drawing the line plot, instead of only y?

 plot(time,predict(fit.nls),type="l", col="red", xlim=c(1, 900), pch=19, ann=FALSE, xaxt="n", yaxt="n") 

Also, both lm and nls only give you fitted values at the observed points. To draw a smooth curve as a line plot, you have to evaluate the model at the remaining points as well. Since you are using nls and lm, the predict function may be useful here.
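As a rough sketch of that idea, the example below fits both models to sparse synthetic observations and then calls `predict()` on a dense grid of days. The data, the data frame names `obs` and `grid`, and the starting values are all assumptions chosen for illustration; they are not taken from the question's data set.

```r
set.seed(42)

# Sparse synthetic observations (assumed, for illustration only)
day <- sort(sample(1:900, 120))
temp_obs <- 27 + 2 * sin(2 * pi * day / 366 + 0.5) +
  rnorm(length(day), sd = 0.4)
obs <- data.frame(day = day, temp_obs = temp_obs)

# Putting the sin/cos terms in the formula lets predict() rebuild them
fit_lm <- lm(temp_obs ~ cos(2 * pi * day / 366) + sin(2 * pi * day / 366),
             data = obs)

# Starting values are set close to the simulated truth
fit_nls <- nls(temp_obs ~ C + alpha * sin(W * day + phi), data = obs,
               start = list(C = 27, alpha = 2, W = 2 * pi / 366, phi = 0.5))

# Evaluate both models on every day, not only the observed ones
grid <- data.frame(day = 1:900)
grid$pred_lm  <- predict(fit_lm,  newdata = grid)
grid$pred_nls <- predict(fit_nls, newdata = grid)

plot(temp_obs ~ day, data = obs, xlim = c(1, 900))
lines(grid$day, grid$pred_lm,  col = "red")
lines(grid$day, grid$pred_nls, col = "blue", lty = 2)
```

Because the x values passed to `lines()` come from the same grid that was handed to `predict()`, the curve cannot drift out of alignment with the time axis.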


Not sure if this helps, but I get a similar fit using only a sine:

    y = amplitude * sin(pi * (x - center) / width) + Offset

    amplitude =  2.0009690806953033E+00
    center    = -2.5813588834888215E+01
    width     =  1.8077550471975817E+02
    Offset    =  2.6872265116104828E+01

    Fitting target of lowest sum of squared absolute error = 3.6755174406241423E+01

    Degrees of freedom (error): 90
    Degrees of freedom (regression): 3
    Chi-squared: 36.7551744062
    R-squared: 0.816419142696
    R-squared adjusted: 0.810299780786
    Model F-statistic: 133.415731033
    Model F-statistic p-value: 1.11022302463e-16
    Model log-likelihood: -89.2464811027
    AIC: 1.98396768304
    BIC: 2.09219299292
    Root Mean Squared Error (RMSE): 0.625309918107

    amplitude = 2.0009690806953033E+00
        std err squared: 1.03828E-02
        t-stat: 1.96374E+01
        p-stat: 0.00000E+00
        95% confidence intervals: [1.79853E+00, 2.20340E+00]

    center = -2.5813588834888215E+01
        std err squared: 2.98349E+01
        t-stat: -4.72592E+00
        p-stat: 8.41245E-06
        95% confidence intervals: [-3.66651E+01, -1.49621E+01]

    width = 1.8077550471975817E+02
        std err squared: 3.54835E+00
        t-stat: 9.59680E+01
        p-stat: 0.00000E+00
        95% confidence intervals: [1.77033E+02, 1.84518E+02]

    Offset = 2.6872265116104828E+01
        std err squared: 5.15458E-03
        t-stat: 3.74289E+02
        p-stat: 0.00000E+00
        95% confidence intervals: [2.67296E+01, 2.70149E+01]

    Coefficient Covariance Matrix
    [ 0.02542366      0.01786683     -0.05016085     -0.00652111    ]
    [ 1.78668314e-02  7.30548346e+01 -2.18160818e+01  1.24965136e-01]
    [-5.01608451e-02 -2.18160818e+01  8.68860810e+00 -1.27401806e-02]
    [-0.00652111      0.12496514     -0.01274018      0.0126217     ]

James Phillips zunzun@zunzun.com
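For readers who want to compare this report with the `C + alpha*sin(W*time + phi)` form used in the question, the following sketch converts between the two parametrizations. It assumes that `x` in the report is the same time variable as in the question; the helper names `sine_only` and `nls_form` are made up for this illustration.

```r
# Parameter values copied from the fit report above
amplitude <- 2.0009690806953033
center    <- -25.813588834888215
width     <- 180.77550471975817
Offset    <- 26.872265116104828

# The model as written in the report
sine_only <- function(x) amplitude * sin(pi * (x - center) / width) + Offset

# The same curve as C + alpha * sin(W * x + phi):
#   pi * (x - center) / width = (pi / width) * x - pi * center / width
C     <- Offset
alpha <- amplitude
W     <- pi / width
phi   <- -pi * center / width
nls_form <- function(x) C + alpha * sin(W * x + phi)

x <- 1:900
all.equal(sine_only(x), nls_form(x))   # TRUE

# One full cycle spans 2 * width units of x
period <- 2 * pi / W
```

Under that assumption the fitted period is `2 * width`, about 361.6, which is close to one year if `x` is measured in days. By the same formula, the question's starting value `W=0.0652` would correspond to a period of roughly 96, which may be worth checking when choosing starting values for nls.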


Alternatively, you could remove the NA values from your data right after reading it:

 data <- subset(data, !is.na(temperature)) 

Then, when plotting, you can set the x axis to the time points from the filtered data set:

    plot(temp~time, data=data, xlim=c(1, 900))
    lines(x=time, y=fit.lm$fitted, col="red")

This curve will not be as smooth as the one created by @andy-barbour, but it will do in a pinch.
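A self-contained sketch of this subset-first workflow is shown below. The data frame `obs` and its simulated values are assumptions standing in for the contents of 900days.txt. The point is that the model is fitted after the NA rows are dropped, so the fitted values and the time column have the same length.

```r
set.seed(7)

# Synthetic stand-in for the file contents (assumed, for illustration only)
obs <- data.frame(time = 1:900)
obs$temperature <- 27 + 2 * sin(2 * pi * obs$time / 366) +
  rnorm(900, sd = 0.5)
obs$temperature[sample(900, 250)] <- NA

# Drop incomplete rows first, then fit on what is left
obs_complete <- subset(obs, !is.na(temperature))
fit <- lm(temperature ~ cos(2 * pi * time / 366) + sin(2 * pi * time / 366),
          data = obs_complete)

# One fitted value per remaining row, so x and y match in length
plot(temperature ~ time, data = obs_complete, xlim = c(1, 900))
lines(obs_complete$time, fitted(fit), col = "red")
```

The line is drawn only through the observed time points, which is why it can look less smooth than a curve predicted on every day.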
