How to find good starting values for the nls function?

I do not understand why I cannot get nls to fit this data. I have tried many different initial values and I always get the same error.

That's what I'm doing:

expFct2 = function (x, a, b, c) { a*(1-exp(-x/b)) + c }

vec_x <- c(77.87, 87.76, 68.6, 66.29)
vec_y <- c(1, 1, 0.8, 0.6)
dt <- data.frame(vec_x = vec_x, vec_y = vec_y)

library(ggplot2)
ggplot(data = dt, aes(x = vec_x, y = vec_y)) +
  geom_point() +
  geom_smooth(data = dt, method = "nls",
              formula = y ~ expFct2(x, a, b, c),
              se = FALSE,
              start = list(a = 1, b = 75, c = -5))

I always get this error:

 Error in method(formula, data = data, weights = weight, ...) : singular gradient 
3 answers

This can be written with two linear parameters ( .lin1 and .lin2 ) and one non-linear parameter ( b ) as follows:

 a*(1-exp(-x/b)) + c = (a+c) - a * exp(-x/b) = .lin1 + .lin2 * exp(-x/b) 

where .lin1 = a+c and .lin2 = -a (so a = -.lin2 and c = .lin1 + .lin2). This allows use of the "plinear" algorithm, which requires a starting value only for the single nonlinear parameter (sidestepping the problem of how to choose starting values for the other parameters), and which converges even though the starting value b=75 is far from b's value at the solution:

 nls(y ~ cbind(1, exp(-x/b)), start = list(b = 75), alg = "plinear") 

Here is the output of a run, from which we can see from the magnitude of .lin2 that the problem is badly scaled:

> x <- c(77.87,87.76,68.6,66.29)
> y <- c(1,1,0.8,0.6)
> nls(y ~ cbind(1, exp(-x/b)), start = list(b = 75), alg = "plinear")
Nonlinear regression model
  model:  y ~ cbind(1, exp(-x/b))
   data:  parent.frame()
         b      .lin1      .lin2
 3.351e+00  1.006e+00 -1.589e+08
 residual sum-of-squares: 7.909e-05

Number of iterations to convergence: 9
Achieved convergence tolerance: 9.887e-07
> R.version.string
[1] "R version 2.14.2 Patched (2012-02-29 r58660)"
> win.version()
[1] "Windows Vista (build 6002) Service Pack 2"

EDIT: Added sample run and scaling comment.
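If you want the original a and c back, they can be recovered from the plinear coefficients using the relations above. A minimal sketch, assuming the nls() call above has been saved in a variable (the name fit is mine, not part of the answer):

## assumes: fit <- nls(y ~ cbind(1, exp(-x/b)), start = list(b = 75), alg = "plinear")
co <- coef(fit)                      # named vector: b, .lin1, .lin2
a <- -co[[".lin2"]]                  # a = -.lin2
c <- co[[".lin1"]] + co[[".lin2"]]   # c = .lin1 + .lin2
b <- co[["b"]]
c(a = a, b = b, c = c)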


Fitting a three-parameter nonlinear model to four data points is going to be moderately challenging in any case, although in this case the data are well behaved. Point number 1 is that your starting value for the c parameter (-5) was way off. Drawing a picture of the curve corresponding to your starting parameters (see below) will help you see this (so would recognizing that the fitted curve ranges from c at its minimum to c+a at its maximum, while your data range from 0.6 to 1 ...)

However, even with a better starting guess I found myself fussing with control parameters (i.e. control=nls.control(maxiter=200) ), followed by more warnings - nls is not known for its robustness. So I tried the SSasympOff model, which implements a self-starting version of the curve you want to fit.

start1 <- list(a=1, b=75, c=-5)
start2 <- list(a=0.5, b=75, c=0.5)   ## a better guess

pfun <- function(params) {
  data.frame(vec_x = 60:90,
             vec_y = do.call(expFct2, c(list(x = 60:90), params)))
}

library(ggplot2)
ggplot(data = dt, aes(x = vec_x, y = vec_y)) +
  geom_point() +
  geom_line(data = pfun(start1)) +
  geom_line(data = pfun(start2), colour = "red") +
  geom_smooth(data = dt, method = "nls",
              formula = y ~ SSasympOff(x, a, b, c),
              se = FALSE)

My more general advice is that it's easier to figure out what is going on, and to fix problems, if you fit nls outside of geom_smooth and construct the curve you want to add using predict.nls ...
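A minimal sketch of that workflow, assuming the self-starting SSasympOff fit from above is used (the names fit and pred are mine, and with only four points nls may still warn):

## fit outside ggplot, then draw the fitted curve with predict()
fit <- nls(vec_y ~ SSasympOff(vec_x, a, b, c), data = dt)
pred <- data.frame(vec_x = seq(60, 90, length.out = 101))
pred$vec_y <- predict(fit, newdata = pred)

ggplot(dt, aes(x = vec_x, y = vec_y)) +
  geom_point() +
  geom_line(data = pred, colour = "blue")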

More generally, the way to get good starting parameters is to understand the geometry of the function you are fitting and which parameters govern which aspects of the curve. As I mentioned above, c is the minimum value of the shifted saturating-exponential curve, a is the range, and b is the scale parameter (you can see that at x=b the curve is 1-exp(-1), or about 2/3 of the way from its minimum to its maximum). Some combination of algebra and calculus (i.e. taking limits), or playing around with the curve() function, is a good way to gather this information.
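For example, a quick way to play with the geometry before committing to start values (a sketch using the question's expFct2; the particular parameter values are just illustrations):

## c is the value at x = 0, c + a the asymptote,
## and at x = b the curve is 1 - exp(-1), about 2/3 of the way up
expFct2 <- function(x, a, b, c) a * (1 - exp(-x/b)) + c
curve(expFct2(x, a = 0.5, b = 75, c = 0.5), from = 0, to = 300, ylab = "y")
curve(expFct2(x, a = 0.5, b = 20, c = 0.5), add = TRUE, lty = 2)  # smaller scale parameter b
abline(h = c(0.5, 1.0), lty = 3)  # the minimum c and the asymptote c + a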


I struggle to find an interpretation of your parameters: a is a slope, b the speed of convergence, and a + c the limit, but c by itself does not seem to mean much. After reparameterizing your function, the problem disappears.

f <- function(x, a, b, c) a + c * exp(-x/abs(b))
nls(y ~ f(x, a, b, c), data = dt, start = list(a = 1, b = 75, c = -5), trace = TRUE)

However, the fitted value of c is extremely large in magnitude: this is probably why the model did not converge in its original form.

Nonlinear regression model
  model:  y ~ f(x, a, b, c)
   data:  dt
         a          b          c
 1.006e+00  3.351e+00 -1.589e+08
residual sum-of-squares: 7.909e-05

Number of iterations to convergence: 9
Achieved convergence tolerance: 2.232e-06

Here is another, more reasonable parameterization of the same function.

g <- function(x, a, b, c) a * (1 - exp(-(x - c)/abs(b)))
nls(y ~ g(x, a, b, c), data = dt, start = list(a = 1, b = 75, c = -5), trace = TRUE)

Nonlinear regression model
  model:  y ~ g(x, a, b, c)
   data:  dt
      a       b       c
  1.006   3.351  63.257
residual sum-of-squares: 7.909e-05

Number of iterations to convergence: 10
Achieved convergence tolerance: 1.782e-06
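As a cross-check (my own calculation, not part of the original answer), the two parameterizations agree: expanding g gives a - a*exp(c/|b|)*exp(-x/|b|), so the c of the f parameterization equals -a*exp(c/|b|), which reproduces the huge value seen above:

## plug in the fitted values of g() printed above
a <- 1.006; b <- 3.351; c <- 63.257
-a * exp(c / abs(b))   # roughly -1.59e+08, the c reported by the f() fit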
