I have no elegant answer, but I have an answer.
(SEE EDIT BELOW FOR A MORE ELEGANT ANSWER)
If Cx is small enough that there are no data points left to fit A and Cy to, or if Cx is large enough that there are no data points left to fit B and Cy to, the QR decomposition matrix will be singular, because there are many different values of Cx, A and Cy (or Cx, B and Cy, respectively) that fit the data equally well.
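For context, here is a rough sketch of the kind of direct nls() call that produces this error. The original question is not reproduced here, so the exact parameterisation (two straight lines with slopes A and B meeting at the point (Cx, Cy)) and the starting values are my guesses, not the asker's code:

# Assumed parameterisation: two lines that meet at (Cx, Cy),
# slope A to the left of the break and B to the right of it.
nls(y ~ ifelse(x < Cx, Cy + A * (x - Cx), Cy + B * (x - Cx)),
    data  = data.frame(x, y),
    start = c(Cx = mean(x), Cy = mean(y), A = -1000, B = -1000))
# With Cx left free, a call like this typically stops with
# "singular gradient matrix at initial parameter estimates".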
I tested this by taking Cx out of the fit. If I fix Cx at (say) Cx = mean(x), nls() solves the problem without difficulty:
nls(y ~ ifelse(x < mean(x), ya + A*x, yb + B*x),
    data = data.frame(x, y),
    start = c(A = -1000, B = -1000, ya = 3, yb = 0))
... gives:
Nonlinear regression model
  model: y ~ ifelse(x < mean(x), ya + A * x, yb + B * x)
   data: data.frame(x, y)
        A         B        ya        yb 
-1325.537 -1335.918     2.628     2.652 
 residual sum-of-squares: 0.06614

Number of iterations to convergence: 1 
Achieved convergence tolerance: 2.294e-08
This made me think that if I transformed Cx so that it could never go outside the range [min(x), max(x)], that might solve the problem. In fact, I really want at least three data points to be available for fitting each of line 'A' and line 'B', so Cx should lie between the third-lowest and the third-highest x value. Using the atan() function with a little arithmetic, I can map the range [-inf, +inf] onto [0, 1], which gives this code:
trans <- function(x) 0.5 + atan(x)/pi
xs  <- sort(x)
xlo <- xs[3]
xhi <- xs[length(xs) - 2]

nls(y ~ ifelse(x < xlo + (xhi - xlo)*trans(f), ya + A*x, yb + B*x),
    data = data.frame(x, y),
    start = c(A = -1000, B = -1000, ya = 3, yb = 0, f = 0))
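As a quick sanity check (my own illustration, not part of the original answer), trans() squeezes any real value into the open interval (0, 1), with 0 mapping to 0.5:

trans <- function(x) 0.5 + atan(x)/pi
trans(c(-1e6, 0, 1e6))   # roughly 0, 0.5 and 1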
Unfortunately, however, this code still gives the singular gradient matrix at initial parameters error, so the fit remains ill-conditioned. As @Henrik pointed out, the difference between the bilinear fit and a single linear fit is small for these data.
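As a quick side check (my own addition, assuming x and y are the data from the question), you can compare a single straight-line fit with the bilinear fit found below, whose residual sum-of-squares turns out to be about 0.04:

single <- lm(y ~ x)
sum(residuals(single)^2)   # compare with the bilinear residual sum-of-squares below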
Nevertheless, I can still get an answer for the bilinear fit. Since nls() solves the problem when Cx is fixed, I can now find the value of Cx that minimizes the residual standard error by doing a simple one-dimensional minimization with optimize(). Not a particularly elegant solution, but better than nothing:
xs  <- sort(x)
xlo <- xs[3]
xhi <- xs[length(xs) - 2]

nn  <- function(f) nls(y ~ ifelse(x < xlo + (xhi - xlo)*f, ya + A*x, yb + B*x),
                       data = data.frame(x, y),
                       start = c(A = -1000, B = -1000, ya = 3, yb = 0))
ssr <- function(f) sum(residuals(nn(f))^2)

f <- optimize(ssr, interval = c(0, 1))

print(f$minimum)
print(nn(f$minimum))
summary(nn(f$minimum))
... displays the result:
[1] 0.8541683

Nonlinear regression model
  model: y ~ ifelse(x < xlo + (xhi - xlo) * f, ya + A * x, yb + B * x)
   data: data.frame(x, y)
        A         B        ya        yb 
-1317.215  -872.002     2.620     1.407 
 residual sum-of-squares: 0.0414

Number of iterations to convergence: 1 
Achieved convergence tolerance: 2.913e-08

Formula: y ~ ifelse(x < xlo + (xhi - xlo) * f, ya + A * x, yb + B * x)

Parameters:
     Estimate Std. Error t value Pr(>|t|)    
A  -1.317e+03  1.792e+01 -73.493  < 2e-16 ***
B  -8.720e+02  1.207e+02  -7.222 1.14e-07 ***
ya  2.620e+00  2.791e-02  93.854  < 2e-16 ***
yb  1.407e+00  3.200e-01   4.399 0.000164 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.0399 on 26 degrees of freedom

Number of iterations to convergence: 1
There is not a huge difference between the values of A and B (or ya and yb) at the optimal value of f, but there is some difference.
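A side note of my own (not part of the original answer): the optimal f can be mapped back to a break point on the x scale using the same expression that appears in the model formula:

Cx_hat <- xlo + (xhi - xlo) * f$minimum   # estimated break point in x units
Cx_hat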
(EDIT - MORE ELEGANT ANSWER)
Having broken the problem into two steps, there is actually no need to use nls() at all; lm() works fine:
function(x, y) {
    f <- function(Cx) {
        lhs <- function(x) ifelse(x < Cx, Cx - x, 0)
        rhs <- function(x) ifelse(x < Cx, 0, x - Cx)
        fit <- lm(y ~ lhs(x) + rhs(x))
        c(summary(fit)$r.squared,
          summary(fit)$coef[1],
          summary(fit)$coef[2],
          summary(fit)$coef[3])
    }

    r2 <- function(x) -(f(x)[1])            # optimize() minimizes, so negate R^2

    res <- optimize(r2, interval = c(min(x), max(x)))
    res <- c(res$minimum, f(res$minimum))

    best_Cx <- res[1]                       # break point giving the highest R^2
    coef1 <- res[3]                         # intercept
    coef2 <- res[4]                         # coefficient on lhs(x)
    coef3 <- res[5]                         # coefficient on rhs(x)

    plot(x, y)
    abline(coef1 + best_Cx*coef2, -coef2)   # left-hand segment
    abline(coef1 - best_Cx*coef3,  coef3)   # right-hand segment
}
... which gives a plot of the data with the fitted line segments drawn through it.
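To spell out the trick outside the wrapper (my own sketch, assuming x and y are the data from the question): at any fixed candidate break point, the piecewise model is just an ordinary lm() on two "hinge" regressors, and its R-squared is the quantity the optimize() call maximises over Cx.

Cx  <- mean(x)                      # any fixed candidate break point
lhs <- ifelse(x < Cx, Cx - x, 0)    # distance below the break, zero above it
rhs <- ifelse(x < Cx, 0, x - Cx)    # distance above the break, zero below it
fit <- lm(y ~ lhs + rhs)
summary(fit)$r.squared              # what optimize() maximises over Cx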