Exponential curve in R

    time = 1:100
    head(y)
    # 0.07841589 0.07686316 0.07534116 0.07384931 0.07238699 0.07095363
    plot(time, y)

(plot of y against time)

This is an exponential curve.
1) How can I fit a line to this curve without knowing the formula? I cannot use nls() because the formula is unknown (only the data is given).
2) How can I get the equation for this curve and determine the constants in the equation?
I tried loess, but it does not give me an equation.

3 answers

EDIT: mra68's answer made me realize that for data with some noise (which is much more realistic) my original answer does not work well, so I have updated it with a potential solution using the nls() function.


Original answer

As stated in Adam's answer, I believe this can be solved by taking the logarithm of your variables and then fitting a linear model.

Your hypothesis is that y = time^n, and you want to estimate n. First take the logarithm of both sides: log(y) = log(time^n).

Then apply the logarithm rule log(x^n) = n*log(x), which gives log(y) = n*log(time), a linear equation.

So, to be more specific:

    #Simulate some data
    #I will use 25 as the exponent, but in your case this is unknown
    time = 1:100
    y = time^25
    plot(time, y)

plot1

    #Plot the log of both variables
    plot(log(time), log(y))

plot2

    #Fit the linear model
    fit = lm(log(y) ~ log(time))

    #Check that the estimated coefficient is 25, just as we expected!
    fit$coefficients
    #  (Intercept)     log(time)
    #-1.477929e-13  2.500000e+01

    #Plot the fitted line
    plot(time, y)
    lines(time, time ^ fit$coefficients[2], col = "red")

plot3


Updated Answer

When some noise is introduced, the solution above gives estimates that deviate from what would be expected. The nls() function seems to overcome this problem somewhat, as shown below.

    #Simulate some data with noise added
    set.seed(10021)
    time = 1:100
    y2 = (time + rnorm(100, sd = 2))^25

    #Plot both non-transformed and log-transformed data
    par(mfrow = c(1, 2))
    plot(time, y2)
    plot(log(time), log(y2))
    lines(log(time), log(time^25), col = "red") #line when there is no noise

plot4

The graph on the right shows that, for lower values, the log transformation leads to large deviations from the underlying (in our case, known) model. This is further illustrated by the residuals of the lm() fit:

    #Fit the model using the log-transformed variables
    fit_lm = lm(log(y2) ~ log(time))

    #Plot fitted values vs. residuals
    plot(fit_lm, which = 1)

    #And the estimated coefficient is slightly above the known value
    coef(fit_lm)
    #(Intercept)   log(time)
    #  -9.327772   27.383641

plot6

Using nls() seems to improve the estimate of the exponent.

    #Fit using nls
    fit_nls = nls(y2 ~ (time ^ b), start = c(b = 24), trace = T)

    #The coefficient is much closer to the known value
    coef(fit_nls)
    #       b
    #25.04061

    #Plot the data and the two estimates
    plot(time, y2)
    lines(time, time^coef(fit_nls), col = "red")
    lines(time, time^coef(fit_lm)[2], col = "green3")
    legend("topleft", c("fit_lm", "fit_nls"), lwd = 2, col = c("green3", "red"))

plot6


Unfortunately, taking the logarithm and fitting a linear model is not optimal. The reason is that the errors for large y values are much larger than those for small y values when the exponential function is applied to get back to the original model. Here is one example:

    f <- function(x){exp(0.3*x+5)}
    squaredError <- function(a,b,x,y) {sum((exp(a*x+b)-f(x))^2)}

    x <- 0:12
    y <- f(x) * ( 1 + sample(-300:300,length(x),replace=TRUE)/10000 )
    x
    y
    #--------------------------------------------------------------------

    M <- lm(log(y)~x)
    a <- unlist(M[1])[2]
    b <- unlist(M[1])[1]
    print(c(a,b))

    squaredError(a,b,x,y)

    approxPartAbl_a <- (squaredError(a+1e-8,b,x,y) - squaredError(a,b,x,y))/1e-8

    for ( i in 0:10 )
    {
      eps <- -i*sign(approxPartAbl_a)*1e-5
      print(c(eps,squaredError(a+eps,b,x,y)))
    }

Result:

    > f <- function(x){exp(0.3*x+5)}
    > squaredError <- function(a,b,x,y) {sum((exp(a*x+b)-f(x))^2)}
    > x <- 0:12
    > y <- f(x) * ( 1 + sample(-300:300,length(x),replace=TRUE)/10000 )
    > x
     [1]  0  1  2  3  4  5  6  7  8  9 10 11 12
    > y
     [1] 151.2182 203.4020 278.3769 366.8992 503.5895 682.4353 880.1597 1186.5158 1630.9129 2238.1607 3035.8076 4094.6925 5559.3036
    > #--------------------------------------------------------------------
    >
    > M <- lm(log(y)~x)
    > a <- unlist(M[1])[2]
    > b <- unlist(M[1])[1]
    > print(c(a,b))
              coefficients.x coefficients.(Intercept)
                   0.2995808                5.0135529
    > squaredError(a,b,x,y)
    [1] 5409.752
    > approxPartAbl_a <- (squaredError(a+1e-8,b,x,y) - squaredError(a,b,x,y))/1e-8
    > for ( i in 0:10 )
    + {
    +   eps <- -i*sign(approxPartAbl_a)*1e-5
    +   print(c(eps,squaredError(a+eps,b,x,y)))
    + }
    [1] 0.000 5409.752
    [1] -0.00001 5282.91927
    [1] -0.00002 5157.68422
    [1] -0.00003 5034.04589
    [1] -0.00004 4912.00375
    [1] -0.00005 4791.55728
    [1] -0.00006 4672.70592
    [1] -0.00007 4555.44917
    [1] -0.00008 4439.78647
    [1] -0.00009 4325.71730
    [1] -0.0001 4213.2411
    >

Perhaps you can try some kind of numerical method, e.g. a gradient search, to find the minimum of the squared-error function.
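For instance, here is a minimal sketch of that idea using base R's optim() (a general-purpose optimizer, rather than a hand-written gradient search), minimizing the squared error against the observed y and starting from the log-linear estimates a and b computed above:

    # Sketch (assumes x, y, a, b from the example above): minimize the
    # squared error against the observed y with optim() instead of a
    # hand-rolled gradient search.
    sqErr <- function(p, x, y) sum((exp(p[1]*x + p[2]) - y)^2)
    opt <- optim(c(a, b), sqErr, x = x, y = y)
    opt$par   # should land near the true values 0.3 and 5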

Of course, this is not a complete answer, so please don't downvote me.


If this is truly exponential, you can try taking the logarithm of your variable and fitting a linear model to it.
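For example (a minimal sketch, assuming the data really follow y = exp(a*time + b), with time and y as in the question):

    # If y = exp(a*time + b), then log(y) = a*time + b, so lm() can
    # estimate a and b directly (time and y as given in the question).
    fit <- lm(log(y) ~ time)
    coef(fit)                 # "(Intercept)" is b, "time" is a
    plot(time, y)
    lines(time, exp(fitted(fit)), col = "red")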

