Incorrect landing using scipy curve_fit

I am trying to put some data into a power law function with exponential shutdown. I am generating some data with numpy and I am trying to customize this data with scipy.optimization. Here is my code:

import numpy as np from scipy.optimize import curve_fit def func(x, A, B, alpha): return A * x**alpha * np.exp(B * x) xdata = np.linspace(1, 10**8, 1000) ydata = func(xdata, 0.004, -2*10**-8, -0.75) popt, pcov = curve_fit(func, xdata, ydata) print popt 

As a result, I get: [1, 1, 1], which does not match the data. ΒΏAm I doing something wrong?

+6
source share
2 answers

While xnx gave you an answer on why curve_fit failed here, I thought I was proposing another way to approach the problem of fitting your functional form, which does not depend on gradient descent (and therefore, a reasonable initial think)

Please note that if you take the log of a suitable function, you get the form

\ log f = \ log A + \ alpha \ log x + B x

which is linear in each of the unknown parameters (log A, alpha, B)

Therefore, we can use the linear algebra mechanism to solve this by writing an equation in the form of a matrix in the form

log y = M p

Where log y is the column vector of the log of your ydata points, p is the column vector of unknown parameters, and M is the matrix [[1], [log x], [x]]

Or obviously

enter image description here

The best suitable parameter vector can be found efficiently using np.linalg.lstsq

Your sample code problem might be written as

 import numpy as np def func(x, A, B, alpha): return A * x**alpha * np.exp(B * x) A_true = 0.004 alpha_true = -0.75 B_true = -2*10**-8 xdata = np.linspace(1, 10**8, 1000) ydata = func(xdata, A_true, B_true, alpha_true) M = np.vstack([np.ones(len(xdata)), np.log(xdata), xdata]).T logA, alpha, B = np.linalg.lstsq(M, np.log(ydata))[0] print "A =", np.exp(logA) print "alpha =", alpha print "B =", B 

Well restores the original parameters:

 A = 0.00400000003736 alpha = -0.750000000928 B = -1.9999999934e-08 

Also note that this method is about 20 times faster than using curve_fit for this problem

 In [8]: %timeit np.linalg.lstsq(np.vstack([np.ones(len(xdata)), np.log(xdata), xdata]).T, np.log(ydata)) 10000 loops, best of 3: 169 Β΅s per loop In [2]: %timeit curve_fit(func, xdata, ydata, [0.01, -5e-7, -0.4]) 100 loops, best of 3: 4.44 ms per loop 
+4
source

Apparently, your initial assumption (which defaults to [1,1,1] , since you did not specify it - see documents ) is too far from the actual parameters for the algorithm to converge. The main problem is probably related to B , which, if positive, will send your exponential function to very large values ​​for the provided xdata .

Try a little closer to the actual parameters and it works:

 p0 = 0.01, -5e-7, -0.4 # Initial guess for the parameters popt, pcov = curve_fit(func, xdata, ydata, p0) print popt 

Output:

 [ 4.00000000e-03 -2.00000000e-08 -7.50000000e-01] 
+2
source

All Articles