Scipy Fit Fails for Power Law Curve

So, I'm trying to fit a data set with a power law of the following form:

    import scipy.optimize
    from numpy import array

    def f(x, N, a):  # power-law fit
        if a > 0:
            return N * x ** (-a)
        else:
            return 10.**300

    par, cov = scipy.optimize.curve_fit(f, data, time, array([10**(-7), 1.2]))

where the else condition is just there to force a to be positive. Using scipy.optimize.curve_fit gives a terrible fit (green line), returning values of 1.2e+04 and 1.9e-07 for N and a, respectively, which do not overlap with the data at all. From fits I have done by hand, the values should be roughly 1e-07 and 1.2 for N and a, respectively, but passing them to curve_fit as the initial parameters does not change the result. Removing the condition that a be positive gives an even worse fit, since the routine then picks a negative a, which produces a fit with the wrong sign of slope.

I can't get a plausible, let alone reliable, fit out of this routine, and I can't find any other good curve-fitting routines for Python. Do I need to write my own least-squares algorithm, or am I doing something wrong here?
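For reference, a runnable version of this setup (with a hypothetical synthetic array standing in for my actual data and time, which I haven't included here) looks like this:

    import numpy as np
    from scipy.optimize import curve_fit

    def f(x, N, a):  # power law with the hard penalty for a <= 0
        if a > 0:
            return N * x ** (-a)
        else:
            return 10.**300

    # synthetic stand-ins for the real data/time arrays
    data = np.linspace(0.01, 100., 50)
    time = 1e-7 * data ** (-1.2)

    par, cov = curve_fit(f, data, time, p0=[1e-7, 1.2])
    print(par)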

1 answer

UPDATE

In the original post, I showed a solution that uses lmfit, which allows you to place bounds on your parameters. Starting with version 0.17, scipy also allows you to set bounds on your parameters directly (see the documentation). You will find that solution below, after EDIT; it will hopefully serve as a minimal example of how to use scipy's curve_fit with parameter bounds.
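As a quick orientation before the full example: the bounds argument takes a (lower, upper) pair with one entry per fit parameter. A minimal sketch, with made-up placeholder data and bounds:

    import numpy as np
    from scipy.optimize import curve_fit

    def f(x, N, a):
        return N * x ** (-a)

    # hypothetical data, just to make the call runnable
    x = np.linspace(0.01, 100., 50)
    y = 10.5 * x ** (-0.08)

    # N is constrained to [0, 30] and a to [0, 20]
    popt, pcov = curve_fit(f, x, y, bounds=([0, 0], [30., 20.]))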

Original post

As suggested by @Warren Weckesser, you can use lmfit for this task; it lets you assign bounds to your parameters and avoids the "ugly" if statement.

Since you did not provide any data, I created some, shown here:

[plot of the generated noisy data]

They follow the law f(x) = 10.5 * x ** (-0.08)

I fit them, as suggested by @roadrunner66, by transforming the power law into a linear function:

    y = N * x ** a
    ln(y) = ln(N * x ** a)
    ln(y) = a * ln(x) + ln(N)

So I first take np.log of the data and then do the fit. When I now use lmfit, I get the following output:

    [[Variables]]
        lN:   2.35450302 +/- 0.019531 (0.83%) (init= 1.704748)
        a:   -0.08035342 +/- 0.005158 (6.42%) (init=-0.5)

So a is pretty close to the original value, and np.exp(2.35450302) gives 10.53, which is also very close to the original value of N.
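As a quick cross-check (not part of the original answer), an unbounded linear fit on the transformed data should recover similar numbers; a minimal sketch, assuming the lx and ly arrays from the full code below:

    import numpy as np

    # slope is a, intercept is ln(N)
    a_est, lnN_est = np.polyfit(lx, ly, 1)
    N_est = np.exp(lnN_est)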

The plot then looks like this; as you can see, the fit describes the data very well:

[plot of the data with the fitted power law overlaid]

Here is the whole code with a few inline comments:

    import numpy as np
    import matplotlib.pyplot as plt
    from lmfit import minimize, Parameters, report_fit

    # generate some data with noise
    xData = np.linspace(0.01, 100., 50)
    aOrg = 0.08
    Norg = 10.5
    yData = Norg * xData ** (-aOrg) + np.random.normal(0, 0.5, len(xData))
    plt.plot(xData, yData, 'bo')
    plt.show()

    # transform the data so that we can use a linear fit
    lx = np.log(xData)
    ly = np.log(yData)
    plt.plot(lx, ly, 'bo')
    plt.show()

    def decay(params, x, data):
        lN = params['lN'].value
        a = params['a'].value
        # our linear model
        model = a * x + lN
        return model - data  # that's what you want to minimize

    # create a set of Parameters
    params = Parameters()
    params.add('lN', value=np.log(5.5), min=0.01, max=100)  # value is the initial value
    params.add('a', value=-0.5, min=-1, max=-0.001)         # min, max define the parameter bounds

    # do the fit, here with the leastsq algorithm
    result = minimize(decay, params, args=(lx, ly))

    # write the error report
    report_fit(result)

    # plot the data together with the fit
    xnew = np.linspace(0.01, 100., 5000)
    plt.plot(xData, yData, 'bo')
    plt.plot(xnew, np.exp(result.params['lN'].value) * xnew ** result.params['a'].value, 'r')
    plt.show()
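As a side note, not part of the original answer: lmfit also ships a ready-made PowerLawModel (amplitude * x ** exponent), so the log transform can be skipped entirely. A sketch, assuming the same xData and yData as above:

    from lmfit.models import PowerLawModel

    model = PowerLawModel()
    params = model.make_params(amplitude=5.5, exponent=-0.5)
    params['exponent'].set(min=-1, max=-0.001)  # same bounds as in the linearized fit
    fit = model.fit(yData, params, x=xData)
    print(fit.fit_report())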

EDIT

Assuming you have scipy 0.17 (or newer) installed, you can also do this directly with curve_fit. I show it for your original definition of the power law (red line in the plot below) as well as for the logarithmic data (black line in the plot below). The data are generated as described above. The plot looks like this:

[plot of the data with both fits: red for the direct power-law fit, black for the fit to the logarithmic data]

As you can see, the data are described very well. If you print popt and popt_log, you get array([ 10.47463426, 0.07914812]) and array([ 2.35158653, -0.08045776]), respectively (note that for the latter you have to take the exponential of the first entry: np.exp(popt_log[0]) = 10.502, which is close to the original value of N).
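A small addition: if you also want standard errors for the fitted parameters, the usual idiom is to take the square root of the diagonal of the covariance matrix returned by curve_fit (assuming pcov and pcov_log from the full code below):

    import numpy as np

    perr = np.sqrt(np.diag(pcov))          # 1-sigma errors of N and a
    perr_log = np.sqrt(np.diag(pcov_log))  # 1-sigma errors of ln(N) and the slope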

Here is the whole code:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.optimize import curve_fit

    # generate some data with noise
    xData = np.linspace(0.01, 100., 50)
    aOrg = 0.08
    Norg = 10.5
    yData = Norg * xData ** (-aOrg) + np.random.normal(0, 0.5, len(xData))

    # get logarithmic data
    lx = np.log(xData)
    ly = np.log(yData)

    def f(x, N, a):
        return N * x ** (-a)

    def f_log(x, lN, a):
        return a * x + lN

    # optimize using the appropriate bounds
    popt, pcov = curve_fit(f, xData, yData, bounds=(0, [30., 20.]))
    popt_log, pcov_log = curve_fit(f_log, lx, ly, bounds=([0, -10], [30., 20.]))

    xnew = np.linspace(0.01, 100., 5000)

    # plot the data and both fits
    plt.plot(xData, yData, 'bo')
    plt.plot(xnew, f(xnew, *popt), 'r')
    plt.plot(xnew, f(xnew, np.exp(popt_log[0]), -popt_log[1]), 'k')
    plt.show()
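Two small practical notes, not part of the original answer: the noise comes from np.random.normal, so the exact numbers vary between runs unless you seed the generator, and the log-space result converts back to the original (N, a) parameterization as follows:

    import numpy as np

    np.random.seed(0)  # place before the data generation for reproducible noise

    # convert the log-space fit back: intercept -> N, slope -> -a
    N_fit = np.exp(popt_log[0])
    a_fit = -popt_log[1]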
