Minimizing multiparameter functions using scipy. Derivatives are not known.

I have a function that is actually a call to another program (some Fortran code). When I call this function ( run_moog ), I can pass it 4 variables, and it returns 6 values. These values should all be close to 0 (in order to minimize), so I combined them like this: np.sum(results**2) . Now I have a scalar function. I would like to minimize this function, i.e. get np.sum(results**2) as close to zero as possible.
Note: when this function ( run_moog ) takes the 4 input parameters, it creates an input file for the Fortran code, which depends on these parameters.
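
For context, here is a minimal sketch of what such a wrapper might look like. The file names, the input format and the way the Fortran program is invoked are pure assumptions; only the final np.sum(results**2) step is the one described above:

    import subprocess
    import numpy as np

    def fun_mmog(params):
        teff, logg, feh, vt = params
        # Write the 4 parameters to the input file the Fortran code expects
        # (hypothetical file name and format).
        with open('batch.par', 'w') as f:
            f.write('%f %f %f %f\n' % (teff, logg, feh, vt))
        # Run the external Fortran program (hypothetical invocation).
        subprocess.call(['./run_moog'])
        # Read the 6 result values back (hypothetical output file) and
        # collapse them into one scalar to be minimized.
        results = np.loadtxt('moog_output.txt')
        return np.sum(results**2)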

I have tried several of the optimizers from the scipy docs, but none of them works as expected. The minimization needs to have bounds on the 4 variables. Here is an attempt:

    from scipy.optimize import minimize  # Tried others as well from the docs

    x0 = 4435, 3.54, 0.13, 2.4
    bounds = [(4000, 6000), (3.00, 4.50), (-0.1, 0.1), (0.0, None)]
    a = minimize(fun_mmog, x0, bounds=bounds, method='L-BFGS-B')  # I've tried several different methods here
    print a

It gives me

      status: 0
     success: True
        nfev: 5
         fun: 2.3194639999999964
           x: array([ 4.43500000e+03, 3.54000000e+00, 1.00000000e-01, 2.40000000e+00])
     message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
         jac: array([ 0., 0., -54090399.99999981, 0.])
         nit: 0

Only the third parameter changes slightly, while the rest stay exactly the same. There were also 5 function evaluations ( nfev ), but no iterations ( nit ). That is the output I get from scipy.

python scipy mathematical-optimization minimization
4 answers

A couple of possibilities:

  • Try COBYLA. It is derivative-free and supports inequality constraints.
  • You can't use different epsilons through the normal interface, so try scaling your first variable by 1e4: divide it on the way in and scale the result back on the way out (see the sketch at the end of this answer).
  • Skip the usual automatic jacobian constructor and make your own:

Suppose you try to use SLSQP, and you don't provide a jacobian function. It makes one for you. The code for that is approx_jacobian in slsqp.py . Here's a condensed version:

    def approx_jacobian(x, func, epsilon, *args):
        x0 = asfarray(x)
        f0 = atleast_1d(func(*((x0,) + args)))
        jac = zeros([len(x0), len(f0)])
        dx = zeros(len(x0))
        for i in range(len(x0)):
            dx[i] = epsilon
            jac[i] = (func(*((x0 + dx,) + args)) - f0) / epsilon
            dx[i] = 0.0
        return jac.transpose()

You can try replacing this loop as follows:

    for (i, e) in zip(range(len(x0)), epsilon):
        dx[i] = e
        jac[i] = (func(*((x0 + dx,) + args)) - f0) / e
        dx[i] = 0.0

You can't provide this as the jacobian to minimize directly, but fixing that up is straightforward:

    def construct_jacobian(func, epsilon):
        def jac(x, *args):
            x0 = asfarray(x)
            f0 = atleast_1d(func(*((x0,) + args)))
            jac = zeros([len(x0), len(f0)])
            dx = zeros(len(x0))
            # Use a per-variable step size, as in the modified loop above,
            # so that epsilon can be a sequence of 4 different values.
            for (i, e) in zip(range(len(x0)), epsilon):
                dx[i] = e
                jac[i] = (func(*((x0 + dx,) + args)) - f0) / e
                dx[i] = 0.0
            return jac.transpose()
        return jac

Then you can call minimize as:

    minimize(fun_mmog, x0,
             jac=construct_jacobian(fun_mmog, [1e0, 1e-4, 1e-4, 1e-4]),
             bounds=bounds, method='SLSQP')
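
As for the first two bullets above, a rough sketch might look like the following. fun_mmog, x0 and the bounds are the ones from the question; everything else (the scale factor, the COBYLA options) is an assumption. COBYLA takes no bounds argument, so the bounds are rewritten as inequality constraints:

    import numpy as np
    from scipy.optimize import minimize

    def scaled_fun(x):
        # Undo the scaling of the first variable before calling the real function,
        # so the optimizer sees all parameters at roughly the same scale.
        return fun_mmog([x[0] * 1e4, x[1], x[2], x[3]])

    x0_scaled = [0.4435, 3.54, 0.13, 2.4]
    bounds_scaled = [(0.4, 0.6), (3.00, 4.50), (-0.1, 0.1), (0.0, None)]

    # Express each bound as an inequality constraint (>= 0 when feasible).
    constraints = []
    for i, (lo, hi) in enumerate(bounds_scaled):
        if lo is not None:
            constraints.append({'type': 'ineq', 'fun': lambda x, i=i, lo=lo: x[i] - lo})
        if hi is not None:
            constraints.append({'type': 'ineq', 'fun': lambda x, i=i, hi=hi: hi - x[i]})

    res = minimize(scaled_fun, x0_scaled, method='COBYLA',
                   constraints=constraints, options={'rhobeg': 0.05})
    print res.x[0] * 1e4, res.x[1:]  # scale the first variable back up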

It looks like your objective function doesn't have well-behaved derivatives. The line jac: array([ 0., 0., -54090399.99999981, 0.]) in the output means that only changing the third variable value is significant. And because the derivative with respect to this variable is virtually infinite, there is probably something wrong in the function. That is also why the third variable value ends up at its maximum.

I would suggest taking a look at the derivatives, at least at a few points in your parameter space. Compute them using finite differences and the default step size of SciPy's fmin_l_bfgs_b , 1e-8 . Here is an example of how you could compute the derivatives:
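
(A sketch; fun_mmog and the starting point are assumed to be the ones from the question, and the two step sizes are just values to compare.)

    import numpy as np

    def fd_gradient(func, x, eps=1e-8):
        # Forward-difference estimate of the gradient, one component at a time.
        x = np.asarray(x, dtype=float)
        f0 = func(x)
        grad = np.zeros_like(x)
        for i in range(len(x)):
            step = np.zeros_like(x)
            step[i] = eps
            grad[i] = (func(x + step) - f0) / eps
        return grad

    x0 = np.array([4435.0, 3.54, 0.13, 2.4])
    print fd_gradient(fun_mmog, x0)        # with the 1e-8 default step
    print fd_gradient(fun_mmog, x0, 1e-4)  # and with a larger step, for comparison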

Also try plotting the objective function. For example, keep two of the parameters fixed and let the other two vary. If the function has multiple local optima, you shouldn't use gradient-based methods such as BFGS.


How hard is it to get an analytic expression for the gradient? If you have one, you can then approximate the product of the Hessian with a vector using finite differences. Then you can use other optimization routines that are available.
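
A sketch of that idea, assuming you had a hypothetical analytic gradient grad_f for fun_mmog (and x0 from the question): the Hessian-vector product is approximated by differencing the gradient, which methods such as Newton-CG can consume via hessp. Note that Newton-CG ignores the bounds, so this only illustrates the mechanism:

    import numpy as np
    from scipy.optimize import minimize

    def make_hessp(grad_f, r=1e-7):
        # Approximate H(x) . v by a finite difference of the analytic gradient:
        #   H(x) . v  ~=  (grad_f(x + r*v) - grad_f(x)) / r
        def hessp(x, v):
            x = np.asarray(x, dtype=float)
            v = np.asarray(v, dtype=float)
            return (grad_f(x + r * v) - grad_f(x)) / r
        return hessp

    res = minimize(fun_mmog, x0, jac=grad_f, hessp=make_hessp(grad_f),
                   method='Newton-CG')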

Among the various optimization routines available in SciPy, the one called TNC (Newton Conjugate-Gradient with Truncation) is quite robust to the numerical values associated with the problem.


The Nelder-Mead simplex method (suggested by Cristián Antuña in the comments above) is well known to be a good choice for optimizing (possibly ill-behaved) functions with no knowledge of derivatives (see Numerical Recipes in C, Chapter 10 ).

There are two somewhat specific aspects to your question. The first is the constraints on the inputs, and the second is a scaling problem. The following suggests solutions to these points, but you might need to manually iterate between them a few times until things work.

Input constraints

Assuming that your input constraints form a convex region (as your examples above indicate, but I'd like to generalize it a bit), then you can write a function

    def is_in_bounds(p):
        pass  # return whether p is inside the bounded region
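
For the box bounds from the question, for instance, this could be as simple as:

    bounds = [(4000, 6000), (3.00, 4.50), (-0.1, 0.1), (0.0, None)]

    def is_in_bounds(p):
        # Check every coordinate against its (lower, upper) pair; None means unbounded.
        for v, (lo, hi) in zip(p, bounds):
            if lo is not None and v < lo:
                return False
            if hi is not None and v > hi:
                return False
        return True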

Using this function, suppose the algorithm wants to move from a point from_ to a point to , where from_ is known to be in the region. Then the following function will efficiently find the furthermost point on the line between the two points at which it can still proceed:

    from numpy.linalg import norm

    def progress_within_bounds(from_, to, eps):
        """
        from_ -- source (in region)
        to -- target point
        eps -- Euclidean precision along the line
        """
        if norm(to - from_) < eps:
            return from_
        mid = (from_ + to) / 2
        if is_in_bounds(mid):
            return progress_within_bounds(mid, to, eps)
        return progress_within_bounds(from_, mid, eps)

(Note that this function can be optimized for some regions, but it's hardly worth the bother, as it doesn't even call your original objective function, which is the expensive one.)

One of the nice aspects of Nelder-Mead is that the function proceeds by a series of steps which are very intuitive. Some of these steps can obviously throw you out of the region, but this is easy to modify. Here is an implementation of Nelder-Mead with the modifications marked between pairs of lines of the form ################################################################## :

    import copy

    '''
        Pure Python/Numpy implementation of the Nelder-Mead algorithm.
        Reference: https://en.wikipedia.org/wiki/Nelder%E2%80%93Mead_method
    '''

    def nelder_mead(f, x_start,
                    step=0.1, no_improve_thr=10e-6, no_improv_break=10, max_iter=0,
                    alpha=1., gamma=2., rho=-0.5, sigma=0.5,
                    prog_eps=1e-6):  # precision handed to progress_within_bounds (assumed default)
        '''
            @param f (function): function to optimize, must return a scalar score
                and operate over a numpy array of the same dimensions as x_start
            @param x_start (numpy array): initial position
            @param step (float): look-around radius in initial step
            @no_improv_thr, no_improv_break (float, int): break after no_improv_break iterations
                with an improvement lower than no_improv_thr
            @max_iter (int): always break after this number of iterations.
                Set it to 0 to loop indefinitely.
            @alpha, gamma, rho, sigma (floats): parameters of the algorithm
                (see Wikipedia page for reference)
        '''

        # init
        dim = len(x_start)
        prev_best = f(x_start)
        no_improv = 0
        res = [[x_start, prev_best]]

        for i in range(dim):
            x = copy.copy(x_start)
            x[i] = x[i] + step
            score = f(x)
            res.append([x, score])

        # simplex iter
        iters = 0
        while 1:
            # order
            res.sort(key=lambda x: x[1])
            best = res[0][1]

            # break after max_iter
            if max_iter and iters >= max_iter:
                return res[0]
            iters += 1

            # break after no_improv_break iterations with no improvement
            print '...best so far:', best

            if best < prev_best - no_improve_thr:
                no_improv = 0
                prev_best = best
            else:
                no_improv += 1

            if no_improv >= no_improv_break:
                return res[0]

            # centroid
            x0 = [0.] * dim
            for tup in res[:-1]:
                for i, c in enumerate(tup[0]):
                    x0[i] += c / (len(res)-1)

            # reflection
            xr = x0 + alpha*(x0 - res[-1][0])
            ##################################################################
            ##################################################################
            xr = progress_within_bounds(x0, x0 + alpha*(x0 - res[-1][0]), prog_eps)
            ##################################################################
            ##################################################################
            rscore = f(xr)
            if res[0][1] <= rscore < res[-2][1]:
                del res[-1]
                res.append([xr, rscore])
                continue

            # expansion
            if rscore < res[0][1]:
                xe = x0 + gamma*(x0 - res[-1][0])
                ##################################################################
                ##################################################################
                xe = progress_within_bounds(x0, x0 + gamma*(x0 - res[-1][0]), prog_eps)
                ##################################################################
                ##################################################################
                escore = f(xe)
                if escore < rscore:
                    del res[-1]
                    res.append([xe, escore])
                    continue
                else:
                    del res[-1]
                    res.append([xr, rscore])
                    continue

            # contraction
            xc = x0 + rho*(x0 - res[-1][0])
            ##################################################################
            ##################################################################
            xc = progress_within_bounds(x0, x0 + rho*(x0 - res[-1][0]), prog_eps)
            ##################################################################
            ##################################################################
            cscore = f(xc)
            if cscore < res[-1][1]:
                del res[-1]
                res.append([xc, cscore])
                continue

            # reduction
            x1 = res[0][0]
            nres = []
            for tup in res:
                redx = x1 + sigma*(tup[0] - x1)
                score = f(redx)
                nres.append([redx, score])
            res = nres
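
Assuming fun_mmog and the helpers above are in scope, a call for the problem in the question might look something like this. The step, tolerance and iteration limit are only guesses; note that the same step is used for every coordinate, which is exactly where the scaling issue below comes in:

    import numpy as np

    x_start = np.array([4435.0, 3.54, 0.13, 2.4])
    best_x, best_score = nelder_mead(fun_mmog, x_start,
                                     step=0.1, no_improve_thr=1e-6,
                                     no_improv_break=10, max_iter=200)
    print best_x, best_score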

Note: this is a GPL'd implementation, which may or may not work for you. It's extremely easy to modify NM from any pseudocode, though, and you might want to throw in simulated annealing in any case.

Scaling

This is a trickier problem, but jasaarim has made an interesting point regarding it. Once the modified NM algorithm has found a point, you might want to run matplotlib.contour while fixing a few dimensions, in order to see how the function behaves. At that point, you might want to rescale one or more of the parameters and rerun the modified NM.
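
A rough sketch of that kind of plot, fixing the last two parameters at the point found by the modified NM (best_x from the usage sketch above) and scanning the first two over a coarse grid; bear in mind that every grid point costs one call to the Fortran code, so keep the grid small:

    import numpy as np
    import matplotlib.pyplot as plt

    p1 = np.linspace(4000, 6000, 15)
    p2 = np.linspace(3.00, 4.50, 15)
    Z = np.zeros((len(p2), len(p1)))
    for i, b in enumerate(p2):
        for j, a in enumerate(p1):
            Z[i, j] = fun_mmog([a, b, best_x[2], best_x[3]])  # fix the last two

    plt.contour(p1, p2, Z, 20)
    plt.xlabel('parameter 1')
    plt.ylabel('parameter 2')
    plt.show()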

