Minimizing a multidimensional, differentiable function using scipy.optimize

I am trying to minimize the following function using scipy.optimize:

[image: the objective function]

whose gradient is this:

[image: the gradient of the objective function]

(For those interested, this is the likelihood function of the Bradley-Terry-Luce model for pairwise comparisons. Very closely related to logistic regression.)

It is quite clear that adding a constant to all the parameters does not change the value of the function. Therefore, I fix \theta_1 = 0. Here is the implementation of the objective function and the gradient in Python (theta is called x here):

    import numpy as np
    import scipy.misc  # logsumexp lives in scipy.special in newer SciPy versions

    def objective(x):
        # cijs (the matrix of pairwise comparison counts) is a global here.
        zeros = np.zeros(cijs.shape)
        x = np.insert(x, 0, 0.0)
        tiles = np.tile(x, (len(x), 1))
        combs = tiles.T - tiles
        exps = np.dstack((zeros, combs))
        return np.sum(cijs * scipy.misc.logsumexp(exps, axis=2))

    def gradient(x):
        zeros = np.zeros(cijs.shape)
        x = np.insert(x, 0, 0.0)
        tiles = np.tile(x, (len(x), 1))
        combs = tiles - tiles.T
        one = 1.0 / (np.exp(combs) + 1)
        two = 1.0 / (np.exp(combs.T) + 1)
        mat = (cijs * one) + (cijs.T * two)
        grad = np.sum(mat, axis=0)
        return grad[1:]  # Don't return the first element

Here is an example of what cijs might look like:

    [[ 0  5  1  4  6]
     [ 4  0  2  2  0]
     [ 6  4  0  9  3]
     [ 6  8  3  0  5]
     [10  7 11  4  0]]
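As a quick sanity check of the shift-invariance claim above, here is a minimal sketch using that example matrix and an unreduced variant of the objective (i.e. without pinning \theta_1 = 0; full_objective is a made-up helper name, not part of my actual code):

    import numpy as np
    from scipy.special import logsumexp  # scipy.misc.logsumexp in older SciPy releases

    cijs = np.array([[ 0,  5,  1,  4,  6],
                     [ 4,  0,  2,  2,  0],
                     [ 6,  4,  0,  9,  3],
                     [ 6,  8,  3,  0,  5],
                     [10,  7, 11,  4,  0]])

    def full_objective(theta):
        # Same construction as objective() above, but over the full theta vector,
        # so the invariance to adding a constant can be observed directly.
        tiles = np.tile(theta, (len(theta), 1))
        combs = tiles.T - tiles
        exps = np.dstack((np.zeros(cijs.shape), combs))
        return np.sum(cijs * logsumexp(exps, axis=2))

    theta = np.random.random(len(cijs))
    print(np.isclose(full_objective(theta), full_objective(theta + 42.0)))  # True

This invariance is also why one parameter has to be fixed: otherwise the minimum is not unique.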

This is the code that I executed to do the minimization:

    import scipy.optimize

    x0 = np.random.random(nb_items - 1)

    # Let's try one algorithm...
    xopt1 = scipy.optimize.fmin_bfgs(objective, x0, fprime=gradient, disp=True)
    # ...and another one.
    xopt2 = scipy.optimize.fmin_cg(objective, x0, fprime=gradient, disp=True)

However, in the first iteration, it always fails:

    Warning: Desired error not necessarily achieved due to precision loss.
             Current function value: 73.290610
             Iterations: 0
             Function evaluations: 38
             Gradient evaluations: 27

I cannot understand why it fails. The warning is emitted by this line: https://github.com/scipy/scipy/blob/master/scipy/optimize/optimize.py#L853

So this Wolfe line search seems to be failing, but I don't know how to proceed from here... Any help is appreciated!

python numpy scipy mathematical-optimization

2 answers

As @pv. pointed out in a comment, there was an error in my gradient computation. First of all, here is the correct (mathematical) expression for the gradient of my objective function:

[image: the corrected gradient expression]

(Note the minus sign.) Beyond the sign error, my Python implementation was also completely off. Here is my updated gradient:

    def gradient(x):
        nb_comparisons = cijs + cijs.T
        x = np.insert(x, 0, 0.0)
        tiles = np.tile(x, (len(x), 1))
        combs = tiles - tiles.T
        probs = 1.0 / (np.exp(combs) + 1)
        mat = (nb_comparisons * probs) - cijs
        grad = np.sum(mat, axis=1)
        return grad[1:]  # Don't return the first element.

To debug it, I used:

  • scipy.optimize.check_grad: it showed that my gradient function was producing results very far from the approximate (finite-difference) gradient (see the minimal sketch after this list).
  • scipy.optimize.approx_fprime, to get an idea of what the values should look like.
  • some simple examples that could be worked out by hand if needed, and a few Wolfram Alpha queries for sanity checks.
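For illustration, here is a minimal sketch of how the first two helpers can be wired up (it assumes the objective, gradient, cijs and nb_items defined in the question are in scope):

    import numpy as np
    from scipy.optimize import check_grad, approx_fprime

    x0 = np.random.random(nb_items - 1)

    # Norm of the difference between the analytical and finite-difference gradients;
    # a large value points to a bug in gradient().
    print(check_grad(objective, gradient, x0))

    # The finite-difference gradient itself, to see what the values should look like.
    eps = np.sqrt(np.finfo(float).eps)
    print(approx_fprime(x0, objective, eps))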

It seems you can transform this into a (non-linear) least-squares problem. To do so, you have to define intervals for each of the n variables and the number of sample points for each variable, in order to build the coefficient matrix.

In this example, I use the same number of points and the same interval for all variables:

    from scipy.optimize import leastsq
    from numpy import exp, linspace, zeros, ones

    n = 4
    npts = 1000
    xs = [linspace(0, 1, npts) for _ in range(n)]
    c = ones(n**2)
    a = zeros((n*npts, n**2))

    def residual(c):
        a.fill(0)
        for i in range(n):
            for j in range(n):
                for k in range(npts):
                    a[i+k*n, i*n+j] = 1/(exp(xs[i][k] - xs[j][k]) + 1)
                    a[i+k*n, j*n+i] = 1/(exp(xs[j][k] - xs[i][k]) + 1)
        return a.dot(c)

    popt, pconv = leastsq(residual, x0=c)
    print(popt.reshape(n, n))
    #[[ -1.24886411   1.07854552  -2.67212118   1.86334625]
    # [ -7.43330057   2.0935734   37.85989442   1.37005925]
    # [ -3.51761322 -37.49627917  24.90538136  -4.23103535]
    # [ 11.93000731   2.52750715 -14.84822686   1.38834225]]

EDIT: More details on the coefficient matrix built above:

[image: structure of the coefficient matrix; following the code above, a has shape (n*npts, n**2), with a[i+k*n, i*n+j] = 1/(exp(xs[i][k] - xs[j][k]) + 1) and a[i+k*n, j*n+i] = 1/(exp(xs[j][k] - xs[i][k]) + 1)]

