Steepest descent returns unreasonably large values

My implementation of steepest descent for solving Ax = b shows strange behavior: for any sufficiently large matrix (around 10 x 10; I have only tested square matrices), the returned x contains huge values (on the order of 1x10^10).

import warnings
import numpy as np

def steepestDescent(A, b, numIter=100, x=None):
    """Solves Ax = b using the steepest descent method"""
    warnings.filterwarnings(action="error", category=RuntimeWarning)
    # Reshape b in case it has shape (n,)
    b = b.reshape(len(b), 1)
    exes = []
    res = []
    # Make a guess for x if none is provided
    if x is None:
        x = np.zeros((len(A[0]), 1))
    exes.append(x)
    for i in range(numIter):
        # Re-calculate r(i) using r(i) = b - Ax(i) every five iterations
        # to prevent roundoff error. Also calculates the initial direction
        # of steepest descent.
        if (i % 5) == 0:
            r = b - np.dot(A, x)
        # Otherwise use r(i+1) = r(i) - step * Ar(i)
        else:
            r = r - step * np.dot(A, r)
        res.append(r)
        rT = r.transpose()
        # Calculate the step size. Catching the runtime warning allows the
        # function to stop and return before all iterations are completed.
        # This is necessary because once the solution x has been found, r = 0,
        # so the calculation below divides by 0, turning step into "nan",
        # which would then overwrite the correct answer in x with "nan"s.
        try:
            step = np.dot(rT, r) / np.dot(np.dot(rT, A), r)
        except RuntimeWarning:
            warnings.resetwarnings()
            return x, exes, res
        # Update x
        x = x + step * r
        exes.append(x)
    warnings.resetwarnings()
    return x, exes, res

(exes and res are returned for debugging.)
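One way to isolate where things go wrong is to check the residual recurrence in a single iteration by hand. This is a sanity-check sketch (not part of the original post; all names are local to the snippet): for any step size, updating the residual as r - step*A@r must agree with recomputing it directly as b - A@x_new, because b - A(x + step*r) = (b - Ax) - step*A@r.

```python
import numpy as np

# Sanity check: the recursive residual update must match the direct
# definition after one steepest-descent step.
rng = np.random.default_rng(0)
n = 5
A = rng.random((n, n)) + 10 * np.eye(n)
b = rng.random((n, 1))

x = np.zeros((n, 1))
r = b - A @ x                                   # r(0) = b - Ax(0)
step = (r.T @ r).item() / (r.T @ A @ r).item()  # steepest-descent step size

x_new = x + step * r
r_recur = r - step * (A @ r)    # recursive update
r_direct = b - A @ x_new        # direct recomputation
print(np.allclose(r_recur, r_direct))
```

If these two residuals agree each iteration, the blow-up is not coming from the update formulas themselves.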

I suspect the problem lies in the computation of r or step (or in something deeper), but I can't figure out what it is.

1 answer

The code seems to be correct. For example, the following test works for me (linalg.solve and steepestDescent give essentially the same answer):

import numpy as np

n = 100
A = np.random.random(size=(n, n)) + 10 * np.eye(n)
print(np.linalg.eig(A)[0])
b = np.random.random(size=(n, 1))
x, xs, r = steepestDescent(A, b, numIter=50)
print(x - np.linalg.solve(A, b))

The problem is the math. This algorithm is only guaranteed to converge to the correct solution if A is a positive definite matrix. By adding 10 times the identity matrix to a random matrix, we increase the likelihood that all of the eigenvalues are positive.

If you test with large plain random matrices (e.g. A = np.random.random(size=(n,n))), you will almost certainly have a negative eigenvalue, and the algorithm will not converge.
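This is easy to illustrate numerically. The following sketch (with an assumed seed and size, not taken from the original post) compares the smallest real part of the spectrum of a plain random matrix with that of the same matrix shifted by 10*I:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
M = rng.random((n, n))          # plain random matrix
shifted = M + 10 * np.eye(n)    # shifted by 10 * identity

# Shifting by 10*I adds 10 to every eigenvalue, pushing the
# spectrum into the right half-plane.
print(np.min(np.linalg.eigvals(M).real))        # typically negative
print(np.min(np.linalg.eigvals(shifted).real))  # typically positive
```

For a 100 x 100 uniform random matrix, the bulk of the eigenvalues lies in a disk of radius roughly 3 around the origin, so some real parts are almost certainly negative before the shift and comfortably positive after it.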
