I am taking a class on convolutional neural networks. I am trying to implement the gradient of the SVM loss function (I have a copy of the solution), but I am having trouble understanding why the solution is correct.
On this page, she defines the gradient of the loss function as follows:
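(The formula itself did not come through here; the expressions below are my transcription of the standard multiclass SVM gradient from the notes, so the notation may differ slightly from the page.)

\nabla_{w_{y_i}} L_i = -\left( \sum_{j \neq y_i} \mathbb{1}\left( w_j^T x_i - w_{y_i}^T x_i + \Delta > 0 \right) \right) x_i

\nabla_{w_j} L_i = \mathbb{1}\left( w_j^T x_i - w_{y_i}^T x_i + \Delta > 0 \right) x_i \quad (j \neq y_i)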
In my code, the analytic gradient matches the numerical gradient when it is implemented like this:
dW = np.zeros(W.shape)  # initialize the gradient as zero

# compute the loss and the gradient
num_classes = W.shape[1]
num_train = X.shape[0]
loss = 0.0
for i in xrange(num_train):
    scores = X[i].dot(W)
    correct_class_score = scores[y[i]]
    for j in xrange(num_classes):
        if j == y[i]:
            continue
        margin = scores[j] - correct_class_score + 1  # note delta = 1
        if margin > 0:
            dW[:, y[i]] += -X[i]
            dW[:, j] += X[i]  # gradient update for incorrect rows
            loss += margin
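For reference, this is a minimal, self-contained version of the numerical check I mean (the toy data and helper are my own, not the assignment's checker, and I use range instead of xrange so it runs under Python 3; a full version would also average over num_train and add regularization, which I left out):

import numpy as np

def svm_loss_naive(W, X, y):
    # Same loop structure as the solution code above; no averaging or
    # regularization, just enough to compare gradients.
    dW = np.zeros(W.shape)
    loss = 0.0
    num_classes = W.shape[1]
    num_train = X.shape[0]
    for i in range(num_train):
        scores = X[i].dot(W)
        correct_class_score = scores[y[i]]
        for j in range(num_classes):
            if j == y[i]:
                continue
            margin = scores[j] - correct_class_score + 1  # delta = 1
            if margin > 0:
                dW[:, y[i]] += -X[i]
                dW[:, j] += X[i]
                loss += margin
    return loss, dW

# Tiny made-up problem, purely illustrative.
rng = np.random.RandomState(0)
X = rng.randn(5, 3)              # 5 examples, 3 features
y = rng.randint(0, 4, size=5)    # 4 classes
W = 0.01 * rng.randn(3, 4)

loss, dW = svm_loss_naive(W, X, y)

# Centered-difference numerical gradient of the same loss.
h = 1e-5
dW_num = np.zeros_like(W)
for idx in np.ndindex(*W.shape):
    W[idx] += h
    loss_plus, _ = svm_loss_naive(W, X, y)
    W[idx] -= 2 * h
    loss_minus, _ = svm_loss_naive(W, X, y)
    W[idx] += h                  # restore the original weight
    dW_num[idx] = (loss_plus - loss_minus) / (2 * h)

print(np.max(np.abs(dW - dW_num)))  # should be on the order of 1e-10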
However, from the notes it seems that dW[:, y[i]] should be changed every time j == y[i], since we subtract whenever j == y[i]. I am very confused why the code is not written like this instead:
dW = np.zeros(W.shape)  # initialize the gradient as zero

# compute the loss and the gradient
num_classes = W.shape[1]
num_train = X.shape[0]
loss = 0.0
for i in xrange(num_train):
    scores = X[i].dot(W)
    correct_class_score = scores[y[i]]
    for j in xrange(num_classes):
        if j == y[i]:
            if margin > 0:
                dW[:, y[i]] += -X[i]
            continue
        margin = scores[j] - correct_class_score + 1  # note delta = 1
        if margin > 0:
            dW[:, j] += X[i]  # gradient update for incorrect rows
            loss += margin
and the loss will change when j == y[i]. Why are they both calculated only when j != y[i]?
python computer-vision svm linear-regression gradient-descent
David