I am trying to understand a simple implementation of a Softmax classifier from this link - CS231n - Convolutional Neural Networks for Visual Recognition. In the example, they generate 300 points in two-dimensional space, each with an associated class label, and the softmax classifier learns which class each point belongs to.
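For context, the notes build the toy dataset as three spiral-shaped classes of 100 points each. This is a sketch of that setup as I understand it from the linked page (the constants N, D, K and the spiral construction are taken from there, so please check the link for the exact code):

import numpy as np

N = 100  # number of points per class
D = 2    # dimensionality
K = 3    # number of classes
X = np.zeros((N * K, D))            # data matrix (each row = one example)
y = np.zeros(N * K, dtype='uint8')  # class labels
for j in range(K):
    ix = range(N * j, N * (j + 1))
    r = np.linspace(0.0, 1, N)  # radius
    t = np.linspace(j * 4, (j + 1) * 4, N) + np.random.randn(N) * 0.2  # angle
    X[ix] = np.c_[r * np.sin(t), r * np.cos(t)]
    y[ix] = j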
Here is the complete softmax classifier code (it is also in the link I provided):
import numpy as np

# initialize parameters randomly (X, y, D, K come from the data setup above)
W = 0.01 * np.random.randn(D, K)
b = np.zeros((1, K))

# some hyperparameters
step_size = 1e-0
reg = 1e-3  # regularization strength

# gradient descent loop
num_examples = X.shape[0]
for i in range(200):
    # evaluate class scores, [N x K]
    scores = np.dot(X, W) + b

    # compute the class probabilities with softmax, [N x K]
    exp_scores = np.exp(scores)
    probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

    # compute the loss: average cross-entropy loss plus regularization
    correct_logprobs = -np.log(probs[range(num_examples), y])
    data_loss = np.sum(correct_logprobs) / num_examples
    reg_loss = 0.5 * reg * np.sum(W * W)
    loss = data_loss + reg_loss
    if i % 10 == 0:
        print("iteration %d: loss %f" % (i, loss))

    # compute the gradient on the scores
    dscores = probs
    dscores[range(num_examples), y] -= 1
    dscores /= num_examples

    # backpropagate the gradient to the parameters (W, b)
    dW = np.dot(X.T, dscores)
    db = np.sum(dscores, axis=0, keepdims=True)
    dW += reg * W  # regularization gradient

    # perform a parameter update
    W += -step_size * dW
    b += -step_size * db
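To check that I follow the forward pass: as far as I understand, the code computes the softmax probabilities of the scores s = XW + b and the regularized cross-entropy loss, i.e. (writing N for num_examples and \lambda for reg):

\[
p_k = \frac{e^{s_k}}{\sum_j e^{s_j}}, \qquad
L_i = -\log p_{y_i}, \qquad
L = \frac{1}{N} \sum_i L_i + \frac{\lambda}{2} \sum_{k,l} W_{k,l}^2
\]

That part makes sense to me. It is the backward pass that I am stuck on.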
I cannot understand how they calculated the gradient. I guess it happens in these lines:
dW = np.dot(X.T, dscores)
db = np.sum(dscores, axis=0, keepdims=True)
dW += reg * W  # regularization gradient
How do these lines compute the gradient? Why is dW equal to np.dot(X.T, dscores)? Why is db equal to np.sum(dscores, axis=0, keepdims=True)? And why is reg*W added to dW as the regularization gradient?
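For what it's worth, my best guess is that the dscores lines implement the derivative of the loss with respect to the scores, which (if I read the code correctly, since it subtracts 1 from the probability of the correct class and then divides by num_examples) would be

\[
\frac{\partial L_i}{\partial s_j} = p_j - \mathbb{1}(j = y_i)
\]

But I do not understand how that expression is derived, or how it leads to the dW and db lines above.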
I went through the CS231n - Convolutional Neural Networks for Visual Recognition notes again, but they do not derive this step in detail. I also searched Stack Overflow but could not find an explanation. Any help understanding these three lines would be appreciated.