The derivative of the sum is the sum of the derivatives, i.e.
d(f1 + f2 + f3 + f4)/dx = df1/dx + df2/dx + df3/dx + df4/dx
To get the derivative of p_j with respect to o_i, we start with:
d_i(p_j) = d_i(exp(o_j) / Sum_k(exp(o_k)))
I use d_i to denote the derivative with respect to o_i, to make it easier to read. Treating p_j as the product of exp(o_j) and 1/Sum_k(exp(o_k)) and using the product rule, we get:
d_i(exp(o_j)) / Sum_k(exp(o_k)) + exp(o_j) * d_i(1/Sum_k(exp(o_k)))
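As a sanity check, the two product-rule terms can be compared against a finite-difference estimate. This is only a minimal sketch: the function names and the example inputs are made up for illustration, and the second term assumes the chain-rule result d_i(1/S) = -exp(o_i) / S^2, where S = Sum_k(exp(o_k)).

```python
import math

def softmax(o):
    """Return p_j = exp(o_j) / Sum_k(exp(o_k)) for a list of floats."""
    exps = [math.exp(v) for v in o]
    total = sum(exps)
    return [e / total for e in exps]

def product_rule_terms(o, i, j):
    """Return the two product-rule terms of d_i(p_j) (hypothetical helper)."""
    exps = [math.exp(v) for v in o]
    total = sum(exps)
    # First term: d_i(exp(o_j)) / Sum_k(exp(o_k)); nonzero only when i == j.
    first = (exps[j] if i == j else 0.0) / total
    # Second term: exp(o_j) * d_i(1 / Sum_k(exp(o_k))), with the assumed
    # chain-rule result d_i(1/S) = -exp(o_i) / S**2.
    second = exps[j] * (-exps[i] / total ** 2)
    return first, second

def numeric_derivative(o, i, j, h=1e-6):
    """Central finite-difference estimate of d_i(p_j)."""
    plus = list(o)
    plus[i] += h
    minus = list(o)
    minus[i] -= h
    return (softmax(plus)[j] - softmax(minus)[j]) / (2 * h)

o = [0.5, -1.2, 2.0]  # arbitrary example inputs, not from the original post

# Largest gap between the analytic sum of both terms and the numeric estimate.
max_error = max(
    abs(sum(product_rule_terms(o, i, j)) - numeric_derivative(o, i, j))
    for i in range(len(o))
    for j in range(len(o))
)
print(max_error)
```

If the expansion is right, the printed error should be tiny, limited only by floating-point round-off in the finite difference.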
Looking at the first term, the derivative is 0 if i != j and exp(o_j) if i == j. This can be represented using the Kronecker delta function, which is 1 when i == j and 0 otherwise.
SirGuy