Wouldn't initializing the weights to 0 be a better idea? That way, couldn't the weights find their values (positive or negative) faster?
How does symmetry breaking make learning faster?
If you initialize all weights to zero, then every neuron in every layer performs the same calculation and produces the same output, which makes the deep network useless. With zero weights, the complexity of the whole deep network is no greater than that of a single neuron, and the predictions are no better than random.
Neurons that sit side by side in a hidden layer and are connected to the same inputs must have different weights for the learning algorithm to be able to update them differently.
By making the weights non-zero (but small, e.g. around 0.1), the algorithm can learn the weights in subsequent iterations and does not get stuck. This is how symmetry is broken, as the sketch below illustrates.
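A minimal NumPy sketch (hypothetical numbers, squared-error loss, tanh activation) of the symmetry problem: when every weight starts at the same constant, every hidden unit computes the same value and receives exactly the same gradient, so the units can never become different from one another.

```python
import numpy as np

x = np.array([[0.5, -1.2, 0.3]])   # one input sample with 3 features
y = np.array([[1.0]])              # target value

W1 = np.full((3, 4), 0.1)          # all hidden weights start at the same constant
W2 = np.full((4, 1), 0.1)          # all output weights start at the same constant

h = np.tanh(x @ W1)                # every hidden unit computes the same value
y_hat = h @ W2

# Backpropagation for squared error: because the hidden units are identical,
# every column of dW1 (one column per hidden unit) is identical too,
# so a gradient step keeps the units identical -- symmetry is never broken.
d_out = y_hat - y
dW2 = h.T @ d_out
dW1 = x.T @ ((d_out @ W2.T) * (1 - h ** 2))

print(dW1)                         # all 4 columns are the same
```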
- Is there any other underlying reason for randomizing the weights, besides the hope that they will start close to their optimal values?
Stochastic optimization algorithms, such as stochastic gradient descent, use randomness when choosing the starting point for the search and during the search.
The progress of this search, i.e. the training of the neural network, is referred to as convergence. Settling on a suboptimal solution, a local optimum, leads to premature convergence.
Instead of relying on a single run, if you run your algorithm several times with different random weights, there is a much better chance of finding the global optimum without getting stuck in a local one (see the sketch below).
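Here is a minimal sketch of that "random restarts" idea: plain gradient descent on a made-up non-convex one-dimensional function (all values are hypothetical), run several times from different random starting points, keeping the best result found.

```python
import numpy as np

def f(x):            # non-convex objective with several local minima (assumed example)
    return np.sin(3 * x) + 0.1 * x ** 2

def grad_f(x):       # its derivative
    return 3 * np.cos(3 * x) + 0.2 * x

rng = np.random.default_rng(0)
best_x, best_val = None, np.inf

for restart in range(5):             # 5 runs with different random starting points
    x = rng.uniform(-3, 3)           # random initialization
    for _ in range(200):             # plain gradient descent
        x -= 0.01 * grad_f(x)
    if f(x) < best_val:              # keep the best local optimum seen so far
        best_x, best_val = x, f(x)

print(best_x, best_val)
```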
Around 2015, thanks to advances in deep learning research, He et al. initialization was introduced in place of plain random initialization:
w = np.random.randn(layer_size[l], layer_size[l-1]) * np.sqrt(2 / layer_size[l-1])
The weights are still random, but their scale depends on the size of the previous layer of neurons.
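For concreteness, a minimal sketch of applying He initialization to every layer of a network; the `layer_size` list and the layer sizes below are assumed, not taken from the answer.

```python
import numpy as np

layer_size = [784, 256, 128, 10]     # example architecture (assumed)
rng = np.random.default_rng(42)

weights = {}
for l in range(1, len(layer_size)):
    fan_in = layer_size[l - 1]
    # He initialization: Gaussian noise scaled by sqrt(2 / fan_in)
    weights[l] = rng.standard_normal((layer_size[l], fan_in)) * np.sqrt(2 / fan_in)

for l, w in weights.items():
    print(l, w.shape, float(w.std()))   # larger previous layer -> smaller spread
```

The factor sqrt(2 / fan_in) keeps the variance of the pre-activations roughly constant from layer to layer when ReLU activations are used, which is what He et al. proposed.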
So non-zero random weights help us:
- escape local optima
- break symmetry
- reach the global optimum in later iterations
Recommendations:
machinelearningmastery
towardsdatascience
Ravindra babu Mar 27 '19 at 18:54