Firstly, there are many types of ANNs; I assume you are talking about the simplest one - a multilayer perceptron trained with backpropagation.
Secondly, in your question you mix data scaling (normalization) and weight initialization.
You need to initialize the weights randomly in order to break symmetry during training (if all weights start out identical, their updates will also be identical, so the units never learn different features). In general, the specific values do not matter much, but values that are too large can slow convergence, for example by pushing sigmoid or tanh units into their saturated regions.
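A minimal NumPy sketch of the symmetry problem (the dimensions and the ±0.1 range are illustrative choices, not prescribed values):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 4, 3

# Symmetric init: every hidden unit gets the same weights, so every
# unit computes the same output, receives the same gradient, and the
# units can never differentiate from each other during training.
W_symmetric = np.zeros((n_in, n_hidden))

# Small random init breaks the symmetry; keeping values small avoids
# saturating sigmoid/tanh units, which would slow down learning.
W_random = rng.uniform(-0.1, 0.1, size=(n_in, n_hidden))

# All columns (hidden units) identical in the symmetric case:
print(np.allclose(W_symmetric, W_symmetric[:, :1]))  # True
# Columns differ after random init:
print(np.allclose(W_random, W_random[:, :1]))        # False
```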
You are not required to normalize your data, but normalization can speed up the learning process. See this question for more details.
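As an illustration, here is one common normalization scheme (z-score standardization; the sample data is made up). Rescaling features to comparable ranges keeps one large-valued feature from dominating the gradient updates:

```python
import numpy as np

# Toy data: two features on very different scales.
X = np.array([[150.0, 0.2],
              [160.0, 0.9],
              [170.0, 0.4]])

# Standardize each feature to zero mean and unit variance.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_scaled = (X - mean) / std

print(X_scaled.mean(axis=0))  # ~[0, 0]
print(X_scaled.std(axis=0))   # ~[1, 1]
```

At prediction time, the same `mean` and `std` computed on the training data should be reused for new inputs.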
ffriend