Is it possible to use a batch normalization layer immediately after the input layer instead of normalizing my data? Can I expect a similar effect/performance?
In Keras, with the functional API, it would be something like this:
```
x = Input(...)
x = BatchNormalization(...)(x)
...
```
You can do this. But the nice thing about batchnorm, in addition to stabilizing the activation distributions, is that the mean and standard deviation are likely to migrate as the network learns.
Effectively, setting the batchnorm right after the input layer is a fancy data pre-processing step. It helps, sometimes a lot (e.g., in linear regression). But it's easier and more efficient to compute the mean and variance of the whole training sample once than to learn it per batch. Note that batchnorm isn't free in terms of performance, and you shouldn't abuse it.
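To make the comparison concrete, here is a minimal sketch of both options in the Keras functional API. The 20-feature input shape, the layer sizes, and the random data are assumptions for illustration; `layers.Normalization` (available in recent TensorFlow versions) is the pre-computed alternative: its `adapt` call computes the training-set statistics once, up front.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical data: 1000 samples, 20 features (shapes assumed for illustration).
x_train = np.random.rand(1000, 20).astype("float32")
y_train = np.random.randint(0, 2, size=(1000, 1))

# Option 1: learn the input statistics per batch with BatchNormalization.
inputs = keras.Input(shape=(20,))
x = layers.BatchNormalization()(inputs)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model_bn = keras.Model(inputs, outputs)

# Option 2: compute the mean/variance of the whole training set once.
norm = layers.Normalization()
norm.adapt(x_train)  # fixes the statistics; nothing is re-estimated per batch

inputs = keras.Input(shape=(20,))
x = norm(inputs)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model_norm = keras.Model(inputs, outputs)
```

With `model_norm`, the input scaling uses the statistics of the whole training set rather than noisy per-batch estimates, and costs nothing extra during training, which is the trade-off the answer describes.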