I'm not sure whether this will solve your problem, but the BatchNorm documentation is not very informative / easy to use, so here is a brief description of how to use simple batch normalization:

First, you define your BatchNorm layer. If you want to use it after an affine / fully connected layer, it looks like this (just an example; the ordering can be different, depending on what you want):
...
inputs = tf.matmul(inputs, W) + b
inputs = tf.layers.batch_normalization(inputs, training=is_training)
inputs = tf.nn.relu(inputs)
...
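For context, is_training is usually a boolean placeholder (or a Python bool fixed at graph-construction time), so the same graph can switch between training and inference behaviour. Here is a minimal sketch of how the full layer definition might look; the shapes and variable names are my own assumptions, not from the original question:

import tensorflow as tf

# Assumed shapes: 784-dimensional inputs, 256 hidden units (hypothetical example).
x = tf.placeholder(tf.float32, [None, 784], name='x')
is_training = tf.placeholder(tf.bool, name='is_training')

W = tf.get_variable('W', shape=[784, 256])
b = tf.get_variable('b', shape=[256], initializer=tf.zeros_initializer())

# Affine layer -> batch normalization -> non-linearity, as in the snippet above.
inputs = tf.matmul(x, W) + b
inputs = tf.layers.batch_normalization(inputs, training=is_training)
inputs = tf.nn.relu(inputs)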
tf.layers.batch_normalization creates update operations for its internal variables (the moving mean and moving variance). These updates are not run automatically; they are placed in the tf.GraphKeys.UPDATE_OPS collection, so you have to make your training op depend on them. That means you should set up the optimizer as follows (after all layers have been defined!):
...
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    trainer = tf.train.AdamOptimizer()
    updateModel = trainer.minimize(loss, global_step=global_step)
...
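At run time, the important part is what you feed for is_training: True while training (so BatchNorm uses batch statistics and refreshes its moving averages through the UPDATE_OPS dependency) and False at test time (so the stored moving mean / variance are used). A rough usage sketch, continuing the snippets above; y, outputs, x_batch, y_batch and x_test are assumed names that depend on your model:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Training step: batch statistics are used, and the moving averages
    # are updated because updateModel depends on UPDATE_OPS.
    _, batch_loss = sess.run([updateModel, loss],
                             feed_dict={x: x_batch, y: y_batch, is_training: True})

    # Inference: the learned moving mean / variance are used instead.
    preds = sess.run(outputs, feed_dict={x: x_test, is_training: False})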
You can read more about this here. I know this answer comes a bit late, but it might help other people who run into BatchNorm problems in TensorFlow! :)