@Alef7, Xavier/Glorot initialization depends on the number of incoming connections (fan_in), the number of outgoing connections (fan_out), and the neuron's activation function (sigmoid or tanh). See this: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
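Concretely, for tanh units the paper's "normalized initialization" draws each weight uniformly from a symmetric interval whose width depends only on fan_in and fan_out (the 4x factor in the code below is the common adjustment for sigmoid units):

W \sim \mathcal{U}\!\left[-\sqrt{\tfrac{6}{\text{fan\_in} + \text{fan\_out}}},\ +\sqrt{\tfrac{6}{\text{fan\_in} + \text{fan\_out}}}\right]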
So now, to your question. Here's how I would do it in TensorFlow:
import numpy as np
import tensorflow as tf

def xavier_init(shape, fan_in, fan_out):
    # use 4 for sigmoid, 1 for tanh activation
    low = -4 * np.sqrt(6.0 / (fan_in + fan_out))
    high = 4 * np.sqrt(6.0 / (fan_in + fan_out))
    return tf.Variable(tf.random_uniform(shape, minval=low, maxval=high, dtype=tf.float32))
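For example, assuming the xavier_init helper above, a weight matrix for a hypothetical fully connected layer with 784 inputs and 256 outputs could be created like this:

# hypothetical layer sizes; for a 2-D weight matrix, fan_in/fan_out match the shape
W = xavier_init(shape=[784, 256], fan_in=784, fan_out=256)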
Note that we should sample from a uniform distribution, not the normal distribution suggested in another answer.
By the way, I wrote a post yesterday about something else using TensorFlow, which also uses Xavier initialization. If you're interested, there is also a Python notebook with a walk-through example: https://github.com/delip/blog-stuff/blob/master/tensorflow_ufp.ipynb
Delip Nov 14 '15 at 17:37