I'm not sure what exactly tf.nn.separable_conv2d does. It seems that pointwise_filter is a scaling factor for the different features when generating one pixel of the next layer, but I'm not sure my interpretation is correct. Is there a reference for this method, and what is its benefit?
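For reference, here is how I read the documented computation, written as a minimal NumPy sketch rather than the library's actual implementation. It assumes stride 1 and VALID padding, and the function name separable_conv2d_reference is made up for illustration. The filter layouts follow the documented shapes: depthwise_filter is [filter_height, filter_width, in_channels, channel_multiplier] and pointwise_filter is [1, 1, channel_multiplier * in_channels, out_channels].

```python
import numpy as np


def separable_conv2d_reference(x, depthwise_filter, pointwise_filter):
    """Stride-1, VALID-padding sketch of a depthwise separable convolution.

    x:                [batch, height, width, in_channels]
    depthwise_filter: [fh, fw, in_channels, channel_multiplier]
    pointwise_filter: [1, 1, in_channels * channel_multiplier, out_channels]
    """
    batch, height, width, in_ch = x.shape
    fh, fw, _, mult = depthwise_filter.shape
    out_h, out_w = height - fh + 1, width - fw + 1

    # Stage 1 (depthwise): each input channel is filtered on its own,
    # giving `mult` maps per channel. Channels are not mixed here.
    depthwise_out = np.zeros((batch, out_h, out_w, in_ch, mult))
    for di in range(fh):
        for dj in range(fw):
            patch = x[:, di:di + out_h, dj:dj + out_w, :]  # [b, oh, ow, in_ch]
            depthwise_out += patch[..., None] * depthwise_filter[di, dj]

    # Flatten (in_ch, mult) so that channel index = q * mult + r.
    depthwise_out = depthwise_out.reshape(batch, out_h, out_w, in_ch * mult)

    # Stage 2 (pointwise): a 1x1 convolution mixes channels at each pixel.
    return depthwise_out @ pointwise_filter[0, 0]
```

In this reading, pointwise_filter is not a per-pixel scaling factor but a learned weighted sum across the channels produced by the depthwise stage.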
tf.nn.separable_conv2d produces an output of the same shape as tf.nn.conv2d, so I assumed I could use it as a drop-in replacement. But the results with tf.nn.separable_conv2d are very poor: the network stops learning very early, and on MNIST the accuracy stays at about 10%, which is no better than random guessing.
I thought that if I set every value of pointwise_filter to 1.0 and made it non-trainable, I would get the same result as tf.nn.conv2d. But the accuracy is still about 10%.
With tf.nn.conv2d and the same hyperparameters, the accuracy reaches 99%. Why?
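One thing that can be checked by hand is what a pointwise filter fixed to 1.0 implies. The sketch below is only an illustration with made-up sizes (3 input channels, 4 output channels, a 3x3 kernel), not my actual network. It builds the full kernel that a regular convolution would need in order to reproduce the separable one, and compares that kernel across output channels.

```python
import numpy as np

rng = np.random.default_rng(0)
in_ch, mult, out_ch = 3, 1, 4  # hypothetical sizes for illustration

depthwise_filter = rng.normal(size=(3, 3, in_ch, mult))
pointwise_filter = np.ones((1, 1, in_ch * mult, out_ch))  # fixed to 1.0

# Equivalent full kernel:
# W[di, dj, q, k] = sum_r depthwise[di, dj, q, r] * pointwise[q * mult + r, k]
pw = pointwise_filter[0, 0].reshape(in_ch, mult, out_ch)
effective_kernel = np.einsum('ijqr,qrk->ijqk', depthwise_filter, pw)

# With an all-ones pointwise filter, every output channel gets the same kernel.
identical = all(
    np.allclose(effective_kernel[..., 0], effective_kernel[..., k])
    for k in range(out_ch)
)
print(identical)  # True
```

If this reasoning holds, all output channels of such a layer carry the same values, so it cannot behave like a general tf.nn.conv2d layer whose output channels each have their own kernel.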
In addition, it requires channel_multiplier * in_channels < out_channels. Why? What is the role of channel_multiplier here?
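As far as I can tell from the documented filter shapes, channel_multiplier is the number of depthwise maps produced per input channel, so it sets how many channels the pointwise stage gets to combine. A rough way to see its effect is to count weights. The helper names and layer sizes below (5x5 kernel, 32 input channels, 64 output channels) are hypothetical and chosen only for illustration.

```python
def conv2d_params(fh, fw, in_channels, out_channels):
    """Weights in a regular convolution kernel [fh, fw, in, out]."""
    return fh * fw * in_channels * out_channels


def separable_conv2d_params(fh, fw, in_channels, out_channels, channel_multiplier):
    """Weights in the depthwise filter plus the 1x1 pointwise filter."""
    depthwise = fh * fw * in_channels * channel_multiplier
    pointwise = in_channels * channel_multiplier * out_channels
    return depthwise + pointwise


print(conv2d_params(5, 5, 32, 64))               # 51200
print(separable_conv2d_params(5, 5, 32, 64, 1))  # 2848
print(separable_conv2d_params(5, 5, 32, 64, 2))  # 5696
```

Under these assumptions the separable layer has far fewer weights than the regular one, and raising channel_multiplier increases its capacity.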
Thanks.
Edit:
I previously used 1.0 for channel_multiplier. Maybe that was a bad choice: after I changed it to 2.0, the accuracy became much better. But what is the role of channel_multiplier, and why is 1.0 not a good value?