I have a problem I have been struggling with, related to tf.matmul() and its lack of broadcasting.
I am aware of a similar issue at https://github.com/tensorflow/tensorflow/issues/216 , but tf.batch_matmul() does not look like a solution for my case.
I need to encode my input as a 4D tensor: X = tf.placeholder(tf.float32, shape=(None, None, None, 100)) The first dimension is the batch size, the second is the number of examples in the batch. Each example can be represented as a composition of several objects (third dimension). Finally, each object is described by a vector of 100 float values.
Note that I used None for the second and third dimensions because the actual sizes may vary in each batch. However, for simplicity, let's assume a tensor with fixed sizes: X = tf.placeholder(tf.float32, shape=(5, 10, 4, 100))
These are the steps of my calculation:
compute a function of each 100-float vector (for example, a linear function): W = tf.Variable(tf.truncated_normal([100, 50], stddev=0.1)) Y = tf.matmul(X, W) This fails because tf.matmul() does not broadcast, so it cannot be used here. Expected shape of Y: (5, 10, 4, 50)
apply average pooling over the objects of each example: Y_avg = tf.reduce_mean(Y, 2) Expected shape of Y_avg: (5, 10, 50)
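To make the intended semantics of the two steps concrete, here is a sketch of the same computation in NumPy on random data with the fixed sizes above (the einsum subscripts are my own illustration, not part of the original question):

```python
import numpy as np

# Toy data matching the fixed-size example: batch of 5, 10 examples
# per batch, 4 objects per example, 100 floats per object.
X = np.random.rand(5, 10, 4, 100).astype(np.float32)
W = np.random.rand(100, 50).astype(np.float32)

# Step 1: apply the linear map to every 100-float vector.
# einsum contracts the last axis of X with the first axis of W.
Y = np.einsum('beok,kl->beol', X, W)
assert Y.shape == (5, 10, 4, 50)

# Step 2: average-pool over the objects of each example (axis 2).
Y_avg = Y.mean(axis=2)
assert Y_avg.shape == (5, 10, 50)
```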
I expected tf.matmul() to support broadcasting. Then I found tf.batch_matmul() , but it still doesn't seem to apply to my case (for example, W must have at least 3 dimensions, and it is not clear why).
By the way, above I used a simple linear function (whose weights are stored in W). But in my model I have a deep network. So the more general problem I have is automatically applying a function to each slice of the tensor. This is why I expected tf.matmul() to have a broadcast mode (if so, tf.batch_matmul() might not even be needed).
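One common workaround I am considering (sketched here in NumPy, not taken from the original question) is to collapse all leading axes into one, do an ordinary 2D matmul, and then restore the leading axes; this generalizes to any per-vector function, not just a linear one:

```python
import numpy as np

X = np.random.rand(5, 10, 4, 100).astype(np.float32)
W = np.random.rand(100, 50).astype(np.float32)

# Collapse all leading axes into a single axis, apply an ordinary
# 2-D matrix multiply, then restore the leading axes.
X2 = X.reshape(-1, 100)            # (5*10*4, 100)
Y2 = X2 @ W                        # (5*10*4, 50)
Y = Y2.reshape(5, 10, 4, 50)

# Same result as contracting the last axis of X with W directly.
ref = np.einsum('beok,kl->beol', X, W)
assert np.allclose(Y, ref, atol=1e-5)
```

The same reshape/compute/reshape pattern works with tf.reshape() and tf.matmul(), since the per-vector function only ever sees the last axis.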
Looking forward to learning from you! Alessio