Is there a way to efficiently use the bools in cond to decide whether to return a or b?
Yes, you could do:
cond * a + (1-cond) * b
cond will be broadcast to shape (N, M).
This should be close to the theoretical limit, which is the memory bandwidth: the ideal operation would read about N*M elements and write N*M.
Instead, we read 2*N*M elements, but we remove the conditional logic.
(I don't have Theano in front of me, so I'm not sure whether it is faster than T.switch, but it should be about as good. Also, I would try casting cond to the same dtype as a and b.)
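To make the arithmetic form concrete, here is a minimal sketch in plain NumPy rather than Theano (an assumption, so that it runs without Theano installed). It also assumes cond is a per-row mask of length N, so it needs a trailing axis before it broadcasts to (N, M). The helper name blend is hypothetical.

```python
import numpy as np

def blend(cond, a, b):
    """Select a where cond is true and b elsewhere, using arithmetic only."""
    # Cast the mask to the dtype of a and add a trailing axis so that a
    # per-row mask of shape (N,) broadcasts against arrays of shape (N, M).
    mask = np.asarray(cond).astype(a.dtype)[:, None]
    return mask * a + (1 - mask) * b

rng = np.random.default_rng(0)
N, M = 4, 3
a = rng.uniform(size=(N, M)).astype(np.float32)
b = rng.uniform(size=(N, M)).astype(np.float32)
cond = np.array([True, False, False, True])  # assumed per-row mask

out = blend(cond, a, b)
# Reference result using explicit conditional selection.
reference = np.where(cond[:, None], a, b)
```

For finite inputs the two results agree. Note that the arithmetic form propagates NaN or inf from the branch that is not selected, because 0 * inf is NaN, whereas a true conditional select does not.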
If you want to update a in place, you can do this using T.set_subtensor:
import numpy as np
import theano
import theano.tensor as T

a = np.random.uniform(size=(N, M)).astype(np.float32)
b = np.random.uniform(size=(N, M)).astype(np.float32)
a = theano.shared(a)
b = theano.shared(b)
c = T.vector()  # mostly 0, presumably (1-cond)
nz = T.nonzero(c)  # indices of the rows to overwrite
s = T.set_subtensor(a[nz], b[nz])  # a with those rows taken from b
fn = theano.function([c], [], updates=[(a, s)])
...
fn(1-cond)
It may or may not be faster than the first approach, depending on N, M and other factors.
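The same indexing idea can be sketched in plain NumPy (again an assumption standing in for Theano, with a hypothetical helper name update_rows_in_place). Only the rows flagged by nonzero entries of c are touched, which is why this can win when c is mostly zero.

```python
import numpy as np

def update_rows_in_place(a, b, c):
    """Overwrite the rows of a flagged by nonzero entries of c with rows of b."""
    nz = np.nonzero(c)  # tuple holding one array of row indices
    a[nz] = b[nz]       # only the flagged rows are read and written
    return a

rng = np.random.default_rng(0)
N, M = 4, 3
a = rng.uniform(size=(N, M)).astype(np.float32)
b = rng.uniform(size=(N, M)).astype(np.float32)
cond = np.array([True, True, False, True])  # assumed per-row mask

a_before = a.copy()
# Pass 1 - cond, so rows where cond is false are replaced by rows of b.
update_rows_in_place(a, b, 1 - cond.astype(np.float32))
```

After the call, a holds its original rows where cond is true and the rows of b elsewhere.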
Maxb