tf.gradients has a grad_ys parameter that can be used for this purpose. Suppose your network has only one ReLU layer, as follows:
before_relu = f1(inputs, params)
after_relu = tf.nn.relu(before_relu)
loss = f2(after_relu, params, targets)
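For concreteness, here is one way these placeholders could be instantiated; the linear layer, squared loss, and shapes below are illustrative assumptions, not part of the original setup:

import tensorflow as tf

# Illustrative stand-ins for f1 and f2: a single linear layer and a squared loss.
inputs = tf.placeholder(tf.float32, [None, 4])
targets = tf.placeholder(tf.float32, [None, 1])
params = tf.Variable(tf.random_normal([4, 1]))

before_relu = tf.matmul(inputs, params)                  # f1(inputs, params)
after_relu = tf.nn.relu(before_relu)
loss = tf.reduce_mean(tf.square(after_relu - targets))   # f2(after_relu, params, targets)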
First, compute the gradient of the loss with respect to after_relu:
Dafter_relu = tf.gradients(loss, after_relu)[0]
Then threshold the gradient you send backward, zeroing out its negative entries (tf.where is the current name of the older tf.select):
Dafter_relu_thresholded = tf.where(Dafter_relu < 0.0, tf.zeros_like(Dafter_relu), Dafter_relu)
Finally, compute the actual gradients with respect to params, feeding the thresholded gradient in through grad_ys:
Dparams = tf.gradients(after_relu, params, grad_ys=Dafter_relu_thresholded)
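If you want to go further and actually apply these gradients, one possibility (an illustration, not part of the original answer; it assumes params is the single variable from the sketch above) is to hand them to an optimizer:

# tf.gradients returns a list, so pair each gradient with its variable.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train_op = optimizer.apply_gradients([(Dparams[0], params)])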
You can easily extend the same method to a network with many ReLU layers.
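A minimal sketch of that extension, assuming a hypothetical two-ReLU-layer network (the names h1, h2, w1, w2 are illustrative, and inputs/targets are reused from the sketch above): backpropagate one ReLU at a time, thresholding the incoming gradient before passing it on via grad_ys.

# Hypothetical two-ReLU-layer network; names and shapes are illustrative.
w1 = tf.Variable(tf.random_normal([4, 8]))
w2 = tf.Variable(tf.random_normal([8, 1]))
h1 = tf.nn.relu(tf.matmul(inputs, w1))
h2 = tf.nn.relu(tf.matmul(h1, w2))
loss2 = tf.reduce_mean(tf.square(h2 - targets))

def threshold(g):
    # Zero out negative incoming gradients, as in the single-layer case.
    return tf.where(g < 0.0, tf.zeros_like(g), g)

# Backpropagate one ReLU at a time, thresholding at each step.
d_h2 = threshold(tf.gradients(loss2, h2)[0])
d_h1 = threshold(tf.gradients(h2, h1, grad_ys=d_h2)[0])

# Parameter gradients computed from the thresholded signals.
d_w2 = tf.gradients(h2, w2, grad_ys=d_h2)[0]
d_w1 = tf.gradients(h1, w1, grad_ys=d_h1)[0]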