Guided backpropagation in TensorFlow

I would like to implement in TensorFlow the technique of “Guided Back Propagation” introduced in this paper, which is described further here.

Computationally, this means that when I compute a gradient, e.g. of the input w.r.t. the output of the NN, I will have to modify the gradients computed at every ReLU unit. Concretely, the back-propagated signal at those units must be thresholded at zero for this technique to work. In other words, the partial derivatives corresponding to negative ReLU values should be discarded.
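
To make the rule concrete, here is a rough numpy sketch of what I mean at a single ReLU unit (the function and argument names are just illustrative):

 import numpy as np

 def guided_relu_backward(upstream_grad, relu_output):
     # Keep the gradient only where the ReLU fired (forward output > 0)
     # AND where the incoming gradient itself is positive.
     mask = (relu_output > 0) & (upstream_grad > 0)
     return upstream_grad * mask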

Given that I am interested in applying these gradient computations only at test time, i.e. I do not want to update the model's parameters, how can I do this?

I have tried (unsuccessfully) two things so far:

  • Use tf.py_func to wrap my simple numpy version of the ReLU, whose gradient operation I could then override via the g.gradient_override_map context manager.

  • Gather the forward/backward values during backpropagation and apply the thresholding to those coming from the ReLUs.

I could not complete either approach because each requires some knowledge of TensorFlow internals that I currently do not have.

Can anyone suggest another route, or sketch out the code?

Many thanks.

2 answers

tf.gradients has a grad_ys parameter that can be used for this purpose. Suppose your network has only one relu layer as follows:

 before_relu = f1(inputs, params)
 after_relu = tf.nn.relu(before_relu)
 loss = f2(after_relu, params, targets)

First, compute the derivative of the loss up to after_relu.

 Dafter_relu = tf.gradients(loss, after_relu)[0] 

Then threshold the gradients you send backward.

 Dafter_relu_thresholded = tf.select(Dafter_relu < 0.0, tf.zeros_like(Dafter_relu), Dafter_relu) 

Finally, compute the actual gradients w.r.t. params.

 Dparams = tf.gradients(after_relu, params, grad_ys=Dafter_relu_thresholded) 

You can easily extend the same method to a network with many relu layers; one possible sketch follows.
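
For example, here is a minimal sketch of that extension, assuming relu_outputs is a Python list holding the output tensor of every relu layer (ordered from input to output) and inputs is the network input; these names are placeholders:

 grad = tf.gradients(loss, relu_outputs[-1])[0]
 grad = tf.select(grad < 0.0, tf.zeros_like(grad), grad)  # tf.where in TF >= 1.0

 for i in range(len(relu_outputs) - 1, 0, -1):
     # Backpropagate from one relu output to the previous one, seeded with the
     # already-thresholded gradient via grad_ys, then threshold again.
     grad = tf.gradients(relu_outputs[i], relu_outputs[i - 1], grad_ys=grad)[0]
     grad = tf.select(grad < 0.0, tf.zeros_like(grad), grad)

 # Finally, the guided gradients w.r.t. the network input.
 guided_grad = tf.gradients(relu_outputs[0], inputs, grad_ys=grad)[0]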

A better solution (your approach 1) is to use ops.RegisterGradient together with tf.Graph.gradient_override_map. Combined, they override the gradient computation of a pre-defined Op, e.g. Relu, within the gradient_override_map context, using only Python code.

 from tensorflow.python.framework import ops
 from tensorflow.python.ops import gen_nn_ops

 @ops.RegisterGradient("GuidedRelu")
 def _GuidedReluGrad(op, grad):
     # standard relu gradient, additionally zeroed where the incoming gradient is negative
     return tf.where(0. < grad,
                     gen_nn_ops._relu_grad(grad, op.outputs[0]),
                     tf.zeros(grad.get_shape()))

 ...

 with g.gradient_override_map({'Relu': 'GuidedRelu'}):
     y = tf.nn.relu(x)

Here is a complete example of a guided relu implementation: https://gist.github.com/falcondai/561d5eec7fed9ebf48751d124a77b087
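
For instance, here is a minimal usage sketch, assuming the GuidedRelu gradient above has already been registered; the input shape, variables, and score definition are hypothetical placeholders:

 import tensorflow as tf

 g = tf.get_default_graph()
 x = tf.placeholder(tf.float32, [None, 784])                   # hypothetical input
 W = tf.Variable(tf.truncated_normal([784, 10], stddev=0.1))
 b = tf.Variable(tf.zeros([10]))

 # Every Relu op created inside this context uses the GuidedRelu gradient.
 with g.gradient_override_map({'Relu': 'GuidedRelu'}):
     logits = tf.nn.relu(tf.matmul(x, W) + b)

 score = tf.reduce_max(logits, axis=1)         # score of the top class
 guided_saliency = tf.gradients(score, x)[0]   # guided gradients w.r.t. the input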

Update: in TensorFlow >= 1.0, tf.select was renamed to tf.where. I have updated the snippet above accordingly. (Thanks @sbond for bringing this to my attention :)
