Is there a way to clip intermediate exploding gradients in TensorFlow?

Problem: a very long RNN

N1 -- N2 -- ... --- N100 

With an optimizer such as AdamOptimizer, compute_gradients() will return gradients for all trainable variables.

However, the gradients can explode at some step in the chain.

A method like the one in how-to-effectively-apply-gradient-clipping-in-tensor-flow can fix a large final gradient.
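For reference, a minimal sketch of that final-gradient clipping, assuming the TF 1.x graph API; `loss` is assumed to be already defined, and the cap of 5.0 is arbitrary:

```python
import tensorflow as tf

# assumes `loss` is already defined on the graph
optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
grads_and_vars = optimizer.compute_gradients(loss)
grads, variables = zip(*grads_and_vars)
# clip only the final gradients, after the full backward pass has run
clipped, _ = tf.clip_by_global_norm(grads, 5.0)  # 5.0 is an arbitrary cap
train_op = optimizer.apply_gradients(zip(clipped, variables))
```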

But how do you clip the intermediate gradients?

One way could be to run backprop manually: compute the gradient from N100 to N99, clip it, then from N99 to N98, and so on, but that's too complicated.

So my question is: is there an easier way to clip intermediate gradients? (Of course, strictly speaking, they are then no longer mathematical gradients.)

1 answer

You can use the custom_gradient decorator to make a version of tf.identity that clips its gradient in the backward pass.

```python
import tensorflow as tf
from tensorflow.contrib.eager.python import tfe

@tfe.custom_gradient
def gradient_clipping_identity(tensor, max_norm):
    # forward pass: a plain identity
    result = tf.identity(tensor)

    # backward pass: clip the incoming gradient;
    # None is the gradient with respect to max_norm
    def grad(dresult):
        return tf.clip_by_norm(dresult, max_norm), None

    return result, grad
```

Then use gradient_clipping_identity as you would normally use tf.identity, and your gradients will be clipped in the backward pass.
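For illustration, a minimal sketch of wiring this into a manually unrolled RNN; the cell size, `inputs`, `batch_size`, `num_steps`, and the norm cap of 1.0 are all hypothetical:

```python
import tensorflow as tf

# hypothetical unrolled loop; `inputs` has shape [batch_size, num_steps, depth]
cell = tf.nn.rnn_cell.BasicLSTMCell(128)
state = cell.zero_state(batch_size, tf.float32)
outputs = []
for t in range(num_steps):
    output, state = cell(inputs[:, t, :], state)
    # clip the gradient flowing back through this step's output;
    # the recurrent state tensors could be wrapped the same way
    output = gradient_clipping_identity(output, 1.0)
    outputs.append(output)
```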
