Python theano.scan taps argument

I am desperately trying to understand the taps argument of the theano.scan function. Unfortunately, I am not able to ask a more specific question.

I just do not understand the mechanism of "taps". OK, fine: I know in what order the sequences are passed to the function, but I don't know what they mean. For example (I took this code from another question, "Python - Theano scan() function"):

    import numpy as np
    import theano
    import theano.tensor as T

    def addf(a1, a2):
        print(a1)
        print(a2)
        return a1 + a2

    i = T.iscalar('i')
    x0 = T.ivector('x0')
    step = T.iscalar('step')

    results, updates = theano.scan(fn=addf,
                                   outputs_info=[dict(initial=x0, taps=[-3])],
                                   non_sequences=step,
                                   n_steps=i)

    f = theano.function([x0, step, i], results)

    input = [2, 3]
    print(f(input, 2, 20))

Setting taps to -1 makes sense to me. As far as I understand, this is the same as not setting taps at all: the whole vector x0 is passed to the addf function. x0 is then added to the parameter step (the int 2, which is broadcast to the same size). In the next iteration, the result [4, 5] is the input, and so on, which gives the following output:

    [[ 4  5]
     [ 6  7]
     [ 8  9]
     [10 11]
     [12 13]
     [14 15]
     [16 17]
     [18 19]
     [20 21]
     [22 23]
     [24 25]
     [26 27]
     [28 29]
     [30 31]
     [32 33]
     [34 35]
     [36 37]
     [38 39]
     [40 41]
     [42 43]]

Setting taps to -3, however, gives the following result:

 [ 5 2 6 7 4 8 9 6 10 11 8 12 13 10 14 15 12 16 17] 

I have no explanation for how the scan function creates this output. Why is it just a flat list now? print(a1) gives, as expected:

 x0[t-3] 

Although I know that this is the value a1 should have, I don't know how to interpret it. What is the (t-3)rd value of x0? The Theano documentation doesn't seem to describe the taps argument in much detail ... so hopefully one of you can.

thanks

1 answer

To better understand the use of taps, you first need to understand how scan uses the outputs_info argument in general, and how the values provided for it (initial, to be precise) change the nature of the result.

scan expects you to tell it what kind of output to expect from the operation (unless, of course, you have no initial values and just specify None, in which case it starts directly with the first round (step) and the output is not passed back as a parameter to fn in subsequent rounds).

scan is used to iterate over the provided sequences. This means that at step n (and with no taps specified for sequences or outputs_info), fn is applied to the nth element of each of the sequences, together with the output(s) generated by the previous ((n-1)th) step. Hence the default value of taps is 0 for sequences and -1 for outputs_info.
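The default behaviour described above can be mimicked in plain Python. This is only a rough sketch to illustrate the idea, not Theano code: scan_default is a made-up helper name, and it handles just one sequence and one output.

```python
def scan_default(fn, sequence, initial):
    """Hypothetical pure-Python stand-in for scan with default taps."""
    outputs = []
    previous = initial
    for element in sequence:
        # Sequence tap 0: the current element.
        # Output tap -1: the output of the previous step.
        previous = fn(element, previous)
        outputs.append(previous)
    return outputs


# Running sum: each step adds the current element to the previous output.
running_sum = scan_default(lambda x_t, acc: x_t + acc, [1, 2, 3, 4], 0)
print(running_sum)  # [1, 3, 6, 10]
```

Note that the sequence element comes before the previous output in the call to fn, which is the same order in which scan passes its arguments.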

Another way to look at this is to think of all sequences as being made up of slices along their first dimension. At a given step, the current slice of the sequence(s) and the output slice of the previous step are passed to fn, and the computed output is appended to the results as a new slice, which is then used for the next step. Obviously, each of the output slices has the same shape. And if you provide an initial slice as part of outputs_info, it must have the same shape as the output produced by an application of fn. In your example, with outputs_info=[dict(initial=x0)], scan takes [2, 3] as the first slice and uses it in the first step as the argument a1 to addf.

But often, in signal processing (and elsewhere), you need more than just the most recent data point as causal information. (Here I am using time as a way of representing steps.) This is where taps is useful: it specifies exactly which data points from the sequences and from the outputs should be used for the current step. In your example, taps=[-3] means that at the current step the third-last output is passed to fn.

This is where you need to be careful when specifying initial for outputs_info, because scan first splits the initial value into slices along its first dimension. The first slice from this set is then treated as the earliest slice (the third-last in your example) needed to compute the output of the first step.

Suppose that in your example taps=[-2] and input = [2, 3]. In this case, scan splits the input into slices and uses the first slice (here the value 2) as the argument a1 to addf. The resulting value 4 is appended to the output, and for the next step the slices are [2, 3, 4], of which the value 3 sits at the second-last (-2) tap. And so on. With taps=[-3] and the same input, however, there is one value too few, which is like saying that you collected values at times (t-3) and (t-2) but did not collect the value at (t-1).
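To make the taps=[-2] walk-through concrete, here is a small pure-Python emulation. It is a sketch under my own assumptions rather than how Theano is implemented: scan_output_taps is a hypothetical helper, and the ValueError is a choice made for this sketch (the real scan does not necessarily raise; the question shows it returning a confusing result instead).

```python
def scan_output_taps(fn, initial, taps, n_steps):
    """Hypothetical emulation of scan with output taps on scalar slices."""
    depth = -min(taps)  # how far back the earliest tap reaches
    if len(initial) != depth:
        # Assumption of this sketch: refuse a history of the wrong length.
        raise ValueError("need exactly %d initial slices, got %d"
                         % (depth, len(initial)))
    history = list(initial)  # initial[0] is the earliest slice (t - depth)
    for _ in range(n_steps):
        # Negative taps index the history from its end.
        args = [history[tap] for tap in taps]
        history.append(fn(*args))
    return history[depth:]  # only the newly computed outputs


step = 2
outputs = scan_output_taps(lambda a1: a1 + step, [2, 3], taps=[-2], n_steps=4)
print(outputs)  # [4, 5, 6, 7]
```

The first step sees 2 (the value at t-2) and produces 4; the history is then [2, 3, 4], so the second step sees 3 and produces 5, exactly as described above. Calling the helper with taps=[-3] and the same two-element input raises the ValueError, because the slice for (t-1) is missing.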

So, if your output has a certain shape and you need output taps further back than -1, the initial value should be a list of elements of the desired output shape, with exactly as many elements as are needed to reach the earliest tap.

TL;DR: In your example, if you want a 2-element vector from each step and use taps=[-3], then input should be a list of 3 such 2-element vectors. If you want scalar results, then input should be a list of 3 integers. A list of 2 integers makes no sense in this context; it would be reasonable only if taps were -2, -1 or [-2, -1].
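The two valid setups for taps=[-3] can be illustrated with another plain-Python sketch. Again, this is not Theano itself: run_with_tap and add_step are hypothetical names, STEP plays the role of the non-sequence step, and the initial values are chosen only so that the three interleaved chains are easy to tell apart.

```python
STEP = 2  # stands in for the non-sequence argument "step"


def add_step(slice_):
    """Add STEP to a scalar slice or to every entry of a vector slice."""
    if isinstance(slice_, list):
        return [value + STEP for value in slice_]
    return slice_ + STEP


def run_with_tap(initial, tap, n_steps):
    """Hypothetical emulation of scan with one output tap."""
    if len(initial) != -tap:
        # Assumption of this sketch: the history must reach back to the tap.
        raise ValueError("need %d initial slices, got %d" % (-tap, len(initial)))
    history = list(initial)
    outputs = []
    for _ in range(n_steps):
        out = add_step(history[tap])  # e.g. tap=-3 reads the third-last slice
        history.append(out)
        outputs.append(out)
    return outputs


# Three 2-element vectors as initial history: each step returns a vector.
vector_outputs = run_with_tap([[2, 3], [10, 11], [20, 21]], tap=-3, n_steps=4)
print(vector_outputs)  # [[4, 5], [12, 13], [22, 23], [6, 7]]

# Three integers as initial history: each step returns a scalar.
scalar_outputs = run_with_tap([2, 10, 20], tap=-3, n_steps=4)
print(scalar_outputs)  # [4, 12, 22, 6]
```

In both runs the fourth output is computed from the first output, because by then the first output has become the third-last slice in the history.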
