How to load a matrix to modify the attention layer in the seqToseq demo? (PaddlePaddle)

I am trying to reproduce Section 3.1 of the paper on incorporating discrete translation lexicons into neural machine translation, using PaddlePaddle.

To do that, I need to create a static matrix and load it into the seqToseq demo, for example:

    >>> import numpy as np
    >>> x = np.random.rand(3, 2)
    >>> x
    array([[ 0.64077103,  0.03278357],
           [ 0.47133411,  0.16309775],
           [ 0.63986919,  0.07130613]])
    # where there are 3 target words and 2 source words,
    # and each cell in the matrix represents some co-occurrence probability.
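For completeness, my current plan for getting x into Paddle at all is to write it in what I understand (from skimming the Paddle source, so this may be wrong) to be Paddle's binary parameter format: a 16-byte header (int32 version, uint32 size of a float, uint64 number of values) followed by the raw float32 values. A sketch, assuming that format is correct:

    import struct
    import numpy as np

    def save_as_paddle_parameter(value, file_name):
        # Assumed format: 16-byte header
        # (int32 version=0, uint32 sizeof(float)=4, uint64 value count),
        # followed by the raw float32 data.
        with open(file_name, 'wb') as f:
            f.write(struct.pack('iIQ', 0, 4, value.size))
            value.astype(np.float32).tofile(f)

    save_as_paddle_parameter(x, '_static_matrix')

The file name '_static_matrix' is just a placeholder I chose; presumably it would have to match whatever parameter name the layer in the network ends up using.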

In the seqToseq_net demo, this matrix would need to be multiplied with the output of the attention layer in gru_decoder_with_attention. The original demo code:

    def gru_decoder_with_attention(enc_vec, enc_proj, current_word):
        decoder_mem = memory(name='gru_decoder',
                             size=decoder_size,
                             boot_layer=decoder_boot)
        # This attention context layer would have been
        # a vector of size |src_vocab| x 1
        context = simple_attention(encoded_sequence=enc_vec,
                                   encoded_proj=enc_proj,
                                   decoder_state=decoder_mem)
        with mixed_layer(size=decoder_size * 3) as decoder_inputs:
            decoder_inputs += full_matrix_projection(input=context)
            decoder_inputs += full_matrix_projection(input=current_word)
        gru_step = gru_step_layer(name='gru_decoder',
                                  input=decoder_inputs,
                                  output_mem=decoder_mem,
                                  size=decoder_size)
        with mixed_layer(size=target_dict_dim,
                         bias_attr=True,
                         act=SoftmaxActivation()) as out:
            out += full_matrix_projection(input=gru_step)
        return out

The goal is to influence the attention layer by multiplying its output with the static matrix:

    def gru_decoder_with_attention(enc_vec, enc_proj, current_word):
        decoder_mem = memory(name='gru_decoder',
                             size=decoder_size,
                             boot_layer=decoder_boot)
        # This attention context layer would have been
        # of size |src_vocab| x 1
        context = simple_attention(encoded_sequence=enc_vec,
                                   encoded_proj=enc_proj,
                                   decoder_state=decoder_mem)
        # This static matrix layer, x, would have been
        # of size |trg_vocab| x |src_vocab|
        static_matrix = some_sort_of_layer(x)
        # This should yield a vector of size
        # |trg_vocab| x 1
        static_matrix_multiply_context = some_sort_of_operation_layer(
            static_matrix, context)
        with mixed_layer(size=decoder_size * 3) as decoder_inputs:
            # the weighted context replaces the plain context here
            decoder_inputs += full_matrix_projection(
                input=static_matrix_multiply_context)
            decoder_inputs += full_matrix_projection(input=current_word)
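One idea I had (entirely untested, and the ParameterAttribute usage is my assumption rather than something I found documented): since full_matrix_projection already computes a matrix-vector product, perhaps some_sort_of_layer and some_sort_of_operation_layer collapse into a single mixed_layer whose weight matrix is declared static, so it is never updated during training:

    # Untested sketch: one projection whose weight matrix would hold
    # the static co-occurrence values. ParameterAttribute(is_static=True)
    # should, as far as I can tell, freeze the parameter during training;
    # the name is there so the value could be pre-loaded from a file.
    # Note the projection computes context * W with W of size
    # |src_vocab| x |trg_vocab|, i.e. the transpose of the x defined above.
    static_attr = ParameterAttribute(name='_static_matrix',
                                     is_static=True)
    with mixed_layer(size=target_dict_dim) as weighted_context:
        weighted_context += full_matrix_projection(input=context,
                                                   param_attr=static_attr)

But I cannot confirm that is_static behaves this way, or how to initialize the named parameter with my numpy values, which is the core of the question.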

I looked through the code in Paddle/python/trainer_config_helpers, went through all the demo code, and asked on the PaddlePaddle Gitter, but I cannot find how to load a custom static matrix that is not updated during training and have it interact with one of the Paddle layers.

How can I load a matrix to modify the attention layer in the seqToseq demo?

What should some_sort_of_layer and some_sort_of_operation_layer be in the above example?
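And if the sketches above are on the right track: is the intended mechanism to save the matrix in parameter-file format under the parameter's name (e.g. the '_static_matrix' file above) into a model directory and point the trainer at it with --init_model_path so the value is picked up at startup, or is there a proper API for this that I am missing?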
