How to change softmax output temperature in Keras

I am currently trying to reproduce the results of the following article.
http://karpathy.imtqy.com/2015/05/21/rnn-effectiveness/
I am using Keras with the Theano backend. In the article, he talks about controlling the temperature of the final softmax layer to obtain different outputs.

Temperature. We can also play with the temperature of the softmax during sampling. Decreasing the temperature from 1 to some lower number (e.g. 0.5) makes the RNN more confident, but also more conservative in its samples. Conversely, higher temperatures give more diversity, but at the cost of more mistakes (e.g. spelling mistakes, etc.). In particular, setting the temperature very close to zero will give the most likely thing that Paul Graham might say.
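To make the effect concrete (the numbers below are just my own illustration, not from the article), dividing the log-probabilities by a temperature and re-normalizing sharpens the distribution when T < 1 and flattens it when T > 1:

    import numpy as np

    def apply_temperature(probs, temperature):
        # Divide the log-probabilities by T, then re-normalize with a softmax.
        logits = np.log(probs) / temperature
        exp = np.exp(logits)
        return exp / exp.sum()

    p = np.array([0.5, 0.3, 0.2])        # made-up distribution
    print(apply_temperature(p, 0.5))     # sharper: ~[0.66, 0.24, 0.11]
    print(apply_temperature(p, 2.0))     # flatter: ~[0.42, 0.32, 0.26]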

My model is as follows.

    model = Sequential()
    model.add(LSTM(128, batch_input_shape = (batch_size, 1, 256), stateful = True, return_sequences = True))
    model.add(LSTM(128, stateful = True))
    model.add(Dropout(0.1))
    model.add(Dense(256, activation = 'softmax'))
    model.compile(optimizer = Adam(), loss = 'categorical_crossentropy', metrics = ['accuracy'])

The only way I can think of to adjust the temperature of the final Dense layer is to get its weight matrix and rescale it by the temperature. Does anyone know a better way to do this? Also, if you see anything wrong with how I have configured the model, please let me know, as I am new to RNNs.
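Concretely, what I have in mind is something like this (just an untested sketch; dividing both the kernel and the bias of the last layer by T should be the same as dividing the logits by T):

    # Untested idea: rescale the final Dense layer so its logits are divided by T.
    temperature = 0.5
    W, b = model.layers[-1].get_weights()
    model.layers[-1].set_weights([W / temperature, b / temperature])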

python theano neural-network keras softmax
2 answers

Well, it looks like the temperature is something you apply to the output of the softmax layer during sampling. I found this example.

https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py

It uses the following function to sample an index from the softmax output.

    import numpy as np

    def sample(a, temperature=1.0):
        # helper function to sample an index from a probability array
        a = np.log(a) / temperature
        a = np.exp(a) / np.sum(np.exp(a))
        return np.argmax(np.random.multinomial(1, a, 1))
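For example (just a sketch, assuming x is a (batch_size, 1, 256) one-hot input batch for the stateful model above), you would call it on the probability vector predicted for the next character:

    # Sketch: draw the next character index from the network's softmax output.
    preds = model.predict(x, batch_size=batch_size)[0]   # probability vector of length 256
    next_index = sample(preds, temperature=0.5)          # lower temperature -> more conservative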

The answer from @chasep255 works fine, but you will get warnings because of log(0). You can simplify the operation, e^(log(a)/T) = a^(1/T), and get rid of the log:

    def sample(a, temperature=1.0):
        # equivalent to exp(log(a) / temperature), but avoids log(0) warnings
        a = np.array(a) ** (1 / temperature)
        p_sum = a.sum()
        sample_temp = a / p_sum
        return np.argmax(np.random.multinomial(1, sample_temp, 1))
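As a quick check (my own toy numbers, not from either answer), the two formulations give the same distribution whenever there are no zero probabilities:

    # Both re-scalings agree for strictly positive probabilities.
    p = np.array([0.5, 0.3, 0.2])
    T = 0.5
    via_log = np.exp(np.log(p) / T) / np.exp(np.log(p) / T).sum()
    via_pow = p ** (1 / T) / (p ** (1 / T)).sum()
    print(np.allclose(via_log, via_pow))   # True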

Hope this helps!

