
The architecture of the CNN + LSTM model looks like this: you wrap the CNN layers in a TimeDistributed wrapper so they are applied to every frame, and then feed the CNN output into the LSTM layer.
cnn_input = Input(shape=(3, 200, 100, 1))  # frames, height, width, channels of image
conv1 = TimeDistributed(Conv2D(32, kernel_size=(50, 5), activation='relu'))(cnn_input)
conv2 = TimeDistributed(Conv2D(32, kernel_size=(20, 5), activation='relu'))(conv1)
pool1 = TimeDistributed(MaxPooling2D(pool_size=(4, 4)))(conv2)
flat = TimeDistributed(Flatten())(pool1)
cnn_op = TimeDistributed(Dense(100))(flat)
After that, you can pass the CNN output to the LSTM:
lstm = LSTM(128, return_sequences=True, activation='tanh')(cnn_op)
op = TimeDistributed(Dense(100))(lstm)
fun_model = Model(inputs=[cnn_input], outputs=op)
Remember that the input to the TimeDistributed CNN should have shape (# of frames, row_size, column_size, channels).
Finally, you can apply a softmax activation on the last layer to get class predictions.
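Putting the pieces together, here is a minimal runnable sketch of the whole model, with a softmax output layer added at the end; `num_classes = 10` is a hypothetical value, not from the original answer:

```python
from tensorflow.keras.layers import (Input, TimeDistributed, Conv2D,
                                     MaxPooling2D, Flatten, Dense, LSTM)
from tensorflow.keras.models import Model

num_classes = 10  # hypothetical number of output classes

# TimeDistributed applies each CNN layer to every frame independently
cnn_input = Input(shape=(3, 200, 100, 1))  # frames, height, width, channels
conv1 = TimeDistributed(Conv2D(32, kernel_size=(50, 5), activation='relu'))(cnn_input)
conv2 = TimeDistributed(Conv2D(32, kernel_size=(20, 5), activation='relu'))(conv1)
pool1 = TimeDistributed(MaxPooling2D(pool_size=(4, 4)))(conv2)
flat = TimeDistributed(Flatten())(pool1)
cnn_op = TimeDistributed(Dense(100))(flat)

# The LSTM consumes the per-frame feature vectors as a sequence
lstm = LSTM(128, return_sequences=True, activation='tanh')(cnn_op)

# Softmax over classes, predicted per frame
op = TimeDistributed(Dense(num_classes, activation='softmax'))(lstm)

model = Model(inputs=[cnn_input], outputs=op)
print(model.output_shape)  # (None, 3, 10): batch, frames, classes
```

If you want a single prediction for the whole clip instead of one per frame, set `return_sequences=False` on the LSTM and drop the final TimeDistributed wrapper.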