This question concerns the general problem of training a Keras model on several large files that are together too large to fit in GPU memory. I am using Keras 1.0.5, and I need a solution that does not require 1.0.6. One approach was described by fchollet here and here:
```python
import pickle

# Create a generator that yields (current features X, current labels y),
# one file at a time
def BatchGenerator(files):
    for file in files:
        with open(file, "rb") as f:
            current_data = pickle.load(f)
        X_train = current_data[:, :-1]
        y_train = current_data[:, -1]
        yield (X_train, y_train)
```
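For completeness, this is roughly the outer loop I am running (a sketch: `model` is assumed to be an already compiled Keras 1.x model, `files` is my list of pickle paths, and `batch_size=128` is just illustrative):

```python
n_epochs = 2
for epoch in range(n_epochs):
    print("~~~~~ Epoch %d ~~~~~" % epoch)
    # One fit call per file; nb_epoch=1 is what prints
    # the "Epoch 1/1" line once per file below
    for X_train, y_train in BatchGenerator(files):
        model.fit(X_train, y_train, nb_epoch=1, batch_size=128, verbose=1)
```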
However, I am afraid that the state of the model is not being preserved; rather, it looks as if the model is reinitialized not only between epochs but also between datasets. Notice that the loss for the first file is identical (15.7517) in both passes below, which suggests the weights may be getting reset. Each "Epoch 1/1" below represents training on a different file:
```
~~~~~ Epoch 0 ~~~~~
Epoch 1/1
295806/295806 [==============================] - 13s - loss: 15.7517
Epoch 1/1
407890/407890 [==============================] - 19s - loss: 15.8036
Epoch 1/1
383188/383188 [==============================] - 19s - loss: 15.8130
~~~~~ Epoch 1 ~~~~~
Epoch 1/1
295806/295806 [==============================] - 14s - loss: 15.7517
Epoch 1/1
407890/407890 [==============================] ...
Epoch 1/1
383188/383188 [==============================] ...
```
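If it helps, one check I could run is to snapshot the weights between fit calls. This is a hypothetical sketch (not something I have in my actual script), reusing `model`, `X_train`, and `y_train` from above:

```python
import numpy as np

# Snapshot the first weight tensor, fit on the next file, and see
# whether training resumed from the previous state or restarted
w_before = model.get_weights()[0].copy()
model.fit(X_train, y_train, nb_epoch=1, batch_size=128, verbose=1)
w_after = model.get_weights()[0]
print("weights changed:", not np.allclose(w_before, w_after))
```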
I know that model.fit_generator exists, but since the method above has been repeatedly proposed as a way to train in batches over multiple files, I would like to know what I am doing wrong.
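For comparison, this is a sketch of what I understand the fit_generator route to look like in Keras 1.0.x, which expects a generator that loops forever plus a samples_per_epoch count (`total_samples`, the summed length of all files, is my placeholder):

```python
# Wrap BatchGenerator so it never terminates, as fit_generator requires
def InfiniteBatchGenerator(files):
    while True:
        for batch in BatchGenerator(files):
            yield batch

model.fit_generator(InfiniteBatchGenerator(files),
                    samples_per_epoch=total_samples,
                    nb_epoch=2)
```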
Thanks for your help,
Max