According to this, there are several ways to read data in TensorFlow.
The easiest is to feed your data through placeholders. When you use placeholders, shuffling and batching are your responsibility (a minimal sketch of that approach is shown right below). If you want to delegate shuffling and batching to the framework instead, you need to create an input pipeline. The problem is then how to get LMDB data into the symbolic input pipeline. A possible solution is the tf.py_func op.
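As a baseline, here is a minimal sketch of the placeholder approach, assuming hypothetical 784-dimensional float examples and scalar labels (the names examples_ph, labels_ph, and next_batch are illustrative, not part of the original answer):

import numpy as np
import tensorflow as tf

examples_ph = tf.placeholder(tf.float32, shape=[None, 784])
labels_ph = tf.placeholder(tf.float32, shape=[None])
# ... build the model and train_op on examples_ph / labels_ph ...

def next_batch(data, labels, batch_size=64):
    # With placeholders, shuffling and batching happen in plain Python.
    idx = np.random.choice(len(data), batch_size, replace=False)
    return data[idx], labels[idx]

# Inside the training loop:
#   batch_x, batch_y = next_batch(train_data, train_labels)
#   sess.run(train_op, feed_dict={examples_ph: batch_x, labels_ph: batch_y})

Here is an example of an input pipeline that pulls data from LMDB via tf.py_func: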
import tensorflow as tf

def create_input_pipeline(lmdb_env, keys, num_epochs=10, batch_size=64):
    # Queue that yields the LMDB keys, reshuffled each epoch.
    key_producer = tf.train.string_input_producer(keys,
                                                  num_epochs=num_epochs,
                                                  shuffle=True)
    single_key = key_producer.dequeue()

    def get_bytes_from_lmdb(key):
        # Ordinary Python: fetch and parse one record from LMDB.
        with lmdb_env.begin() as txn:
            lmdb_val = txn.get(key)
            example = get_example_from_val(lmdb_val)
            label = get_label_from_val(lmdb_val)
        return example, label

    # Wrap the Python lookup as an op inside the graph.
    single_example, single_label = tf.py_func(get_bytes_from_lmdb,
                                              [single_key],
                                              [tf.float32, tf.float32])
    # NOTE: tf.train.batch needs fully defined shapes, so call
    # single_example.set_shape(...) / single_label.set_shape(...) here
    # with the shapes of your examples and labels.
    batch_examples, batch_labels = tf.train.batch([single_example, single_label],
                                                  batch_size)
    return batch_examples, batch_labels
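For context, a hypothetical way to wire this up (the database path is illustrative, and the parsing helpers get_example_from_val / get_label_from_val are assumed to be defined by you elsewhere):

import lmdb

env = lmdb.open('/path/to/db', readonly=True)  # hypothetical path
with env.begin() as txn:
    keys = [key for key, _ in txn.cursor()]  # collect all keys up front

batch_examples, batch_labels = create_input_pipeline(env, keys,
                                                     num_epochs=10,
                                                     batch_size=64)
# batch_examples and batch_labels are tensors a model can consume directly.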
The tf.py_func op inserts a call to regular Python code inside the TensorFlow graph; we have to specify its inputs and the number and types of its outputs. tf.train.string_input_producer creates a shuffled queue holding the specified keys. The tf.train.batch op creates another queue that holds whole batches of data. During training, every evaluation of batch_examples or batch_labels dequeues the next batch from that queue.
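As a minimal, self-contained illustration of the tf.py_func contract (the square function is just an example of arbitrary Python code):

import numpy as np
import tensorflow as tf

def square(x):
    # Ordinary NumPy code, executed outside the graph.
    return np.square(x)

inp = tf.constant([1.0, 2.0, 3.0])
out = tf.py_func(square, [inp], tf.float32)  # inputs as a list, output types declared

with tf.Session() as sess:
    print(sess.run(out))  # [1. 4. 9.]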
Finally, we need to create and start the threads that actually fill these queues; QueueRunner objects and a Coordinator take care of that. (This is standard boilerplate from the TensorFlow documentation):
init_op = tf.group(tf.initialize_all_variables(),
                   tf.initialize_local_variables())  # num_epochs is tracked in a local variable
sess = tf.Session()
sess.run(init_op)

# Launch the queue runner threads that feed the input pipeline.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

try:
    while not coord.should_stop():
        sess.run(train_op)
except tf.errors.OutOfRangeError:
    print('Done training -- epoch limit reached')
finally:
    coord.request_stop()

coord.join(threads)
sess.close()