The TensorFlow Dataset map function supports parallel calls, but I see no improvement when passing num_parallel_calls to map. With num_parallel_calls=1 and num_parallel_calls=10, the run time is essentially the same. Here is a simple example:
    import time

    import tensorflow as tf

    # squarer is the custom Python function being mapped; its definition is not shown here.

    def test_two_custom_function_parallelism(num_parallel_calls=1, batch=False,
                                             batch_size=1, repeat=1, num_iterations=10):
        tf.reset_default_graph()
        start = time.time()
        dataset_x = tf.data.Dataset.range(1000).map(
            lambda x: tf.py_func(squarer, [x], [tf.int64]),
            num_parallel_calls=num_parallel_calls).repeat(repeat)
        if batch:
            dataset_x = dataset_x.batch(batch_size)
        dataset_y = tf.data.Dataset.range(1000).map(
            lambda x: tf.py_func(squarer, [x], [tf.int64]),
            num_parallel_calls=num_parallel_calls).repeat(repeat)
        if batch:
            dataset_y = dataset_y.batch(batch_size)
        X = dataset_x.make_one_shot_iterator().get_next()
        Y = dataset_y.make_one_shot_iterator().get_next()
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            i = 0
            while True:
                try:
                    res = sess.run([X, Y])
                    i += 1
                    if i == num_iterations:
                        break
                except tf.errors.OutOfRangeError:
                    break  # stop once the datasets are exhausted
Below are the timings:

    %timeit test_two_custom_function_parallelism(num_iterations=1000, num_parallel_calls=2, batch_size=2, batch=True)
    370 ms

    %timeit test_two_custom_function_parallelism(num_iterations=1000, num_parallel_calls=5, batch_size=2, batch=True)
    372 ms

    %timeit test_two_custom_function_parallelism(num_iterations=1000, num_parallel_calls=10, batch_size=2, batch=True)
    384 ms
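For anyone who wants to repeat this comparison outside a notebook, here is a minimal sketch of a timing helper that sweeps num_parallel_calls. The names time_call, compare_settings and dummy_workload are hypothetical and not part of TensorFlow or of my code above; dummy_workload is only a stand-in so the snippet runs without TensorFlow, and it would be replaced by test_two_custom_function_parallelism.

```python
import time


def time_call(fn, repeats=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` calls of fn(**kwargs)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(**kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def compare_settings(fn, values, repeats=3, **fixed_kwargs):
    """Time fn for each num_parallel_calls value; return {value: seconds}."""
    return {
        v: time_call(fn, repeats=repeats, num_parallel_calls=v, **fixed_kwargs)
        for v in values
    }


# Stand-in workload (an assumption for illustration only): it ignores
# num_parallel_calls, so all settings should take about the same time.
def dummy_workload(num_parallel_calls=1, **_):
    time.sleep(0.001)


for value, seconds in compare_settings(dummy_workload, [2, 5, 10]).items():
    print(f"num_parallel_calls={value}: {seconds * 1000:.1f} ms")
```

Taking the best of several runs is roughly what %timeit reports, which keeps one-off costs such as graph construction from dominating a single measurement.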
I ran %timeit in a Jupyter notebook. What am I doing wrong?
tensorflow tensorflow-datasets
Kracekumar