There are two different types of "memory leak" in your question.
Valgrind will tell you about the first type of memory leak. However, for python modules this kind of leak is quite common and mostly harmless - basically it is some global state that gets allocated/initialized when the module is loaded. And because the module is loaded only once in Python, this is not a big problem.
A well-known example is numpy's PyArray_API: it is initialized via _import_array, never deleted, and stays in memory until the python interpreter shuts down.
So this is a "memory leak" by design; you can argue whether it is good design or not, but at the end of the day there is nothing you can do about it.
I don't know the tensorflow module well enough to pinpoint where such leaks happen, but I'm pretty sure they are nothing you need to worry about.
The second "memory leak" is more subtle.
You can get a hint by comparing the valgrind output for 10^4 and 10^5 iterations of the loop - there will be almost no difference! There is, however, a difference in the peak memory consumption.
Unlike C++, Python has a garbage collector - so you cannot know exactly when an object is destroyed. CPython uses reference counting, so when a reference count drops to 0 the object is destroyed. However, when there is a reference cycle (for example, object A holds a reference to object B, and object B holds a reference to object A), it is not that simple: the garbage collector needs to iterate over all objects to find such unused cycles.
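To make this concrete, here is a small, self-contained illustration (the class Node exists only for this example) of a cycle that reference counting alone cannot reclaim:

    import gc

    class Node:
        def __init__(self):
            self.other = None

    # build a cycle: a references b and b references a
    a = Node()
    b = Node()
    a.other = b
    b.other = a

    # dropping our names does not destroy the objects right away,
    # because the reference counts inside the cycle never reach 0
    del a, b

    # only the cyclic garbage collector can find and reclaim them
    print(gc.collect())  # prints the number of unreachable objects it found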
You might think that keras.layers.Input has such a cycle somewhere (and it does), but that is not the reason for this "memory leak", which can be observed in pure python as well.
Let's use the objgraph package to inspect the references and run the following python script:
    #pure.py
    from keras.layers import Input
    import gc
    import sys
    import objgraph


    def foo(param):
        a = Input(shape=(1280,))
        return "str"
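The driver part of pure.py is not reproduced above; a minimal sketch that matches the invocation and the counts discussed below (the exact gc and objgraph calls are my assumption) could look like this:

    print("Counts at the beginning:")
    objgraph.show_most_common_types()

    # run foo as many times as given on the command line, e.g. 1000
    for step in range(int(sys.argv[1])):
        foo(" ")

    gc.collect()  # make sure unreachable cycles really get collected

    print("Counts at the end:")
    objgraph.show_most_common_types()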
and run it:
>>> python pure.py 1000
We can see the following: at the end there are exactly 1000 Tensors, which means none of the objects we created was ever deleted!
If we look at the chain that keeps one of these tensor objects alive (created with objgraph.show_chain), we see that there is a tensorflow Graph object in which all the tensors are registered, and they stay there until the session is closed.
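Such a chain can be rendered with objgraph's backreference utilities; a sketch, assuming the tensor objects show up under the type name 'Tensor':

    import random

    # pick one of the live tensors and draw the chain of references
    # that keeps it alive (written to chain.png, requires graphviz)
    objgraph.show_chain(
        objgraph.find_backref_chain(
            random.choice(objgraph.by_type('Tensor')),
            objgraph.is_proper_module),
        filename='chain.png')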
So far the theory. In practice, however, neither closing the session:
    #close session and free resources:
    import keras
    keras.backend.get_session().close()  # free all resources
    print("\n\n\n Counts after session.close():")
    objgraph.show_most_common_types()
nor this proposed solution:
    import tensorflow as tf

    with tf.Graph().as_default(), tf.Session() as sess:
        for step in range(int(sys.argv[1])):
            foo(" ")
worked for the current tensorflow version. This is probably a bug.
In short: you are not doing anything wrong in your C++ code; there is no memory leak you are responsible for. In fact, you would see exactly the same memory consumption if you called the foo function repeatedly from a pure python script.
All created tensors are registered in the Graph object and are not released automatically; you have to free them by closing the backend session - which, however, does not work in the current tensorflow version 1.4.0 because of a bug.