I used TensorFlow (CPU version) for my deep learning model, specifically the DNNRegressor Estimator, trained with a given set of hyperparameters (network structure, hidden layers, alpha, etc.). Although I was able to reduce the loss, training took a very long time (about 3 days), at roughly 9 seconds per 100 steps.
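For context, here is a minimal sketch of the kind of setup I mean, using the TF 1.x Estimator API. The feature columns, layer sizes, optimizer values, and paths below are illustrative placeholders, not my actual configuration:

```python
import numpy as np
import tensorflow as tf

# Placeholder data and hyperparameters -- not my real values.
x_train = np.random.rand(1000, 10).astype(np.float32)
y_train = np.random.rand(1000).astype(np.float32)

feature_columns = [tf.feature_column.numeric_column("x", shape=[10])]

regressor = tf.estimator.DNNRegressor(
    feature_columns=feature_columns,
    hidden_units=[128, 64, 32],               # example network structure
    optimizer=tf.train.ProximalAdagradOptimizer(
        learning_rate=0.01,
        l1_regularization_strength=0.001      # "alpha"-style regularization
    ),
    model_dir="/tmp/dnn_regressor"            # placeholder path
)

# Feed the data through the standard numpy input_fn helper.
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": x_train}, y=y_train,
    batch_size=128, num_epochs=None, shuffle=True
)

regressor.train(input_fn=train_input_fn, steps=1000)
```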

I came across this article: https://medium.com/towards-data-science/how-to-traine-tensorflow-models-79426dabd304 and learned that training could be faster on a GPU. So I launched a p2.xlarge GPU instance on AWS (a single GPU) with 4 vCPUs, 12 ECUs, and 61 GiB of memory.
But the training speed is still 9 seconds per 100 steps. I use the same code that I used for the Estimator on the CPU, because I read that Estimators use the GPU on their own. Here is my output from the nvidia-smi command.
- It shows that GPU memory is being used, but my Volatile GPU-Util is 1%. I can't understand what I'm missing. Is it intended to work this way, or am I missing something, since the global steps per second are the same for the CPU and GPU runs of the Estimator?
- Should I explicitly change something in the DNNRegressor Estimator code (for example, something like the device-placement check sketched below)?
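For reference, a minimal sketch of how device placement could be verified with the TF 1.x Estimator API; the feature columns, model directory, and settings here are just examples, not my actual code:

```python
import tensorflow as tf

# Log where each op is placed (CPU vs. GPU) so it appears in the console output.
session_config = tf.ConfigProto(log_device_placement=True)
run_config = tf.estimator.RunConfig(session_config=session_config)

feature_columns = [tf.feature_column.numeric_column("x", shape=[10])]

regressor = tf.estimator.DNNRegressor(
    feature_columns=feature_columns,
    hidden_units=[128, 64, 32],        # same illustrative network as above
    config=run_config,
    model_dir="/tmp/dnn_regressor_gpu" # placeholder path
)

# Quick sanity check that TensorFlow can see the GPU at all.
print("GPU available:", tf.test.is_gpu_available())
```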
python neural-network tensorflow tensorflow-gpu