I am trying to speed up pymc3 sampling using parallelization, but I am seeing only modest gains.
I was able to reduce the total runtime from 25 minutes (njobs = 1) to 13 minutes (njobs = 6) on an i7 MacBook Pro. Because it takes about 4 minutes of startup before pymc actually begins sampling, the overall gain is relatively small.
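A back-of-the-envelope check of those numbers, as a minimal sketch. It assumes the roughly 4-minute startup is included in both totals and does not parallelize at all, and that the rest would scale perfectly across jobs; neither is confirmed by measurement. The function name `ideal_runtime` is hypothetical and only for illustration.

```python
def ideal_runtime(serial_total_min, fixed_min, n_jobs):
    """Best-case runtime if only the non-fixed part scales with n_jobs."""
    parallel_part = serial_total_min - fixed_min  # time spent actually sampling
    return fixed_min + parallel_part / n_jobs


serial_total = 25.0  # minutes with njobs = 1 (from the timings above)
startup = 4.0        # minutes before sampling starts (assumed fixed cost)
observed = 13.0      # minutes with njobs = 6 (from the timings above)

ideal = ideal_runtime(serial_total, startup, 6)
observed_speedup = serial_total / observed
ideal_speedup = serial_total / ideal

print(f"ideal runtime with 6 jobs: {ideal:.1f} min")
print(f"observed speedup: {observed_speedup:.2f}x, ideal speedup: {ideal_speedup:.2f}x")
```

Under these assumptions the fixed startup alone caps the achievable speedup, and the observed 13 minutes is still well above the best case, so part of the gap comes from something other than the startup cost.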
Question: has anyone successfully used a GPU with pymc3, and how much of a speedup could I expect for models that take 6-8 minutes? (My MacBook has an NVIDIA GT 750M with 2 GB.)