Well, you seem pretty confused about a few things. Let's start at the beginning: you mention a "multidimensional function", but then go on to discuss the usual one-variable Gaussian curve. That is not a multidimensional function: when you integrate it, you integrate over only one variable (x). The distinction is important, because there is a monster called the "multidimensional Gaussian distribution" that is a true multidimensional function and, if integrated, requires integrating over two or more variables (which uses the expensive Monte Carlo technique I mentioned earlier). But you seem to be talking about a regular one-variable Gaussian, which is much easier to work with, integrate, and all that.
The one-dimensional Gaussian distribution has two parameters, sigma and mu, and is a function of a single variable, which we'll denote x. You're also carrying around a normalization parameter n (which is useful in several applications). Normalization parameters are usually left out of the calculations, since you can just factor them back in at the end (remember that integration is a linear operator: int(n*f(x), x) = n*int(f(x), x)). But we can carry it along if you like; the notation I like for the normal distribution is then
N(x | mu, sigma, n) := (n/(sigma*sqrt(2*pi))) * exp((-(x-mu)^2)/(2*sigma^2))
(read that as "the normal distribution of x given sigma, mu, and n"). So far, so good; this matches the function you gave. Note that the only true variable here is x: the other three parameters are fixed for any particular Gaussian.
Now for the mathematical fact: it can be shown that all Gaussian curves have the same shape; they're just shifted and scaled around. So we can work with N(x|0,1,1), called the "standard normal distribution", and simply translate our results back to the general Gaussian curve. So if you have the integral of N(x|0,1,1), you can trivially compute the integral of any Gaussian. This integral comes up so frequently that it has a special name: the error function, erf. Due to some old conventions, it's not exactly erf; there are a couple of additive and multiplicative constant factors floating around as well.
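If you want to see that shift-and-scale fact concretely, here's a minimal numerical sketch (the function name N and the sample values are just mine for illustration, implementing the definition above):

from math import exp, sqrt, pi

def N(x, mu, sigma, n=1.0):
    # N(x | mu, sigma, n) exactly as defined above.
    return (n / (sigma * sqrt(2 * pi))) * exp(-(x - mu)**2 / (2 * sigma**2))

# Any Gaussian is just a shifted, scaled standard normal:
# N(x | mu, sigma, 1) == N((x - mu)/sigma | 0, 1, 1) / sigma
x, mu, sigma = 3.7, 2.0, 1.5
print(N(x, mu, sigma))                    # these two lines print
print(N((x - mu) / sigma, 0, 1) / sigma)  # the same value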
If Phi(z) = integral(N(x|0,1,1), -inf, z), i.e. Phi(z) is the integral of the standard normal distribution from minus infinity up to z, then it's true by the definition of the error function that
Phi(z) = 0.5 + 0.5 * erf(z / sqrt(2)) .
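You can verify that identity numerically; a quick sketch using math.erf and scipy's quad() (both of which you already have available):

from math import erf, exp, sqrt, pi
from scipy.integrate import quad

def std_normal(x):
    # N(x | 0, 1, 1), the standard normal density.
    return exp(-x * x / 2) / sqrt(2 * pi)

z = 1.0
numeric, _ = quad(std_normal, -float('inf'), z)
closed_form = 0.5 + 0.5 * erf(z / sqrt(2))
print(numeric, closed_form)  # both ~0.8413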
Likewise, if Phi(z | mu, sigma, n) = integral(N(x|mu, sigma, n), -inf, z), i.e. Phi(z | mu, sigma, n) is the integral of the normal distribution with parameters mu, sigma, and n from minus infinity up to z, then it's true by the definition of the error function that
Phi(z | mu, sigma, n) = (n/2) * (1 + erf((z - mu) / (sigma * sqrt(2)))).
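In code, that closed form is a one-liner; a sketch (gauss_cdf is just my name for it):

from math import erf, sqrt

def gauss_cdf(z, mu, sigma, n=1.0):
    # Integral of N(x | mu, sigma, n) from -inf up to z, in closed form.
    return (n / 2.0) * (1 + erf((z - mu) / (sigma * sqrt(2))))

print(gauss_cdf(2, mu=2, sigma=3, n=10))  # 5.0: half the mass lies below the mean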
Check out the Wikipedia article on the normal CDF if you want more detail or a proof of this fact.
OK, that should be enough background. Back to your (edited) post. You say: "erf(z) in scipy.special would require me to pinpoint exactly what t is originally." I have no idea what you mean by that; where does t (time?) come into this at all? Hopefully the explanation above has demystified the error function a bit, and it's now clear why the error function is the right function for the job.
Your Python code is fine, but I would have used a closure rather than a lambda:
import math

def make_gauss(N, sigma, mu):
    # Precompute the constants so the returned function does as
    # little work as possible per call.
    k = N / (sigma * math.sqrt(2 * math.pi))
    s = -1.0 / (2 * sigma * sigma)
    def f(x):
        return k * math.exp(s * (x - mu) * (x - mu))
    return f
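For example (a quick sanity check; the expected peak height follows from the formula above):

gauss = make_gauss(10, 2, 0)
print(gauss(0))  # peak height: 10 / (2 * sqrt(2*pi)) ≈ 1.9947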
Using a closure allows the constants k and s to be precomputed, so the returned function has to do less work each time it's called (which can matter a lot if you're integrating it, since that means it will be called many times). Also, I avoided using the exponentiation operator **, which is slower than just writing out the square, and hoisted the division out of the inner loop, replacing it with a multiplication. I haven't looked at how these are implemented in Python, but from my last time tuning an inner loop for pure speed using raw x87 assembly, I seem to remember that adds, subtracts, and multiplies take about 4 clock cycles each, divides about 36, and exponentiation about 200. That was a couple of years ago, so take those numbers with a grain of salt; still, it illustrates their relative complexity. Also, computing exp(x) by the brute-force method is a very bad idea; there are tricks you can use when writing a good implementation of exp(x) that make it significantly faster and more accurate than the generic a**b style of computation.
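If you'd rather measure the relative costs on your own machine than trust my rusty x87 numbers, the standard timeit module makes a rough comparison easy (results vary a lot by interpreter version and hardware, so treat the output as indicative only):

import timeit

setup = 'x = 1.234'
print(timeit.timeit('x * x', setup=setup))   # plain multiplication
print(timeit.timeit('x ** 2', setup=setup))  # exponentiation operator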
I've never used the numpy versions of the constants pi and e; I've always stuck with the plain old math module's versions. I don't know why you might prefer one over the other.
I'm not sure what you're doing with your calls to quad(). quad(gen_gauss, -inf, inf, (10,2,0)) should integrate the renormalized Gaussian from minus infinity to plus infinity, and should always spit out 10 (your normalization factor), since the Gaussian integrates to 1 over the real line. Any answer far from 10 (I wouldn't expect exactly 10, since quad() is only an approximation, after all) means something is screwed up somewhere... hard to say what without knowing the actual return value and possibly the inner workings of quad().
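Concretely, assuming the make_gauss closure from above (note that quad() also returns an estimate of the absolute error, which is worth looking at when something seems off):

from scipy.integrate import quad

result, abserr = quad(make_gauss(10, 2, 0), -float('inf'), float('inf'))
print(result)  # ~10.0, the normalization factor
print(abserr)  # quad's estimate of its own error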
Hopefully that's cleared up some of the confusion and explained why the error function is the right answer to your problem, as well as how to do it all yourself if you're curious. If any of my explanation wasn't clear, I suggest taking a quick look at Wikipedia first; if you still have questions, don't hesitate to ask.