Stirling's Approximation
Introduction
Stirling's approximation for factorials is given by the following equation. \[ \ln(N!)\sim N\ln N - N + \frac{1}{2}\ln(2\pi N) \] I've seen lots of "derivations" of this, but most make a hand-wavy argument that gets you only the first two terms. The full derivation I'm going to work through yields that third term as well, and also provides a means of getting additional terms. This derivation is also worked through on Wikipedia, but what I hope to offer goes slower, skips fewer steps, and is (hopefully) mathematically rigorous.
Gamma Function
The first step is to turn $N!$ into a function of a continuous variable, rather than a discrete variable. The trick is a function called the gamma function, which I will state without "proof"; instead I'll show that it behaves like we want. (The usual convention is $\Gamma(x)=\int_0^{\infty}t^{x-1}e^{-t}dt$, so the integral below is $\Gamma(x+1)$.) Let's consider the expression: \[ \int_0^{\infty} t^x e^{-t} dt\] To demonstrate that it does what we want, suppose $x=n$ for a positive integer $n$, and integrate by parts. \[ u=t^{n}~~~~~~du=n t^{n-1}dt~~~~~~v=-e^{-t}~~~~~~dv=e^{-t}dt\] \[ \int_0^{\infty} u\, dv = \left. uv\right|_{0}^{\infty} - \int_0^{\infty} v\, du\] \[ \int_0^{\infty} t^n e^{-t} dt = \left. -t^n e^{-t}\right|_0^{\infty} + \int_0^{\infty} n t^{n-1} e^{-t} dt\] For the boundary term, we note that at $t=0$, $t^n=0$ (since $n\geq 1$), so that endpoint contributes nothing, and as $t\rightarrow\infty$, a quick application of L'Hopital's rule shows that $\lim_{t\rightarrow\infty} t^{n}e^{-t} = 0$ for any finite $n$. Therefore we have: \[ \int_0^{\infty} t^n e^{-t} dt = n \left(\int_0^{\infty} t^{n-1} e^{-t} dt\right)\] The piece in the parentheses on the RHS is the same as what we started with, but with the exponent reduced to $(n-1)$. Therefore, we can apply this operation recursively, each time decreasing the exponent by 1. Since $n$ is a positive integer, this recursion terminates. Applying it $n$ times gives: \[ \int_0^{\infty} t^n e^{-t} dt = (n)(n-1)(n-2)...(2)(1)\int_0^{\infty} t^{0} e^{-t} dt\]
Since $\int_0^{\infty} e^{-t} dt = 1$ (just integrate and plug in the bounds), this means, for integer arguments: \[ n! = (n)(n-1)(n-2)...(2)(1) = \int_0^{\infty} t^n e^{-t} dt\] Therefore, we can rephrase our problem of approximating the factorial as a problem of approximating this integral. I should also mention that it's possible (I'm not sure, really) that the convention $0!=1$ comes from this definition of the factorial, since the $n=0$ case of the integral is exactly $1$. In any case, we've managed to "equate" a function of a discrete variable to a function of a continuous variable, so if we can approximate the continuous case, then the discrete case follows as a special case. I also want to remind the reader that the gamma function is not an approximation: at integer arguments it is exactly the factorial, and it is more general besides. That said, let's move on to approximating it.
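If you'd like to see this numerically, here's a quick sanity check in Python (a minimal sketch assuming scipy is installed; the loop range is an arbitrary choice of mine):

```python
import math
from scipy.integrate import quad

# Numerically confirm that the integral of t^n e^(-t) over [0, inf) equals n!
for n in range(6):
    integral, _ = quad(lambda t: t**n * math.exp(-t), 0, math.inf)
    print(n, integral, math.factorial(n))  # the last two columns should agree
```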
Rescaling Variables
Next, I repeat the gamma function here, making note that $x^N=\exp\left({\ln(x^{N})}\right)=e^{N \ln x}$.
\[ N! = \int_0^{\infty} x^N e^{-x} dx = \int_0^{\infty} e^{(N \ln x) - x} dx \] In a class called Asymptotic Analysis, I learned a technique called "Laplace's Method" to handle integrals of exponentials with asymptotically large arguments. The textbook I used was "Applied Asymptotic Analysis" by Peter Miller, which does a great job with proving and giving examples of Laplace's Method (and other asymptotic techniques).
The trick with these integrals is to first arrange for the quantity inside the exponential to be a large parameter times a fixed function with a fixed interior maximum. As it stands, that's not the case: for large positive $N$, $\ln(x)$ is positive and monotonically increasing, and while the $-x$ term is negative and eventually beats $\ln(x)$, it doesn't carry a factor of $N$; the integrand $e^{N\ln x - x}$ peaks at $x=N$, which runs off to infinity as $N$ grows. But we can change variables so that the $x$ and $\ln(x)$ terms scale with $N$ in the same way. Let's try a change of variables $x=N^p z$, where $p$ is a parameter that will help us match the two together. Note this isn't a clever change of variables like a trig substitution or something; we are just (linearly) rescaling the $x$ dimension so that it grows (or shrinks) with $N$. Since $x$ runs from $0$ to $\infty$ anyway, this rescaling leaves the integration range completely unchanged. \[\int_0^{\infty} \exp\left(N\ln(N^p z)-N^p z\right) N^p dz\] \[= N^p \exp(pN\ln N) \int_{z=0}^{z=\infty} \exp\left( N\ln z - N^p z\right) dz\] Therefore, we can see that if we choose $p=1$, then we can "match" $\ln(z)$ to $z$. The result is:
\[ N! = N \exp(N\ln N) \int_0^{\infty}\exp\left(N(\ln z -z)\right) dz\]
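As a quick numerical sanity check of this rescaled identity (again a sketch assuming scipy; $N=10$ is an arbitrary choice):

```python
import math
from scipy.integrate import quad

N = 10
# Integrate exp(N(ln z - z)) over [0, inf) and restore the prefactor N exp(N ln N)
integral, _ = quad(lambda z: math.exp(N * (math.log(z) - z)), 0, math.inf)
print(N * math.exp(N * math.log(N)) * integral)  # ~3628800.0
print(math.factorial(N))                         # 3628800
```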
Now, let's look at that quantity inside the exponential. What does $\ln z - z$ look like?

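Here's a short matplotlib sketch to see for yourself (the plotting range is my own choice):

```python
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(0.01, 6, 500)
plt.plot(z, np.log(z) - z)
plt.axvline(1, linestyle='--')  # the maximum, found below, sits at z = 1
plt.xlabel('z')
plt.ylabel('ln z - z')
plt.show()
```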
As the plot shows, $\ln z - z$ takes on a single local maximum and goes to $-\infty$ at both ends of the integration range. Therefore, since we are integrating $\exp(N(\ln z-z))$, for large $N$ the points far from the maximum take on exponentially small values, meaning the bulk of the contribution to this integral occurs near the maximum, and points far away can be neglected. Of course, this is somewhat hand-wavy, but the textbook mentioned earlier gives this step mathematical rigor. So first, let's find where this maximum occurs.
\[ \frac{d}{dz}\left(\ln z-z\right)=0 ~~~~\rightarrow~~~~ \frac{1}{z_{max}}-1=0 ~~~~~\rightarrow ~~~~ z_{max}=1\]
To confirm this is a maximum, let's take a second derivative and make sure it's negative.
\[ \frac{d^2}{dz^2}\left(\ln z-z\right) = -\frac{1}{z^2_{max}} = -1 <0\]
This confirms we have a local maximum. Therefore, let's change variables again, to $t=z-1$, and since the maximum value of the function is $\ln(1)-1=-1$, let's also pull a factor of $\exp(-N)$ out of the integral. The objective is for the maximum to occur at $t=0$, where the shifted exponent takes on a value of $0$.
\[ N! = N \exp\left(N \ln N\right) \exp(-N) \int_{-1}^{\infty} \exp\left(N(\ln(t+1)-t)\right) dt\]
Now, let's Taylor expand at this maximum to see how strongly it drops off in either direction.
\[ N! = N \exp\left(N\ln N -N\right) \int_{-1}^{\infty} \exp\left(-N\left(\frac{t^2}{2}-\frac{t^3}{3}+\frac{t^4}{4}-...\right)\right) dt\]
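A one-line sympy check of this expansion, for the skeptical (my own convenience, not part of the derivation):

```python
import sympy as sp

t = sp.symbols('t')
print(sp.series(sp.log(1 + t) - t, t, 0, 5))
# -t**2/2 + t**3/3 - t**4/4 + O(t**5)
```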
So at this point, I could try to make the argument that we can neglect the higher-order terms of the Taylor series (since the farther $t$ is from $0$, the less it contributes to the integral). But this isn't very rigorous, and it also won't help us get higher-order terms in the Stirling approximation. In order to get those, we must introduce the Implicit Function Theorem.
Implicit Function Theorem
Suppose you have an implicit relation between two variables: \[ (1+x) (1+\tan x) = y+\ln(y)\] and suppose you wanted to extract $y(x)$ from it. Try as you might, you won't be able to isolate an explicit $y(x)$ with algebraic tricks. But that's not the end of the line. The Implicit Function Theorem says that if you know a root (and for our case, we can quickly verify that $x=0,y=1$ is a solution), then you can find $y(x)$ in a neighborhood of $(0,1)$. Specifically, defining $f(x,y)=y+\ln(y)-(1+x)(1+\tan x)$, the theorem guarantees a unique solution $y(x)$ to $f(x,y(x))=0$ near $(0,1)$, as long as $\frac{\partial f}{\partial y}$ is non-zero when evaluated at $(0,1)$. Likewise, if $\frac{\partial f}{\partial x}$ is non-zero at $(0,1)$, then $x(y)$ exists as well. From there, it's simply a matter of Taylor expanding $y(x)=y(0)+x y'(0)+\frac{1}{2}x^2 y''(0)+...$, where $y^{(n)}(x=0)$ is found through repeated (total) derivatives of $f(x,y(x))=0$. The snippet below checks the condition and the first derivative for this toy example; after that, let's work on the problem at hand.
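This is a sketch using sympy (its `idiff` helper does the implicit differentiation); the specific checks are my own:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = y + sp.log(y) - (1 + x) * (1 + sp.tan(x))

# The known root: f(0, 1) = 0
assert f.subs({x: 0, y: 1}) == 0

# Non-degeneracy: df/dy at (0, 1) must be non-zero for y(x) to exist locally
print(sp.diff(f, y).subs({x: 0, y: 1}))  # 2, so y(x) exists near (0, 1)

# First Taylor coefficient y'(0), via implicit differentiation
dydx = sp.idiff(f, y, x)
print(sp.simplify(dydx.subs({x: 0, y: 1})))  # 1
```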
So, what I would like to do is take $\ln(t+1)-t$, which looks parabolic near $t=0$, and replace it with some other variable, say $-s^2$. With this change, the quadratic nature near the maximum is preserved (no change of coordinates would remove it anyway), but the advantage is that we know (or will know later) how to handle the integral of $\exp(-s^2)\,ds$ (a Gaussian). But in order to get $ds$, we will need to chain-rule our existing $dt$, meaning we'll ultimately need $\frac{dt(s)}{ds}$, as a function of $s$, in order to proceed with the integral. Spelling this out more clearly, what we're looking for is a way to turn this LHS into this RHS:
\[ \int_{t=-1}^{\infty} \exp\left(N(\ln(t+1)-t)\right)dt = \int_{s=?}^{s=\infty} \frac{dt(s)}{ds} e^{-Ns^2} ds\]
So, we can pull this off as long as we know $t(s)$. The problem is, we don't even know whether a unique solution exists. Here is where the Implicit Function Theorem takes over. Let's define a function:
\[ f_0(s,t)\equiv \ln(1+t)-t+s^2 \equiv 0\]
The roots $t(s)$ of the above equation are what we're looking for. Notably, since we're interested only in $t$ near $0$, the corresponding solution must have $s=0$ as well. Therefore, we know $t(s=0)=0$ is a solution, and we will look for what $t(s)$ looks like near this $(0,0)$ solution. If we take the partial derivative $\frac{\partial f_0(s,t)}{\partial t}$ and plug in at $(0,0)$, and what we get is non-zero, then we're good to go! Checking this:
\[ \left(\frac{\partial f_0(s,t)}{\partial t} \right)_{(0,0)} = \left(\frac{1}{1+t} - 1\right)_{(0,0)} = 0\]
And we needed this quantity to be anything but zero, so we have to go back to the drawing board. The technique for handling this situation is called "blowing up the singularity" (for reasons that I do not understand), and it consists of basically two tricks. First, let's make the substitution $t(s)=s\, v(s)$. This means our new function of two variables is:
\[ f_1(s,v)\equiv \ln(1+sv)-sv+s^2\equiv 0\]
So, in order to get $v(s)$ defined uniquely, we need $\frac{\partial f_1(s,v)}{\partial v}\neq0$. But since we already know we want the region near $s=0$, taking this partial derivative again results in $0$ there. So now what? Well, before answering that, let's go back and look at what solution we're expanding around - we know $s=0$, but what about $v$? You can quickly see that any value of $v$ works, because $s$ and $v$ only ever appear as the product $sv$. So what if we instead consider the equation:
\[ f_2(s,v)\equiv \frac{\ln(1+sv)-sv+s^2}{s}\equiv0\]
In this equation, the root is still preserved, even though we divided by $s$ (think of it like this: what used to be parabolic in $s$ just became linear in $s$). So let's try taking our partial derivative here: $\frac{\partial f_2(s,v)}{\partial v}$ evaluated at $s=0$ gives zero again. And just like last time, if you try to determine $v(s=0)$ by plugging $s=0$ into $f_2(s,v)=0$, you will come up with infinitely many solutions for $v$ - definitely no good.
So the second trick I mentioned as part of "blowing up the singularity" is to divide the initial $f_0(s,t=sv)$ by the right power of $s$ - the power that makes the leading behavior "constant" in $s$ (as opposed to linear or quadratic), which forces the first derivative to become non-zero. This means we will define our implicit function as the root of the equation:
\[ f_3(s,v)\equiv\frac{\ln(1+sv)-sv+s^2}{s^2} = \frac{\ln(1+sv)-sv}{s^2}+1\equiv 0\]
Checking that this is a legal maneuver, let's calculate $\frac{\partial f_3(s,v)}{\partial v}$ and plug in at $s=0$; doing so gives $-v(0)$. We do not have free choice of $v(0)$, however - it is determined by Taylor expanding $f_3(s,v)$. Since we saw before that $\ln(1+t)-t = -\frac{t^2}{2} + \frac{t^3}{3} - \frac{t^4}{4} +...$, we can plug in $t=sv(s)$ and solve. The algebra is quite messy, so I'll write $v(s)$ as a power series, $v(s)=v_0 + s v_1 + s^2 v_2 +...$, which lets me ask WolframAlpha to handle the algebra and give me the series expansion. The result:
\[ f_3(s,v(s)) = \left(1 - \frac{v_0^2}{2}\right) + s\left( \frac{v_0}{3}(v_0^2 - 3v_1)\right) + s^2\left(-\frac{v_0^4}{4} + v_0^2 v_1 - v_0 v_2 - \frac{v_1^2}{2}\right) + O(s^3)\]
Therefore, at $s=0$, we satisfy $f_3(0,v(0))=0$ only if $v_0=\pm\sqrt{2}$; we take the positive root $v_0=\sqrt{2}$ so that $t$ increases with $s$. This also means $\frac{\partial f_3}{\partial v}=-\sqrt{2}\neq0$ holds at the root, and thus we have correctly chosen our implicit function.
Now, the main reason to go through this lengthy side discussion is that it lets you find more terms in the series (and also guarantees mathematical rigor). Let's therefore find $v'(0)$ and $\frac{1}{2}v''(0)$ - equivalently, $v_1$ and $v_2$ as defined in the power series above. Instead of forcing $f_3(s,v(s))$ to zero at $s=0$ and solving for $v_0$, let's now force $\frac{d f_3(s,v(s))}{ds}$ to zero and solve for $v_1$. Note that I wrote the total derivative $\frac{d}{ds}$ and not the partial $\frac{\partial}{\partial s}$: the total derivative accounts for the $s$-dependence of $v(s)$, and forcing it (and the higher total derivatives) to vanish is what makes $f_3(s,v(s))=0$ hold identically in $s$, not just at a single point. Since I already have the Taylor series of $f_3(s,v(s))$ spelled out above, this amounts to sequentially setting each coefficient to $0$ and solving for $v_1, v_2,\ldots$. Doing so gives:
\[ v_1 = \frac{v_0^2}{3} = \frac{2}{3}~~~~ v_2 =\frac{1}{v_0}\left( v_0^2 v_1 - \frac{v_0^4}{4} - \frac{v_1^2}{2}\right) = \frac{1}{9\sqrt{2}}\]
We could continue to get more terms by carrying the power series for $v(s)$ and the expansion of $f_3(s,v(s))$ to higher order, but I think the number of terms we have now is sufficient. To summarize:
\[ v(s) = \sqrt{2}+ \frac{2}{3} s +\frac{1}{9\sqrt{2}} s^2 + O(s^3) \]
\[ t(s)=sv(s) =\sqrt{2} s + \frac{2}{3} s^2 + \frac{1}{9\sqrt{2}}s^3 + O(s^4) \]
\[ \frac{dt(s)}{ds} = \sqrt{2} + \frac{4}{3} s + \frac{1}{3\sqrt{2}} s^2 + O(s^3) \]
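To double-check these coefficients without WolframAlpha, here's a small sympy verification (my own sketch): plug the claimed $t(s)$ into the defining relation $\ln(1+t)-t+s^2=0$ and confirm that everything cancels through the orders our three coefficients control.

```python
import sympy as sp

s = sp.symbols('s')
t = sp.sqrt(2)*s + sp.Rational(2, 3)*s**2 + s**3 / (9*sp.sqrt(2))

# ln(1+t) - t + s^2 should vanish through order s^4
print(sp.series(sp.log(1 + t) - t + s**2, s, 0, 5))  # O(s**5)
```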
One Last Change of Variables
So, the results from the last section give us:
\[ N! = N \exp\left(N \ln N-N\right) \int_{s=t^{-1}(-1)}^{\infty}\left(\sqrt{2} + \frac{4}{3} s + \frac{1}{3\sqrt{2}} s^2 + O(s^3)\right) \exp\left(-Ns^2\right) ds\]
Notice that the lower bound of the integral must be whatever $s$ value corresponds to $t=-1$, which I have written in terms of a sort of "implicit inverse function". This is not ideal. Let's return to our defining function $f_3(s,v(s))$ to see what this means.
\[ f_3(s,v(s)) = \frac{\ln(1+s v(s)) - sv(s)}{s^2} + 1 = \frac{\ln(1+t(s)) - t(s)}{s^2} + 1 = 0\]
Solving for $s$ gives:
\[ s^2 = -\lim_{t\rightarrow -1^+}\left[ \ln(1+t)-t \right] = +\infty\]
Therefore $s\rightarrow \pm \infty$ as $t$ approaches the $t=-1$ bound, and since we've already determined that $s\rightarrow+\infty$ corresponds to $t\rightarrow+\infty$, the lower bound of the integral must be $s=-\infty$. This exactness just happens to hold for this problem; most problems like this would require you to "commit exponentially small error" by extending the lower bound to $-\infty$ by hand. To stress that the lower bound doesn't mean much: as explained before, $\exp(-Ns^2)$ dies very quickly for $|s|>0$ when $N$ is large, so extending the lower bound out to $-\infty$ is a legal maneuver regardless, and the error committed in this step would be smaller than the error I'll commit later.
Anyway, what we essentially have is:
\[ N! = N \exp\left(N\ln N - N\right) \sum_{k=0}^{\infty} a_k \int_{-\infty}^{\infty} s^k e^{-Ns^2} ds \]
Where $a_0=\sqrt{2}$, $a_1=\frac{4}{3}$, $a_2=\frac{1}{3\sqrt{2}}$, and $a_{k\geq 3}$ has not been found yet, but would be easy to find in principle.
So, to get rid of the $N$ dependence in the exponential, let's change coordinates to $r=s\sqrt{N}$. This gives:
\[ N! = \sqrt{N} \exp\left(N\ln N - N\right) \sum_{k=0}^{\infty} \frac{a_k}{(\sqrt{N})^k} \int_{-\infty}^{\infty}r^k \exp(-r^2) dr\]
Now, if we could just evaluate that integral on the right, we'd be good to go!
Integrating the Gaussian
Let's look at the integral:\[ \int_{-\infty}^{\infty} x^n e^{-x^2} dx \]
How would we solve it? Well let's first break it up into two integrals:
\[ \int_{-\infty}^0 x^n e^{-x^2} dx + \int_0^{\infty} x^n e^{-x^2} dx \]
With a quick change of variables on the first integral $x\rightarrow -x$, we find:
\[ \int_0^{\infty} (x^n + (-x)^n) e^{-x^2} dx\]
And since $(-x)^n = (-1)^n x^n$, this is:
\[ (1+(-1)^n) \int_0^{\infty}x^n e^{-x^2} dx\]
In other words, if $n$ is odd, then the integral vanishes - which makes sense: odd powers of $x$ are odd functions, and the Gaussian factor is even, so the integral over the whole domain is zero (every contribution on the right cancels one on the left). For even powers, the factor $1+(-1)^n$ gives $2$. Therefore, dropping the odd powers and writing $n=2k$, we get:
\[ 2 \int_0^{\infty} x^{2k} e^{-x^2} dx \]
Let's change variables (yes, I know, again!) to $u=x^2$, $du=2xdx$, and $dx=\frac{du}{2\sqrt{u}}$, which makes the integral:
\[ \int_0^{\infty} u^{k-1/2} e^{-u} du\]
Notice any similarities to the top of this page? The Gamma function!
\[ \Gamma\left(k+\frac{1}{2}\right)\]
But now we have fractional powers. Integration by parts would confirm that we're allowed to treat this just like we would any other factorial, and we can expand it recursively $k$ times to get:
\[ \left(k-\frac{1}{2}\right)\left(k-1-\frac{1}{2}\right)\left(k-2-\frac{1}{2}\right)...\left(k-(k-1)-\frac{1}{2}\right) \int_0^{\infty} u^{-1/2} e^{-u} du\]
Therefore, if we can figure out that integral on the far right, we're done. Note that this integral is merely the $k=n=0$ special case from before (in other words, it is $\Gamma\left(\frac{1}{2}\right)$). Let's grab the result from before the $u=x^2$ change of variables:
\[ \Gamma\left(\frac{1}{2}\right) = 2\int_0^{\infty} e^{-x^2}dx\]
Let's do the standard trick and call this integral $I$. Now consider the square of $I$:
\[ I^2 = \left[ 2 \int_0^{\infty} e^{-x^2} dx\right]^2 = 4\int_0^{\infty} \int_0^{\infty} \exp(-(x^2+y^2)) dxdy \]
This integral can be solved in polar coordinates. Noting $x=r\cos\theta$, $y=r\sin\theta$, and $dx\,dy=r\,dr\,d\theta$, and noting that the first quadrant corresponds to $0\leq\theta\leq\pi/2$, we get:
\[ I^2 = 4\int_0^{\pi/2} d\theta\int_0^{\infty}r e^{-r^2} dr\]
The first integral gives $\frac{\pi}{2}$, and the second can be done via the substitution $u=r^2$, giving $\frac{1}{2}$. Collecting all the factors gives us:
\[ I^2 = \pi ~~~~\rightarrow~~~~I = \sqrt{\pi}\]
Therefore, we've found that $\Gamma(\frac{1}{2})=\sqrt{\pi}$. Additionally, we can return to our earlier integral and combine all of our information to get:
\[ \int_{-\infty}^{\infty} x^n e^{-x^2} dx = \frac{1+(-1)^n}{2} \Gamma\left(\frac{n+1}{2}\right) = \frac{1+(-1)^n}{2}\left(\frac{n-1}{2}\right)\left(\frac{n-3}{2}\right)...\left(\frac{3}{2}\right)\left(\frac{1}{2}\right) \sqrt{\pi} \]
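Here's a numerical cross-check of this moment formula (an illustrative Python sketch using scipy; the range of $n$ is arbitrary):

```python
import math
from scipy.integrate import quad

def gaussian_moment(n):
    """Closed form (1 + (-1)^n)/2 * Gamma((n+1)/2) for the n-th moment."""
    return (1 + (-1)**n) / 2 * math.gamma((n + 1) / 2)

for n in range(6):
    numeric, _ = quad(lambda x: x**n * math.exp(-x**2), -math.inf, math.inf)
    print(n, round(numeric, 10), round(gaussian_moment(n), 10))
```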
For the integral from before, though, I only need the first three terms:
\[ \int_{-\infty}^{\infty} \left(\sqrt{2} + \frac{4}{3\sqrt{N}} r + \frac{1}{3N\sqrt{2}}r^2 + O\left(\frac{r^3}{N^{3/2}}\right)\right) e^{-r^2} dr \]
The first gets me a $\sqrt{\pi}$, the next is $0$, and the third is $\sqrt{\pi}/2$. Therefore,
\[ = \sqrt{2\pi}\left(1 + \frac{1}{12 N}+O(N^{-2})\right)\]
Which is all we need!
The Final Answer
Combining all of the previous pieces, we arrive at the final answer.
\[ N! = \sqrt{2\pi N} \exp\left(N\ln N - N\right)\left(1+\frac{1}{12N}+O\left(\frac{1}{N^2}\right)\right)\]
Or, as is more common to provide,
\[ \ln(N!) = N \ln N - N + \frac{1}{2}\ln(2\pi N)+\ln\left(1+\frac{1}{12N}+O(N^{-2})\right)\]
To compare, we can look at how many digits of precision you get (in other words, the $-\log_{10}$ of the relative error) when using each of the following approximations:
\[ \ln N! \approx N\ln N \]
\[ \ln N! \approx N\ln N - N\]
\[ \ln N! \approx N\ln N - N+\frac{1}{2}\ln(2\pi N)\]
\[ \ln N! \approx N\ln N - N+ \frac{1}{2}\ln(2\pi N) + \ln\left(1+\frac{1}{12 N}\right)\]
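A Python sketch like the following tabulates those digits of precision (I use `math.lgamma` for the reference value of $\ln N!$ and take the relative error in $\ln N!$; the sample values of $N$ are my own choice):

```python
import math

approximations = [
    ("N ln N",             lambda n: n*math.log(n)),
    ("N ln N - N",         lambda n: n*math.log(n) - n),
    ("+ ln(2 pi N)/2",     lambda n: n*math.log(n) - n + 0.5*math.log(2*math.pi*n)),
    ("+ ln(1 + 1/(12N))",  lambda n: n*math.log(n) - n + 0.5*math.log(2*math.pi*n)
                                     + math.log(1 + 1/(12*n))),
]

for n in (5, 50, 500):
    exact = math.lgamma(n + 1)  # reference value of ln(N!)
    digits = [-math.log10(abs(f(n) - exact) / exact) for _, f in approximations]
    print(n, ["%5.1f" % d for d in digits])
```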
07/03/2014