Why Schrödinger's equation?

[2012-11-05 Mon]

"Why this equation?"

I recently overheard someone ask this about Schrödinger's equation. The answer they received was, for me, unsatisfying: "Because it agrees with experiment." Of course, that answers perfectly why the equation was adopted by subsequent generations of physicists, and indeed the calculation of the spectrum of atomic hydrogen from the energy eigenvalues of the Schrödinger operator is one of the most convincing and wholesome computations a young physicist can do. But the question that was left unanswered, the question I believe was being asked, was: "Why did Schrödinger write this equation down? Why not something else?" I don't believe for a second that Schrödinger sat down with an array of different equations and worked out what each of them predicted about hydrogen before he found the one that fit...

I turned to one of Schrödinger's (eminently readable) original papers on the subject:

E. Schrödinger, An Undulatory Theory of the Mechanics of Atoms and Molecules, Physical Review (1926) Vol. 28, No. 6, pp. 1049-1070

and was overjoyed to find that Schrödinger had a very definite picture in mind when he derived his equation. The idea was this: some ninety-nine years previously, William Rowan Hamilton had presented his general theory of geometric optics to the Royal Irish Academy, a mathematical description of light rays. This theory is a good approximation to reality when the light has a very short wavelength (like violet) but doesn't account for various optical phenomena like diffraction, which require the finer description of light as an electromagnetic wave. This description was put on a mathematical footing by Maxwell, who showed that electromagnetic fields propagate in free space according to the wave equation.

Hamilton later applied his formalism to describe classical mechanics. Schrödinger and de Broglie thought that one might reasonably expect there to be a wave theory underlying classical mechanics and reducing to it in the short wavelength limit. The difficulty was how to guess a wave equation that would give the right short wavelength limit.

Schrödinger took as his starting point the Hamilton-Jacobi equation, so let's review this. I'll assume you're happy with the usual Hamiltonian/Lagrangian formulation of classical mechanics.

Hamilton-Jacobi theory

For any pair of points \(A,\ B\in\mathbf{R}^n\) consider the space of paths \(\gamma\colon [0,T]\to\mathbf{R}^n\) between \(A\) and \(B\). Let \(L\) be a Lagrangian and \[I(\gamma)=\int_0^TL(t,\gamma(t),\dot{\gamma}(t))dt\] be the action of that path. A classical path of time \(T\) joining \(A\) and \(B\) is a solution to the corresponding Euler-Lagrange equation. Let's suppose we're in an ideal situation: for any \(T\) and any pair of points \(A\), \(B\) there is a unique classical path \(\gamma_{A,B,T}\) of time \(T\) joining \(A\) and \(B\).

Fix \(A\), but allow \(B\) and \(T\) to vary. Define the function \[W(B,T)=I(\gamma_{A,B,T}).\]
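
As a concrete illustration (a standard textbook example, not one taken from Schrödinger's paper), take the free particle with \(L=\frac{m}{2}|\dot{x}|^2\). The classical path is the straight line \(\gamma_{A,B,T}(t)=A+\frac{t}{T}(B-A)\), traversed at constant velocity \((B-A)/T\), so the uniqueness assumption holds and \[W(B,T)=\int_0^T\frac{m}{2}\frac{|B-A|^2}{T^2}\,dt=\frac{m|B-A|^2}{2T}.\]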

The Hamilton-Jacobi equation is a PDE satisfied by this function. Let's first compute the derivatives of \(W\) with respect to \(B\). Replace \(B\) by \(B+b\) and suppose that \(\gamma_{A,B+b,T}(t)=\gamma_{A,B,T}(t)+\eta(t)\). Then, for small \(b\), writing \(\gamma_{A,B,T}(t)=(x_1(t),\ldots,x_n(t))\), we have \[I(\gamma_{A,B+b,T})=I(\gamma_{A,B,T})+\sum_i\int_0^T\left(\frac{\partial L}{\partial x_i}-\frac{d}{dt}\frac{\partial L}{\partial\dot{x}_i}\right)\eta_i(t)\,dt+\sum_i\left[\frac{\partial L}{\partial \dot{x}_i}\eta_i(t)\right]_0^T+\cdots\] by the usual Euler-Lagrange argument for computing the first variation of \(I\) (integrate by parts). Since \(\gamma_{A,B,T}\) is the classical path, the integral term vanishes. Since \(\eta_i(0)=0\) (the point \(A\) is fixed) the only remaining term is \(\sum_i\frac{\partial L}{\partial \dot{x}_i}(T)\eta_i(T)\). Since \(b_i=\eta_i(T)\), this means that the derivative of \(W\) is \[\frac{\partial W}{\partial B_i}=\frac{\partial L}{\partial\dot{x}_i}(T).\] By the definition of the conjugate momentum, \(\frac{\partial L}{\partial\dot{x}_i}=p_i\), so this says that \(\partial W/\partial B_i\) is the \(i\)th component of momentum at the endpoint of the path.

Now let the endpoint \(B=\gamma(T)\) move along a fixed classical path starting at \(A\), so that \(W(\gamma(T),T)\) is the action accumulated along that path up to time \(T\). By the fundamental theorem of calculus: \[\frac{dW}{dT}=L(T,B_i,\dot{B}_i)\] but by the chain rule \[\frac{dW}{dT}=\frac{\partial W}{\partial T}+\sum_i\frac{\partial W}{\partial B_i}\dot{B}_i.\] This gives \[\frac{\partial W}{\partial T}=L-\sum_ip_i\dot{B}_i\] where \(p_i\) is the momentum at the endpoint. Since the Hamiltonian \(H\) and Lagrangian \(L\) are related by a Legendre transform, we have \[L(T,B_i,\dot{B}_i)-\sum_ip_i\dot{B}_i=-H(T,B_i,p_i)=-H\left(T,B_i,\frac{\partial W}{\partial B_i}\right)\] so we see that \(W\) satisfies the Hamilton-Jacobi equation \[\frac{\partial W}{\partial T}=-H(T,B,\nabla W).\]
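
The Hamilton-Jacobi equation and the momentum formula \(\partial W/\partial B=p\) can be sanity-checked numerically. The sketch below is not from Schrödinger's paper: it assumes the one-dimensional harmonic oscillator \(L=\frac{m}{2}\dot{x}^2-\frac{m\omega^2}{2}x^2\), whose classical action has the standard closed form \(W(B,T)=\frac{m\omega}{2\sin\omega T}\left((A^2+B^2)\cos\omega T-2AB\right)\) for \(0<\omega T<\pi\). The function names and the numerical values are illustrative choices.

```python
import math

M = 1.0      # mass (illustrative value)
OMEGA = 1.0  # angular frequency (illustrative value)
A = 0.3      # fixed starting point of every path

def action(b, t):
    """Closed-form classical action W(B, T) for the oscillator, valid for 0 < OMEGA*t < pi."""
    s, c = math.sin(OMEGA * t), math.cos(OMEGA * t)
    return M * OMEGA * ((A * A + b * b) * c - 2.0 * A * b) / (2.0 * s)

def hamiltonian(x, p):
    """H = p^2/2m + m w^2 x^2/2."""
    return p * p / (2.0 * M) + 0.5 * M * OMEGA ** 2 * x * x

def endpoint_momentum(b, t):
    """m * xdot(T) for the classical path x(t) = A cos(wt) + (B - A cos(wT)) sin(wt)/sin(wT)."""
    s, c = math.sin(OMEGA * t), math.cos(OMEGA * t)
    return M * OMEGA * (b * c - A) / s

def dW_dB(b, t, h=1e-5):
    """Central-difference approximation to dW/dB."""
    return (action(b + h, t) - action(b - h, t)) / (2.0 * h)

def hj_residual(b, t, h=1e-5):
    """dW/dT + H(B, dW/dB), which should vanish up to finite-difference error."""
    dW_dT = (action(b, t + h) - action(b, t - h)) / (2.0 * h)
    return dW_dT + hamiltonian(b, dW_dB(b, t, h))

if __name__ == "__main__":
    for b, t in [(0.5, 0.4), (-1.2, 1.0), (2.0, 2.5)]:
        print(b, t, hj_residual(b, t), dW_dB(b, t) - endpoint_momentum(b, t))
```

Both printed differences should be zero up to finite-difference error; the times are kept below \(\pi/\omega\) so that the classical path between the endpoints is unique, as in the ideal situation assumed above.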

Autonomous case

If we assume that the Hamiltonian is of the form \[H=\frac{1}{2m}(p_1^2+p_2^2+p_3^2)+V(x,y,z)\] (in particular time-independent) then, writing \((x,y,z)\) for the endpoint \(B\) and \(t\) for \(T\), we know by energy conservation that \[\partial^2W/\partial t^2=-\partial H/\partial t=0\] so \[W(t,x,y,z)=-Et+S(x,y,z)\] for some constant \(E\), and the Hamilton-Jacobi equation reduces to \[|\nabla S|=\sqrt{2m(E-V(x,y,z))}.\] We also have the momentum \(p=\nabla S\) from our earlier computation so the speed \(\sqrt{\dot{x}^2+\dot{y}^2+\dot{z}^2}\) is \(|p|/m=|\nabla S|/m=\sqrt{\frac{2(E-V(x,y,z))}{m}}\).
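
To spell out the reduction as a worked step: substituting \(W=-Et+S\) into the Hamilton-Jacobi equation gives \(\partial W/\partial t=-E\) and \(\nabla W=\nabla S\), so \[-E=-\left(\frac{|\nabla S|^2}{2m}+V\right),\qquad\text{that is,}\qquad|\nabla S|^2=2m(E-V).\] For instance, when \(V\) is constant, one solution is the linear function \(S=\sqrt{2m(E-V)}\,x\), which describes particles moving in the \(x\)-direction with momentum \(\sqrt{2m(E-V)}\).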

Schrödinger's idea

In going from geometric optics to wave optics you imagine little sine waves travelling along your rays and you imagine that the phase of the sine wave changes linearly with the optical path length. Via the optical/mechanical analogy (the direct correspondence between the Hamiltonian formalism of optics and of mechanics) one translates optical path length into the action \(W\), so Schrödinger's guess was to replace classical trajectories by sine waves whose phase is proportional to the function \(W\). In other words, he postulates a wavefunction \[\psi(x,y,z,t)=A(x,y,z)\sin(W(x,y,z,t)/K)=A(x,y,z)\sin(-Et/K+S(x,y,z)/K)\] for some constant \(K\). This constant has units of action so that the argument of sine is dimensionless.

The frequency of this sine wave is \(E/2\pi K\) so, comparing with the empirical relationship coming from the Einstein/Planck analyses of the photoelectric effect/black body radiation formula, Schrödinger guessed \[K=h/2\pi=\hbar.\]

I want to quickly recall the notion of phase velocity. This is different to the velocity of our classical particles: rather, it is the speed of a crest of the underlying wave. The crest is a surface of constant phase \(W/\hbar=\pi/2\), that is, at each instant \(t\) the crest is a level surface of the function \(S\), namely \(S=\pi\hbar/2+Et\). Let \(u\) be the vector field \(E\nabla S/|\nabla S|^2\) and let \((x(t),y(t),z(t))\) be an integral curve of \(u\) starting at time 0 on the crest. Then \[\frac{d}{dt}W(x(t),y(t),z(t),t)=-E+\nabla S\cdot u=0\] so the integral curve keeps up with the crest. We think of \(u\) as the phase velocity vector, so the phase speed is \[E/|\nabla S|=\frac{E}{\sqrt{2m(E-V)}}.\]

Crucially, the phase speed depends on \(E\), which depends on the frequency, so the wave equation which underlies Schrödinger's matter waves must be dispersive (which means exactly that the phase speed depends on the frequency; in other words, different frequencies will disperse because they travel with different speeds).

Schrödinger next made the simplest guess as to what the equation should be governing waves of a fixed frequency, namely he guessed the usual wave equation \[\Delta\psi=\frac{1}{u^2}\frac{\partial^2\psi}{\partial t^2}\] for waves whose time dependence is through a factor \(e^{2\pi iEt/h}\), where \(u\) now denotes the phase speed. The usual (light) wave equation replaces \(u\) by the constant \(c\). Here \(u\) is given by the dispersion relation \[u=\frac{E}{\sqrt{2m(E-V)}}.\] Since \(\psi\) has time dependence \(e^{2\pi iEt/h}\) this means that \[\partial^2\psi/\partial t^2=-\frac{E^2}{\hbar^2}\psi.\] Now substituting this and the dispersion relation into the wave equation gives \[\Delta\psi=-\frac{2m(E-V)}{E^2}\frac{E^2}{\hbar^2}\psi=-\frac{2m(E-V)}{\hbar^2}\psi\] or the more familiar \[-\frac{\hbar^2}{2m}\Delta\psi+V\psi=E\psi\] which is Schrödinger's equation.
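
The last step can be checked numerically in the simplest setting. The sketch below assumes one space dimension, a constant potential \(V<E\), and units in which \(\hbar=m=1\); none of these choices come from the original paper, and the function names are illustrative. In that setting \(S=\sqrt{2m(E-V)}\,x\), and the code checks by finite differences that the sine wave \(\sin((S-Et)/\hbar)\) satisfies both the wave equation with the dispersive phase speed \(u\) and the time-independent Schrödinger equation.

```python
import math

HBAR = 1.0  # assumption: natural units
M = 1.0     # assumption: unit mass

def phase_speed(E, V):
    """Dispersion relation u = E / sqrt(2m(E - V)), for constant V < E."""
    return E / math.sqrt(2.0 * M * (E - V))

def psi(x, t, E, V):
    """Sine wave with phase W/hbar, where W = -E t + S and S = sqrt(2m(E - V)) x."""
    S = math.sqrt(2.0 * M * (E - V)) * x
    return math.sin((S - E * t) / HBAR)

def second_derivative(f, s, h=1e-4):
    """Central-difference approximation to f''(s)."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)

def residuals(x, t, E, V):
    """Return (wave-equation residual, Schrödinger-equation residual) at (x, t)."""
    u = phase_speed(E, V)
    psi_xx = second_derivative(lambda s: psi(s, t, E, V), x)
    psi_tt = second_derivative(lambda s: psi(x, s, E, V), t)
    value = psi(x, t, E, V)
    wave = psi_xx - psi_tt / (u * u)
    schrodinger = -HBAR ** 2 / (2.0 * M) * psi_xx + V * value - E * value
    return wave, schrodinger

if __name__ == "__main__":
    print(residuals(0.7, 0.3, E=2.0, V=0.5))
```

Both residuals should vanish up to finite-difference error. A real sine wave is enough here because only second time derivatives appear, so the check does not depend on the sign convention in the complex exponential.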

For me, this route to Schrödinger's equation seems extremely natural when compared to Dirac's magic with Poisson brackets.