
Heisenberg uncertainty as Fourier duality


The textbook story for the uncertainty principle is that quantum measurements come with a built-in disturbance: pin down a particle’s position with a photon, and the photon kicks the particle’s momentum. This is Heisenberg’s gamma-ray microscope. The picture isn’t wrong, but it puts the principle on the wrong footing. It makes uncertainty sound like a constraint imposed by clumsy detectors, when in fact it is a property of the wave function itself.

The goal of this post is to derive the uncertainty relation from scratch, assuming only what an attentive student would already know after meeting the photoelectric effect and de Broglie’s matter waves. We’ll build up the wave function, define what its spread is, see why position and momentum give two views of the same wave, and finally prove

$$\sigma_x \, \sigma_p \;\ge\; \tfrac{\hbar}{2}.$$

Along the way the constant $\hbar$ will appear in exactly one place: as a unit conversion. Strip it out and the inequality reduces to a theorem about Fourier transforms that engineers, musicians, and microscopists already work with under different names. The bandwidth-duration product that bounds how fast a radio channel can transmit, and the Abbe diffraction limit that bounds how small a feature an optical microscope can resolve, are both special cases.

Quantum particles are waves

In 1905 Einstein explained the photoelectric effect by saying that light arrives in chunks of energy $E = hf$. A wave (the electromagnetic field) carries particle-like packets (photons). Nearly twenty years later, Louis de Broglie suggested the symmetric statement: matter, conventionally thought of as particles, also has a wave description, with wavelength

$$\lambda \;=\; \frac{h}{p},$$

where $p$ is the particle’s momentum. This has been confirmed in many experiments, from electron diffraction off a nickel crystal (Davisson and Germer, 1927) to interference fringes for buckyballs (Zeilinger et al., 1999) and, more recently, for molecules of more than $25{,}000$ atomic mass units.

We don’t read this as “an electron is sometimes a particle and sometimes a wave.” We read it as: the wave is the fundamental object, and what we call the particle is just the localized point on a detector where a measurement of the wave is recorded.

In one space dimension, the wave function is a complex-valued function $\psi(x)$ defined for every $x \in \mathbb R$. The Born rule assigns it a physical meaning:

$$|\psi(x)|^2 \;=\; \text{probability density of finding the particle at } x.$$

So $\int_a^b |\psi(x)|^2 \, dx$ is the probability that a position measurement returns a value between $a$ and $b$. We always assume $\psi$ is normalized, $\int |\psi(x)|^2 \, dx = 1$, so that the total probability is $1$.

The wave function $\psi$ takes a complex value at every $x$. The probability density $|\psi|^2$ depends only on its modulus, not on its phase, but the phase still matters: it controls how different parts of $\psi$ interfere with each other when we evolve the wave or change basis.

The spread of a distribution

To talk precisely about how localized a particle is, we use the standard deviation of its position distribution:

$$\langle x\rangle \;=\; \int x \, |\psi(x)|^2 \, dx, \qquad \sigma_x^2 \;=\; \int (x - \langle x\rangle)^2 \, |\psi(x)|^2 \, dx.$$

If you’ve seen mean and standard deviation in any statistics course, this is the same definition. $\langle x\rangle$ is the average position you’d see if you prepared this state many times and measured each time; $\sigma_x$ is the typical scatter around that average. A small $\sigma_x$ means a tightly localized particle; a large $\sigma_x$ means a spread-out one.

This $\sigma_x$ is the uncertainty in $x$ that ends up on the left of Heisenberg’s inequality. The uncertainty in $p$ will be defined the same way, once we figure out what the momentum probability density is.
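To make the definition concrete, here is a minimal numerical sketch that evaluates $\langle x\rangle$ and $\sigma_x$ for a wave function sampled on a uniform grid. The function name `position_stats`, the grid, and the test state are illustrative choices, not part of the derivation; the integrals are approximated by plain Riemann sums.

```python
import numpy as np

def position_stats(x, psi):
    """Return (<x>, sigma_x) for a wave function sampled on a uniform grid x."""
    dx = x[1] - x[0]
    density = np.abs(psi) ** 2
    density = density / (density.sum() * dx)       # normalize numerically
    mean = (x * density).sum() * dx                # <x>
    var = ((x - mean) ** 2 * density).sum() * dx   # sigma_x^2
    return mean, np.sqrt(var)

x = np.linspace(-20.0, 20.0, 4001)
# Amplitude chosen so that |psi|^2 is a Gaussian with mean 0.7 and std 1.5.
psi = np.exp(-(x - 0.7) ** 2 / (4 * 1.5 ** 2))
mean, sigma = position_stats(x, psi)
print(mean, sigma)
```

Note the factor of 4 in the exponent: the amplitude is the square root of the density, so a density of width $\sigma_x$ corresponds to an amplitude proportional to $e^{-x^2/(4\sigma_x^2)}$.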

The momentum picture is the same wave, looked at differently

Here is the central observation. Consider a state with one perfectly definite momentum $p$. By de Broglie this corresponds to a single wavelength $\lambda = h/p$, i.e., a single wavenumber $k = 2\pi/\lambda = p/\hbar$, where $\hbar = h/(2\pi)$ is the reduced Planck constant. The wave function is a pure complex exponential:

$$\psi_p(x) \;=\; \frac{1}{\sqrt{2\pi\hbar}} \, e^{ipx/\hbar}.$$

Notice that $|\psi_p(x)|^2 = 1/(2\pi\hbar)$ is constant in $x$. A state with definite momentum has uniform position probability over all of space, i.e., is completely delocalized. (This wave function is not strictly normalizable; it is an idealized limit.)

Most physical states are not pure plane waves. They are superpositions of plane waves with different momenta:

$$\psi(x) \;=\; \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \tilde\psi(p) \, e^{ipx/\hbar} \, dp.$$

Read this as: build $\psi$ by adding up plane waves of every momentum $p$, weighted by complex amplitudes $\tilde\psi(p)$. The function $\tilde\psi(p)$ tells you how much of each momentum is in the mix.

By the Fourier inversion theorem, this relation is invertible:

$$\tilde\psi(p) \;=\; \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \psi(x) \, e^{-ipx/\hbar} \, dx.$$

So $\psi(x)$ and $\tilde\psi(p)$ are determined by each other. They are the same physical state written in two different bases: position and momentum.

The Born rule applies in both bases. $|\tilde\psi(p)|^2$ is the probability density for a momentum measurement returning $p$. Its standard deviation, defined by

$$\langle p\rangle \;=\; \int p\, |\tilde\psi(p)|^2 \, dp, \qquad \sigma_p^2 \;=\; \int (p - \langle p\rangle)^2 \, |\tilde\psi(p)|^2 \, dp,$$

is the uncertainty in $p$.

The relation between $\psi(x)$ and $\tilde\psi(p)$ above is the standard Fourier transform of mathematics, with one convention-dependent rescaling. Setting $k = p/\hbar$ removes $\hbar$ from the integral entirely:

$$\tilde\psi(k) \;=\; \frac{1}{\sqrt{2\pi}} \int \psi(x) \, e^{-ikx} \, dx.$$

Whatever inequality holds between $\sigma_x$ and $\sigma_k$ for arbitrary square-integrable $\psi$ is a fact of harmonic analysis, not a fact of quantum mechanics. We translate to the quantum statement at the end by writing $\sigma_p = \hbar \sigma_k$.
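As a rough check that the two pictures really describe the same state, the sketch below evaluates the $k$-space integral by direct summation and confirms that total probability is preserved. The helper `to_wavenumber`, the grids, and the two-bump test state are assumptions made for illustration; a production code would more likely use an FFT.

```python
import numpy as np

def to_wavenumber(x, psi, k):
    """Riemann-sum approximation of (2 pi)^(-1/2) * integral psi(x) exp(-i k x) dx."""
    dx = x[1] - x[0]
    kernel = np.exp(-1j * np.outer(k, x))   # one row per wavenumber
    return kernel @ psi * dx / np.sqrt(2 * np.pi)

x = np.linspace(-30.0, 30.0, 1201)
k = np.linspace(-8.0, 8.0, 801)
dx, dk = x[1] - x[0], k[1] - k[0]

# A deliberately lopsided, non-Gaussian state: two bumps of different width.
psi = np.exp(-(x + 2.0) ** 2) + 0.5 * np.exp(-(x - 3.0) ** 2 / 8.0)
psi = psi / np.sqrt((np.abs(psi) ** 2).sum() * dx)   # normalize in x

psi_k = to_wavenumber(x, psi, k)
norm_k = (np.abs(psi_k) ** 2).sum() * dk   # should stay close to 1
print(norm_k)
```

The normalization carrying over from $x$ to $k$ is what lets $|\tilde\psi|^2$ be read as a probability density in the first place.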

Localized in space means spread in wavenumber

Two extreme examples make the qualitative tradeoff visible.

(a) A pure plane wave, $\psi(x) = e^{ik_0 x}$, extended over all $x$. Its Fourier transform is a Dirac delta at $k = k_0$. Spread in $x$: infinite. Spread in $k$: zero.

(b) A Dirac delta at $x_0$, $\psi(x) = \delta(x - x_0)$. Its Fourier transform, $e^{-ikx_0}/\sqrt{2\pi}$, has constant modulus for all $k$. Spread in $x$: zero. Spread in $k$: infinite.

Realistic wave functions sit between these two extremes. Squeezing the position support of $\psi$ requires combining many wavelengths, which is what spreads $\tilde\psi$ in $k$. Conversely, building a sharp peak at one wavenumber requires letting $\psi$ extend over many wavelengths in $x$.
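The "combining many wavelengths" claim can be tried directly. The sketch below adds equal-weight plane waves with wavenumbers in a band $[k_0 - \Delta, k_0 + \Delta]$ and measures the full width at half maximum of the resulting $|\psi|^2$. The function name, the band center `k0 = 5`, and the use of a half-maximum width (rather than a standard deviation) are illustrative assumptions.

```python
import numpy as np

def packet_fwhm(half_band, k0=5.0, n_waves=201):
    """Width at half maximum of |psi|^2 for an equal-weight sum of plane waves
    with wavenumbers spread uniformly over [k0 - half_band, k0 + half_band]."""
    x = np.linspace(-20.0, 20.0, 4001)
    k = np.linspace(k0 - half_band, k0 + half_band, n_waves)
    psi = np.exp(1j * np.outer(x, k)).mean(axis=1)   # superpose the waves
    density = np.abs(psi) ** 2
    inside = x[density >= 0.5 * density.max()]       # main lobe only
    return inside[-1] - inside[0]

for half_band in (0.5, 1.0, 2.0, 4.0):
    width = packet_fwhm(half_band)
    print(f"band half-width {half_band:3.1f}  packet width {width:5.2f}  "
          f"product {half_band * width:4.2f}")
```

Doubling the band halves the packet width, so the product stays roughly constant (near 2.8 for this particular width measure). The constant depends on how width is defined; the inverse scaling does not.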

This tradeoff is well known from contexts that have nothing to do with quantum mechanics, such as the bandwidth-duration product in signal transmission and the Abbe diffraction limit in microscopy.

Each of these is the same mathematical statement, applied to time and frequency, or to space and angle, instead of position and wavenumber.

The precise quantitative version, established with the constant $\tfrac{1}{2}$ by Hermann Weyl and others around 1928, is

$$\sigma_x \, \sigma_k \;\ge\; \tfrac{1}{2}.$$

Multiplying both sides by $\hbar$ and using $\sigma_p = \hbar \sigma_k$:

$$\sigma_x \, \sigma_p \;\ge\; \tfrac{\hbar}{2}.$$

This is the Heisenberg uncertainty relation. The single factor of $\hbar$ that distinguishes it from a textbook signal-processing inequality is just unit conversion.

See for yourself

The Gaussian wave packet has a particularly clean Fourier transform: another Gaussian. Take

$$\psi(x) \;=\; \frac{1}{(2\pi\sigma_x^2)^{1/4}} \, e^{-x^2/(4\sigma_x^2)} \, e^{i p_0 x / \hbar}.$$

This describes a localized blob centered at the origin with width $\sigma_x$, modulated by an oscillation of average momentum $p_0$. Its Fourier transform is

$$\tilde\psi(p) \;\propto\; e^{-(p - p_0)^2 / (4 \sigma_p^2)}, \qquad \sigma_p \;=\; \frac{\hbar}{2 \sigma_x},$$

so that the momentum density $|\tilde\psi(p)|^2 \propto e^{-(p - p_0)^2/(2\sigma_p^2)}$ has standard deviation $\sigma_p$, just as $|\psi(x)|^2$ has standard deviation $\sigma_x$.

The product $\sigma_x \sigma_p$ is locked at exactly $\hbar/2$. Among all wave functions, the Gaussian saturates the Heisenberg inequality at equality. Any other shape has a strictly larger product.

In the widget below, $\hbar = 1$. Slide $\sigma_x$ and watch the position packet narrow while the momentum distribution widens, with the product fixed. The carrier momentum $p_0$ shifts the momentum peak sideways without changing its width; in position space, increasing $p_0$ packs more oscillations into the same envelope. The result is the wave packet picture: a localized blob whose internal oscillations have de Broglie wavelength $h/p_0$.

[Interactive widget: sliders for $\sigma_x$ and $p_0$. Sample readout with $\hbar = 1$: $\sigma_x = 1.00$, $\sigma_p = 0.50$, $\sigma_x \sigma_p = 0.50 = \hbar/2$.]
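The widget's readout can also be reproduced offline. The following sketch builds the Gaussian packet for a few slider settings, transforms it to momentum space by direct summation, and measures both spreads from the resulting densities. The function `packet_spreads`, the grid sizes, and the chosen $(\sigma_x, p_0)$ pairs are assumptions for illustration, with $\hbar = 1$ as in the widget.

```python
import numpy as np

HBAR = 1.0  # same units as the widget

def packet_spreads(sigma_x, p0, n=1201):
    """Measure (sigma_x, sigma_p) numerically for the Gaussian wave packet."""
    x = np.linspace(-12 * sigma_x, 12 * sigma_x, n)
    dx = x[1] - x[0]
    psi = ((2 * np.pi * sigma_x**2) ** -0.25
           * np.exp(-x**2 / (4 * sigma_x**2))
           * np.exp(1j * p0 * x / HBAR))

    # Momentum grid centred on p0; HBAR / sigma_x is the natural momentum scale.
    p = p0 + np.linspace(-6.0, 6.0, 601) * HBAR / sigma_x
    dp = p[1] - p[0]
    kernel = np.exp(-1j * np.outer(p, x) / HBAR)
    psi_p = kernel @ psi * dx / np.sqrt(2 * np.pi * HBAR)

    def std(grid, amplitude, step):
        density = np.abs(amplitude) ** 2
        density = density / (density.sum() * step)
        mean = (grid * density).sum() * step
        return np.sqrt(((grid - mean) ** 2 * density).sum() * step)

    return std(x, psi, dx), std(p, psi_p, dp)

for sigma_x, p0 in [(0.5, 0.0), (1.0, 3.0), (2.0, -1.0)]:
    sx, sp = packet_spreads(sigma_x, p0)
    print(f"sigma_x={sx:.3f}  sigma_p={sp:.3f}  product={sx * sp:.4f}")
```

Changing `p0` moves the momentum peak but leaves both spreads alone, which is the behaviour described for the carrier-momentum slider.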

A two-line proof, with the scaffolding spelled out

The proof of $\sigma_x \sigma_p \ge \hbar/2$ uses three ingredients from the wave-function picture. Each looks heavy at first; each maps to something familiar.

(i) Operators. A measurement is represented by a linear map on wave functions. The position operator is multiplication by $x$:

$$(X \psi)(x) \;=\; x \, \psi(x).$$

The momentum operator is determined by demanding that a plane wave $e^{ipx/\hbar}$ be its eigenfunction with eigenvalue $p$. Differentiating $e^{ipx/\hbar}$ gives $(ip/\hbar) e^{ipx/\hbar}$, so the operator that produces $p$ as eigenvalue is $-i\hbar \, d/dx$:

$$(P \psi)(x) \;=\; -i\hbar \, \frac{d\psi}{dx}.$$

This is the same momentum that appears in $p = h/\lambda$.

(ii) Expectation values. For an operator $A$ and a normalized state $\psi$, the expected value of the corresponding measurement is

$$\langle A\rangle \;=\; \int \psi^*(x) \, (A\psi)(x) \, dx,$$

where $\psi^*$ is the complex conjugate of $\psi$. For $A = X$ this reduces to $\int x |\psi|^2 \, dx$, the average position we already defined. The standard deviation of the measurement satisfies $\sigma_A^2 = \langle A^2\rangle - \langle A\rangle^2$, the ordinary statistics formula.

For Hermitian operators (operators that produce real expectation values for any state), the standard deviation has another useful form: $\sigma_A^2 = \langle (A - \langle A\rangle)^2\rangle$. Both $X$ and $P$ are Hermitian.

(iii) Commutators. Two operators $A$ and $B$ generally do not commute, in the sense that $AB$ and $BA$ act differently on a state. Their commutator

$$[A, B] \;=\; A B - B A$$

is itself an operator, measuring how much the order of application matters. For position and momentum, a one-line product-rule calculation gives

$$[X, P]\psi(x) \;=\; x\bigl(-i\hbar \, \psi'(x)\bigr) \;-\; \bigl(-i\hbar \, (x\psi(x))'\bigr) \;=\; i\hbar \, \psi(x).$$

So $[X, P] = i\hbar$ as an operator identity, holding on every state $\psi$. This is the fact that ultimately produces the $\hbar/2$ on the right of Heisenberg.
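The commutator identity is easy to test numerically. In the sketch below, $X$ and $P$ are applied to a sampled wave function, with the derivative replaced by a central finite difference; the grid, the test state, and the function names are illustrative assumptions, and the agreement is only as good as the finite-difference approximation.

```python
import numpy as np

HBAR = 1.0
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

def X(f):
    """Position operator: multiply by x."""
    return x * f

def P(f):
    """Momentum operator: -i hbar d/dx, via finite differences."""
    return -1j * HBAR * np.gradient(f, dx)

# An arbitrary smooth, localized, complex test state.
psi = np.exp(-x**2 / 2) * (1 + 0.3 * x) * np.exp(2j * x)

commutator = X(P(psi)) - P(X(psi))
max_err = np.max(np.abs(commutator - 1j * HBAR * psi))
print(max_err)   # small, and shrinks as the grid is refined
```

Swapping in any other smooth, decaying `psi` should give the same result, which is what "operator identity" means in practice.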

With these ingredients, the proof is two lines. Define the centered operators $A' = A - \langle A\rangle$ and $B' = B - \langle B\rangle$ (these are also Hermitian). Apply the Cauchy-Schwarz inequality, which says that for any two square-integrable functions $\varphi$ and $\chi$,

$$\biggl|\int \varphi^* \chi \, dx\biggr|^2 \;\le\; \biggl(\int |\varphi|^2 \, dx\biggr) \biggl(\int |\chi|^2 \, dx\biggr).$$

Set $\varphi = A'\psi$ and $\chi = B'\psi$. The right side is $\langle A'^2\rangle \langle B'^2\rangle = \sigma_A^2 \sigma_B^2$. The left side is $|\langle A' B'\rangle|^2$, where Hermiticity of $A'$ is what lets us move it from $\varphi^*$ onto $\chi$. So

$$\sigma_A^2 \, \sigma_B^2 \;\ge\; |\langle A' B'\rangle|^2.$$

Now decompose

$$A' B' \;=\; \tfrac{1}{2}\bigl(A'B' + B'A'\bigr) \;+\; \tfrac{1}{2}\bigl(A'B' - B'A'\bigr) \;=\; \tfrac{1}{2}\{A',B'\} \;+\; \tfrac{1}{2}[A',B'].$$

The first piece (the anticommutator, $\{A',B'\}$) is Hermitian, so its expectation is real. The second piece (the commutator) is anti-Hermitian, so its expectation is purely imaginary. Hence

$$|\langle A' B'\rangle|^2 \;\ge\; \biggl|\tfrac{1}{2}\langle [A', B']\rangle\biggr|^2 \;=\; \tfrac{1}{4}|\langle [A, B]\rangle|^2,$$

(using $[A', B'] = [A, B]$, since adding constants doesn’t change the commutator). Putting it together:

$$\sigma_A \, \sigma_B \;\ge\; \tfrac{1}{2} \, \big|\langle [A, B]\rangle\big|.$$

This is the Robertson inequality. Plugging in $[X, P] = i\hbar$ gives $|\langle [X, P]\rangle| = \hbar$, hence

$$\sigma_x \, \sigma_p \;\ge\; \tfrac{\hbar}{2}.$$

The right-hand side is the constant $\hbar/2$ for every state $\psi$, because the commutator $[X, P]$ is itself a state-independent constant. That state-independence is the punchline: no matter how cleverly you prepare a particle, this product cannot drop below $\hbar/2$.
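Both sides of the Robertson inequality can be evaluated for a concrete state. The sketch below uses the same finite-difference operators as a stand-in for $X$ and $P$ and a two-bump state chosen arbitrarily for illustration; the names `inner` and `sigma` are hypothetical helpers, and $\langle A^2\rangle$ is computed as $\int |A\psi|^2\,dx$, which relies on $A$ being Hermitian.

```python
import numpy as np

HBAR = 1.0
x = np.linspace(-15.0, 15.0, 6001)
dx = x[1] - x[0]

def X(f):
    return x * f

def P(f):
    return -1j * HBAR * np.gradient(f, dx)

def inner(f, g):
    """Riemann-sum approximation of the integral of conj(f) * g."""
    return np.sum(np.conj(f) * g) * dx

# A non-Gaussian state: two bumps, one carrying extra momentum.
psi = np.exp(-(x + 2.0) ** 2 / 2) + 0.6 * np.exp(-(x - 2.0) ** 2 / 2) * np.exp(1.5j * x)
psi = psi / np.sqrt(inner(psi, psi).real)

def sigma(A):
    mean = inner(psi, A(psi)).real
    mean_sq = inner(A(psi), A(psi)).real   # <A^2> = ||A psi||^2 for Hermitian A
    return np.sqrt(mean_sq - mean ** 2)

lhs = sigma(X) * sigma(P)
rhs = 0.5 * abs(inner(psi, X(P(psi)) - P(X(psi))))
print(f"sigma_x * sigma_p = {lhs:.4f}   (1/2)|<[X,P]>| = {rhs:.4f}")
```

The right-hand side comes out at $\hbar/2$ regardless of the state, while the left-hand side depends on the state and sits well above it here.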

Why the Gaussian saturates

Cauchy-Schwarz becomes an equality precisely when one vector is a scalar multiple of the other. So the uncertainty bound is hit with equality only for states satisfying

$$P'\psi \;=\; i\lambda \, X'\psi$$

for some real $\lambda$. (The factor of $i$ comes from a second condition required for full saturation: the symmetric piece $\langle\{X', P'\}\rangle$ must also vanish.) With $P = -i\hbar \, d/dx$, this is the first-order linear ODE

$$-i\hbar \, \psi'(x) \;-\; \langle p\rangle \, \psi(x) \;=\; i\lambda \, (x - \langle x\rangle) \, \psi(x).$$

Separating variables and integrating, $\psi(x)$ is a Gaussian envelope multiplied by a complex exponential. The minimum-uncertainty wave packet is exactly the Gaussian wave packet shown in the widget. That is why the readout never strays from $\sigma_x \sigma_p = 0.5$ no matter where you put the sliders.
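For completeness, here is the integration step written out, as a sketch using the shorthand $x_0 = \langle x\rangle$ and $p_0 = \langle p\rangle$ (notation introduced here for brevity). Dividing the ODE by $-i\hbar\,\psi$ and integrating once gives

$$\frac{\psi'(x)}{\psi(x)} \;=\; \frac{i p_0}{\hbar} \;-\; \frac{\lambda}{\hbar}\,(x - x_0) \quad\Longrightarrow\quad \psi(x) \;=\; C \exp\!\Bigl(\frac{i p_0 x}{\hbar} \;-\; \frac{\lambda\,(x - x_0)^2}{2\hbar}\Bigr).$$

The solution is normalizable only for $\lambda > 0$. Matching the envelope to $e^{-(x - x_0)^2/(4\sigma_x^2)}$ identifies $\lambda = \hbar/(2\sigma_x^2)$, so $\lambda$ simply sets the width of the packet.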

The Gaussian’s recurring role across probability theory and harmonic analysis is not a coincidence: it is the simplest function that the Fourier transform maps to itself, and the limit law of the central limit theorem.

A square pulse, a Lorentzian, a triangular bump: all obey $\sigma_x \sigma_p > \hbar/2$ strictly. The Gaussian sits exactly on the boundary.
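A quick numerical comparison makes the gap visible. The sketch below, with $\hbar = 1$, computes $\sigma_x \sigma_k$ for a Gaussian, a triangular bump, and a Lorentzian amplitude. It assumes real-valued $\psi$, for which $\langle k\rangle = 0$ and $\sigma_k^2 = \int |\psi'|^2\,dx$; the function name and the specific shapes and widths are illustrative choices. The square pulse is left out because its sharp edges make $\sigma_k$ diverge.

```python
import numpy as np

def spread_product(x, psi):
    """sigma_x * sigma_k for a real wave function on a uniform grid (hbar = 1)."""
    dx = x[1] - x[0]
    norm = np.sum(psi ** 2) * dx
    mean = np.sum(x * psi ** 2) * dx / norm
    var_x = np.sum((x - mean) ** 2 * psi ** 2) * dx / norm
    slope = np.diff(psi) / dx                  # derivative between grid points
    var_k = np.sum(slope ** 2) * dx / norm     # <k^2>, with <k> = 0 for real psi
    return np.sqrt(var_x * var_k)

x = np.linspace(-1000.0, 1000.0, 200001)       # wide grid for the Lorentzian tails
shapes = {
    "gaussian": np.exp(-x ** 2 / 4),
    "triangle": np.clip(1 - np.abs(x), 0.0, None),
    "lorentzian": 1.0 / (1.0 + x ** 2),
}
products = {name: spread_product(x, psi) for name, psi in shapes.items()}
for name, value in products.items():
    print(f"{name:10s}  sigma_x * sigma_k = {value:.4f}")
```

For these particular shapes the exact values are $1/2$, $\sqrt{0.3} \approx 0.548$, and $1/\sqrt{2} \approx 0.707$; the numerical results should land close to them.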

Implications

Some physical consequences whose status as Heisenberg corollaries is sometimes obscured:

  1. Atomic stability. Confining an electron to a region of size $r$ forces a momentum spread of order $\hbar/r$ by the inequality, hence a kinetic energy of order $\hbar^2/(2 m r^2)$. Coulomb attraction contributes potential energy $-e^2/r$ (in Gaussian units). Minimizing the sum

    $$E(r) \;=\; \frac{\hbar^2}{2 m r^2} \;-\; \frac{e^2}{r}$$

    in $r$ gives the Bohr radius $a_0 = \hbar^2/(m e^2) \approx 0.53\,$Å at energy $\approx -13.6\,$eV. The hydrogen atom’s ground state is the size it is because tightly confining the electron costs more kinetic energy than the Coulomb attraction can pay for.

  2. Single-slit diffraction. A photon passing through a slit of width $a$ has its transverse position localized to $\sigma_x \sim a$. Heisenberg then forces a transverse momentum spread $\sigma_p \sim \hbar/a$. The angular spread of the resulting beam is $\sigma_p / p \sim \hbar/(a p)$, which (using $p = h/\lambda$) is of order $\lambda/a$: the standard diffraction scale.

  3. Time-frequency duality. The same inequality applied to time and angular frequency (a Fourier-conjugate pair, with $\hbar = 1$ since they are dimensionally inverse already) says a musical note of duration $T$ has its pitch defined to about $1/T$. Audio codecs like MP3 and AAC face this tradeoff in their transform window length (the MDCT block size): short windows preserve drum attacks but blur frequency, long windows resolve frequencies cleanly but smear transients into pre-echo artifacts. The codec switches between window sizes in response to the signal for exactly this reason.

  4. Diffraction limit in microscopy. Resolving features smaller than $\lambda/(2\,\mathrm{NA})$ requires light with a correspondingly large transverse momentum spread, which is what the numerical aperture controls. This is Abbe’s diffraction limit of classical microscopy, and the obstacle that super-resolution methods like STED and PALM had to engineer around: work that won the 2014 Nobel Prize in Chemistry.
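The atomic-stability estimate in item 1 can be checked with real numbers. The sketch below minimizes $E(r)$ on a grid, written in SI units, so the Coulomb term becomes $-e^2/(4\pi\varepsilon_0 r)$ rather than the Gaussian-units $-e^2/r$ used above. The constants are standard CODATA values typed in by hand, and the grid search is an illustrative shortcut for the one-line calculus.

```python
import numpy as np

HBAR = 1.054571817e-34       # J s
M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m

def energy(r):
    """Confinement kinetic energy plus Coulomb potential energy, in joules."""
    kinetic = HBAR ** 2 / (2 * M_E * r ** 2)
    potential = -E_CHARGE ** 2 / (4 * np.pi * EPS0 * r)
    return kinetic + potential

r = np.linspace(0.1e-10, 3.0e-10, 29001)   # 0.1 to 3 angstrom
E = energy(r)
i_min = np.argmin(E)
a0 = r[i_min]
E_min_eV = E[i_min] / E_CHARGE             # joules to electron-volts

print(f"radius at minimum: {a0 * 1e10:.3f} angstrom")
print(f"energy at minimum: {E_min_eV:.2f} eV")
```

That an order-of-magnitude argument reproduces the Bohr radius and ground-state energy this closely is partly luck in the choice of numerical factors; the scaling with $\hbar$, $m$, and $e$ is the robust part.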

A common pitfall

Heisenberg’s gamma-ray microscope thought experiment is a consistency check: it shows that any plausible measurement scheme, taken seriously, bumps into the same bound. It is not the origin of the bound. The wave function is uncertain even when nobody is looking, because $\psi(x)$ and $\tilde\psi(p)$ are two views of the same vector, and a basis change cannot turn a function spread out in one variable into one localized in the other.

The disturbance interpretation is fine pedagogy as long as it is taken as illustration, not as derivation. The inequality lives in the structure of the wave function itself.
