Heisenberg uncertainty as Fourier duality

The textbook story for the uncertainty principle is that quantum measurements come with a built-in disturbance: pin down a particle’s position with a photon, and the photon kicks the particle’s momentum. This is Heisenberg’s gamma-ray microscope. The picture isn’t wrong, but it puts the principle on the wrong footing. It makes uncertainty sound like a constraint imposed by clumsy detectors, when in fact it is a property of the wave function itself.

The goal of this post is to derive the uncertainty relation from scratch, assuming only what an attentive student would already know after meeting the photoelectric effect and de Broglie’s matter waves. We’ll build up the wave function, define what its spread is, see why position and momentum give two views of the same wave, and finally prove

\sigma_x \, \sigma_p \;\ge\; \tfrac{\hbar}{2}.

Along the way the constant $\hbar$ will appear in exactly one place: as a unit conversion. Strip it out and the inequality reduces to a theorem about Fourier transforms that engineers, musicians, and microscopists already work with under different names. The bandwidth-duration product that bounds how fast a radio channel can transmit, and the Abbe diffraction limit that bounds how small a feature an optical microscope can resolve, are both special cases.

Quantum particles are waves

In 1905 Einstein explained the photoelectric effect by saying that light arrives in chunks of energy $E = hf$ . A wave (the electromagnetic field) carries particle-like packets (photons). Twenty years later, Louis de Broglie suggested the symmetric statement: matter, conventionally thought of as particles, also has a wave description, with wavelength

\lambda \;=\; \frac{h}{p},

where $p$ is the particle’s momentum. This has been confirmed in many experiments, from electron diffraction off a nickel crystal (Davisson and Germer, 1927) to interference fringes for buckyballs (Zeilinger et al., 1999) and, more recently, for molecules of more than $25{,}000$ atomic mass units.

We don’t read this as “an electron is sometimes a particle and sometimes a wave.” We read it as: the wave is the fundamental object, and what we call the particle is just the localized point on a detector where a measurement of the wave is recorded.

In one space dimension, the wave function is a complex-valued function $\psi(x)$ defined for every $x \in \mathbb R$ . The Born rule assigns it a physical meaning:

|\psi(x)|^2 \;=\; \text{probability density of finding the particle at } x.

So $\int_a^b |\psi(x)|^2 \, dx$ is the probability that a position measurement returns a value between $a$ and $b$ . We always assume $\psi$ is normalized, $\int |\psi(x)|^2 \, dx = 1$ , so that the total probability is $1$ .

The wave function $\psi$ takes a complex value at every $x$ . The probability density $|\psi|^2$ depends only on its modulus, not on its phase, but the phase still matters: it controls how different parts of $\psi$ interfere with each other when we evolve the wave or change basis.

The spread of a distribution

To talk precisely about how localized a particle is, we use the standard deviation of its position distribution:

\langle x\rangle \;=\; \int x \, |\psi(x)|^2 \, dx, \qquad \sigma_x^2 \;=\; \int (x - \langle x\rangle)^2 \, |\psi(x)|^2 \, dx.

If you’ve seen mean and standard deviation in any statistics course, this is the same definition. $\langle x\rangle$ is the average position you’d see if you prepared this state many times and measured each time; $\sigma_x$ is the typical scatter around that average. A small $\sigma_x$ means a tightly localized particle; a large $\sigma_x$ means a spread-out one.
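To make these definitions concrete, here is a minimal numerical sketch. The Gaussian test state, its center `x0`, its width `s`, and the grid are all illustrative assumptions, not anything fixed by the text; the integrals are approximated by plain Riemann sums.

```python
import numpy as np

# Uniform grid, wide enough that the packet is negligible at the edges.
x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]

# Illustrative state (an assumption): Gaussian centered at x0 with width s.
x0, s = 1.5, 0.8
psi = (2 * np.pi * s**2) ** -0.25 * np.exp(-(x - x0) ** 2 / (4 * s**2))

density = np.abs(psi) ** 2                  # Born rule: |psi(x)|^2
norm = np.sum(density) * dx                 # total probability, should be close to 1
mean_x = np.sum(x * density) * dx           # <x>
var_x = np.sum((x - mean_x) ** 2 * density) * dx
sigma_x = np.sqrt(var_x)

# Probability that a position measurement lands between a and b.
a, b = x0 - s, x0 + s
inside = (x >= a) & (x <= b)
prob_ab = np.sum(density[inside]) * dx

print(f"norm = {norm:.6f}, <x> = {mean_x:.4f}, sigma_x = {sigma_x:.4f}")
print(f"P({a:.2f} <= x <= {b:.2f}) = {prob_ab:.4f}")
```

The recovered mean and spread should match the `x0` and `s` that went into the state, and the interval one standard deviation either side of the mean should hold roughly two thirds of the probability.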

This $\sigma_x$ is the uncertainty in $x$ that ends up on the left of Heisenberg’s inequality. The uncertainty in $p$ will be defined the same way, once we figure out what the momentum probability density is.

The momentum picture is the same wave, looked at differently

Here is the central observation. Consider a state with one perfectly definite momentum $p$ . By de Broglie this corresponds to a single wavelength $\lambda = h/p$ , i.e., a single wavenumber $k = 2\pi/\lambda = p/\hbar$ , where $\hbar = h/(2\pi)$ is the reduced Planck constant. The wave function is a pure complex exponential:

\psi_p(x) \;=\; \frac{1}{\sqrt{2\pi\hbar}} \, e^{ipx/\hbar}.

Notice that $|\psi_p(x)|^2 = 1/(2\pi\hbar)$ is constant in $x$ . A state with definite momentum has uniform position probability over all of space, i.e., is completely delocalized. (This wave function is not strictly normalizable; it is an idealized limit.)

Most physical states are not pure plane waves. They are superpositions of plane waves with different momenta:

\psi(x) \;=\; \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \tilde\psi(p) \, e^{ipx/\hbar} \, dp.

Read this as: build $\psi$ by adding up plane waves of every momentum $p$ , weighted by complex amplitudes $\tilde\psi(p)$ . The function $\tilde\psi(p)$ tells you how much of each momentum is in the mix.

By the Fourier inversion theorem, this relation is invertible:

\tilde\psi(p) \;=\; \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \psi(x) \, e^{-ipx/\hbar} \, dx.

So $\psi(x)$ and $\tilde\psi(p)$ are determined by each other. They are the same physical state written in two different bases: position and momentum.

The Born rule applies in both bases. $|\tilde\psi(p)|^2$ is the probability density for a momentum measurement returning $p$ . Its standard deviation,

\langle p\rangle \;=\; \int p\, |\tilde\psi(p)|^2 \, dp, \qquad \sigma_p^2 \;=\; \int (p - \langle p\rangle)^2 \, |\tilde\psi(p)|^2 \, dp,

is the uncertainty in $p$ .

The relation between $\psi(x)$ and $\tilde\psi(p)$ above is the standard Fourier transform of mathematics, with one convention-dependent rescaling. Setting $k = p/\hbar$ removes $\hbar$ from the integral entirely:

\tilde\psi(k) \;=\; \frac{1}{\sqrt{2\pi}} \int \psi(x) \, e^{-ikx} \, dx.

Whatever inequality holds between $\sigma_x$ and $\sigma_k$ for arbitrary square-integrable $\psi$ is a fact of harmonic analysis, not a fact of quantum mechanics. We translate to the quantum statement at the end by writing $\sigma_p = \hbar \sigma_k$ .
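As a rough check that this transform really is just a change of basis, the sketch below evaluates the integral by direct summation and confirms that total probability is the same in $x$ and in $k$ . The helper name `fourier_k`, the grids, and the test state $e^{-|x|}$ are assumptions chosen for illustration.

```python
import numpy as np

def fourier_k(psi, x, k):
    """Riemann-sum approximation of (2*pi)**-0.5 * integral psi(x) exp(-i k x) dx."""
    dx = x[1] - x[0]
    kernel = np.exp(-1j * np.outer(k, x))   # one row per wavenumber
    return kernel @ psi * dx / np.sqrt(2 * np.pi)

x = np.linspace(-20.0, 20.0, 2001)
k = np.linspace(-15.0, 15.0, 601)
dx, dk = x[1] - x[0], k[1] - k[0]

# Illustrative state (an assumption): psi(x) = exp(-|x|), which is already normalized.
psi = np.exp(-np.abs(x))
psi_k = fourier_k(psi, x, k)

# Closed form for this particular psi, used only as a cross-check.
exact = np.sqrt(2 / np.pi) / (1 + k**2)
max_err = np.max(np.abs(psi_k - exact))

norm_x = np.sum(np.abs(psi) ** 2) * dx
norm_k = np.sum(np.abs(psi_k) ** 2) * dk    # tails beyond |k| = 15 are cut off

print(f"norm in x = {norm_x:.4f}, norm in k = {norm_k:.4f}, max error = {max_err:.2e}")
```

No $\hbar$ appears anywhere in the code, which is the point of the paragraph above: the transform pair is pure harmonic analysis.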

Localized in space means spread in wavenumber

Two extreme examples make the qualitative tradeoff visible.

(a) A pure plane wave, $\psi(x) = e^{ik_0 x}$ extended over all $x$ . Its Fourier transform is a Dirac delta at $k = k_0$ . Spread in $x$ : infinite. Spread in $k$ : zero.

(b) A Dirac delta at $x_0$ , $\psi(x) = \delta(x - x_0)$ . Its Fourier transform is the constant $e^{-ikx_0}/\sqrt{2\pi}$ . Spread in $x$ : zero. Spread in $k$ : infinite.

Realistic wave functions sit between these two extremes. Squeezing the position support of $\psi$ requires combining many wavelengths, which is what spreads $\tilde\psi$ in $k$ . Conversely, building a sharp peak at one wavenumber requires letting $\psi$ extend over many wavelengths in $x$ .

This tradeoff is well known from contexts that have nothing to do with quantum mechanics:

A radio engineer designing a transmitter knows that a pulse of duration $T$ contains frequencies spread over at least $\sim 1/T$ . This is the constraint behind channel bandwidth allocation: faster Morse keying or higher digital data rates need proportionally wider bands. The 20 MHz channel widths of OFDM-based Wi-Fi (802.11a/g) are sized for the aggregate symbol rate their QAM subcarriers actually carry.
A pianist knows that hitting a key for $50$ ms gives a percussive thud with no clean pitch, while a sustained tone has a well-defined pitch. The difference is visible on a spectrogram: vertical streaks at the attack (transients are broad in frequency, narrow in time) and clean horizontal lines once the note settles (narrow in frequency, extended in time).
An optical engineer knows that a small aperture diffracts a broad spot. Confining a beam in space spreads it in angle. This is the Airy disk seen through a telescope and the diffraction-limited spot size of a laser focused through a lens.

Each of these is the same mathematical statement, applied to time and frequency, or to space and angle, instead of position and wavenumber.
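The time-frequency version is easy to try numerically. The sketch below, with an assumed sample rate, carrier frequency, and set of burst durations, gates a pure tone to duration $T$ and measures the full width at half maximum of its power spectrum. For a rectangular burst that width should come out at roughly $0.9/T$ , so the product of width and duration stays about constant.

```python
import numpy as np

fs = 10_000.0        # sample rate in Hz (assumed for the demo)
n = 2 ** 17          # total samples; frequency resolution is fs / n
f0 = 1_000.0         # carrier frequency in Hz (assumed)
t = np.arange(n) / fs
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
df = freqs[1] - freqs[0]

def spectral_fwhm(duration):
    """Full width at half maximum (Hz) of the power spectrum of a tone burst."""
    burst = np.where(t < duration, np.cos(2 * np.pi * f0 * t), 0.0)
    power = np.abs(np.fft.rfft(burst)) ** 2
    # Only the main lobe around f0 exceeds half the peak, so counting bins gives its width.
    return np.count_nonzero(power >= 0.5 * power.max()) * df

durations = [0.05, 0.1, 0.2]
products = [spectral_fwhm(T) * T for T in durations]
for T, prod in zip(durations, products):
    print(f"T = {T:5.2f} s   FWHM * T = {prod:.3f}")
```

Halving the burst length doubles the spectral width. The full width at half maximum is used here instead of a standard deviation because the sharp edges of a rectangular burst give its spectrum slowly decaying tails.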

The precise quantitative version, established with the constant $\tfrac{1}{2}$ by Hermann Weyl and others around 1928, is

\sigma_x \, \sigma_k \;\ge\; \tfrac{1}{2}.

Multiplying both sides by $\hbar$ and using $\sigma_p = \hbar \sigma_k$ :

\sigma_x \, \sigma_p \;\ge\; \tfrac{\hbar}{2}.

This is the Heisenberg uncertainty relation. The single factor of $\hbar$ that distinguishes it from a textbook signal-processing inequality is just unit conversion.

See for yourself

The Gaussian wave packet has a particularly clean Fourier transform: another Gaussian. Take

\psi(x) \;=\; \frac{1}{(2\pi\sigma_x^2)^{1/4}} \, e^{-x^2/(4\sigma_x^2)} \, e^{i p_0 x / \hbar}.

This describes a localized blob centered at the origin with width $\sigma_x$ , modulated by an oscillation of average momentum $p_0$ . Its Fourier transform is

\tilde\psi(p) \;\propto\; e^{-(p - p_0)^2 / (4 \sigma_p^2)}, \qquad \sigma_p \;=\; \frac{\hbar}{2 \sigma_x}.

The product $\sigma_x \sigma_p$ is locked at exactly $\hbar/2$ . Gaussians are the only wave functions that saturate the Heisenberg inequality; any other shape has a strictly larger product.

In the widget below, $\hbar = 1$ . Slide $\sigma_x$ and watch the position packet narrow while the momentum distribution widens, with the product fixed. The carrier momentum $p_0$ shifts the momentum peak sideways without changing its width; in position space, increasing $p_0$ packs more oscillations into the same envelope. The result is the wave packet picture: a localized blob whose internal oscillations have de Broglie wavelength $h/p_0$ .
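The same experiment can be run offline. The following sketch sets $\hbar = 1$ , builds the Gaussian packet on a grid, transforms it by direct summation, and measures both spreads numerically. The function name `packet_spreads`, the grid sizes, and the sample values of $\sigma_x$ and $p_0$ are assumptions for illustration.

```python
import numpy as np

hbar = 1.0  # same convention as the text

def packet_spreads(sigma_x, p0, n_x=2001, n_p=801):
    """Return (sigma_x, <p>, sigma_p) measured numerically for the Gaussian packet."""
    x = np.linspace(-12 * sigma_x, 12 * sigma_x, n_x)
    # Momentum window centered on p0; the half-width 6*hbar/sigma_x is a generous guess.
    p = p0 + np.linspace(-6 * hbar / sigma_x, 6 * hbar / sigma_x, n_p)
    dx, dp = x[1] - x[0], p[1] - p[0]

    psi = (2 * np.pi * sigma_x**2) ** -0.25 * np.exp(
        -x**2 / (4 * sigma_x**2) + 1j * p0 * x / hbar
    )
    # psi~(p) = (2 pi hbar)^(-1/2) * integral psi(x) exp(-i p x / hbar) dx
    psi_p = np.exp(-1j * np.outer(p, x) / hbar) @ psi * dx / np.sqrt(2 * np.pi * hbar)

    rho_x, rho_p = np.abs(psi) ** 2, np.abs(psi_p) ** 2
    mean_x = np.sum(x * rho_x) * dx
    mean_p = np.sum(p * rho_p) * dp
    sx = np.sqrt(np.sum((x - mean_x) ** 2 * rho_x) * dx)
    sp = np.sqrt(np.sum((p - mean_p) ** 2 * rho_p) * dp)
    return sx, mean_p, sp

for sigma_x, p0 in [(0.5, 0.0), (1.0, 3.0), (2.0, -5.0)]:
    sx, mean_p, sp = packet_spreads(sigma_x, p0)
    print(f"sigma_x = {sx:.4f}  <p> = {mean_p:+.4f}  sigma_p = {sp:.4f}  product = {sx * sp:.4f}")
```

Changing `p0` should move the momentum peak without changing `sp`, and changing `sigma_x` should rescale `sp` inversely, leaving the product where it was.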

A two-line proof, with the scaffolding spelled out

The proof of $\sigma_x \sigma_p \ge \hbar/2$ uses three ingredients from the wave-function picture. Each looks heavy at first; each maps to something familiar.

(i) Operators. A measurable quantity is represented by a linear operator acting on wave functions. The position operator is multiplication by $x$ :

(X \psi)(x) \;=\; x \, \psi(x).

The momentum operator is determined by demanding that a plane wave $e^{ipx/\hbar}$ should be its eigenfunction with eigenvalue $p$ . Differentiating $e^{ipx/\hbar}$ gives $(ip/\hbar) e^{ipx/\hbar}$ , so the operator that produces $p$ as eigenvalue is $-i\hbar \, d/dx$ :

(P \psi)(x) \;=\; -i\hbar \, \frac{d\psi}{dx}.

This is the same momentum that appears in $p = h/\lambda$ .

(ii) Expectation values. For an operator $A$ and a normalized state $\psi$ , the expected value of the corresponding measurement is

\langle A\rangle \;=\; \int \psi^*(x) \, (A\psi)(x) \, dx,

where $\psi^*$ is the complex conjugate of $\psi$ . For $A = X$ this reduces to $\int x |\psi|^2 \, dx$ , the average position we already defined. The standard deviation of the measurement satisfies $\sigma_A^2 = \langle A^2\rangle - \langle A\rangle^2$ , the ordinary statistics formula.

For Hermitian operators (operators that produce real expectation values for any state), the standard deviation has another useful form: $\sigma_A^2 = \langle (A - \langle A\rangle)^2\rangle$ . Both $X$ and $P$ are Hermitian.
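These formulas can be evaluated entirely in position space, without ever forming $\tilde\psi(p)$ . The sketch below does so for an assumed Gaussian packet (width `s`, carrier momentum `p0`, $\hbar = 1$ ), approximating the derivative in $P$ with NumPy's central differences; the helper `expect` is a name introduced here for illustration.

```python
import numpy as np

hbar = 1.0
x = np.linspace(-10.0, 10.0, 8001)
dx = x[1] - x[0]

# Illustrative state (an assumption): Gaussian packet with width s and carrier momentum p0.
s, p0 = 1.0, 2.0
psi = (2 * np.pi * s**2) ** -0.25 * np.exp(-x**2 / (4 * s**2) + 1j * p0 * x / hbar)

def expect(op_psi):
    """<A> = integral psi* (A psi) dx, approximated by a Riemann sum."""
    return np.sum(np.conj(psi) * op_psi) * dx

P_psi = -1j * hbar * np.gradient(psi, dx)   # (P psi)(x) via central differences
mean_x = expect(x * psi).real
mean_x2 = expect(x**2 * psi).real
mean_p = expect(P_psi).real
# <P^2> written as integral |P psi|^2 dx, which equals <psi|P^2|psi> because P is Hermitian.
mean_p2 = np.sum(np.abs(P_psi) ** 2) * dx

sigma_x = np.sqrt(mean_x2 - mean_x**2)
sigma_p = np.sqrt(mean_p2 - mean_p**2)
print(f"<p> = {mean_p:.4f}, sigma_x = {sigma_x:.4f}, sigma_p = {sigma_p:.4f}")
```

The operator route should return the carrier momentum for $\langle p\rangle$ and a momentum spread consistent with $\hbar/(2\sigma_x)$ , up to finite-difference error.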

(iii) Commutators. Two operators $A$ and $B$ generally do not commute, in the sense that $AB$ and $BA$ act differently on a state. Their commutator

[A, B] \;=\; A B - B A

is itself an operator, measuring how much the order of application matters. For position and momentum, a one-line product-rule calculation gives

[X, P]\psi(x) \;=\; x\bigl(-i\hbar \, \psi'(x)\bigr) \;-\; \bigl(-i\hbar \, (x\psi(x))'\bigr) \;=\; i\hbar \, \psi(x).

So $[X, P] = i\hbar$ as an operator identity, holding on every state $\psi$ . This is the fact that ultimately produces the $\hbar/2$ on the right of Heisenberg.
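A quick numerical sanity check of this identity, as a sketch: represent $X$ and $P$ as functions acting on a sampled wave function (with $\hbar = 1$ and central differences standing in for the derivative) and apply them in both orders. The test function is an arbitrary smooth, localized choice, not one taken from the text.

```python
import numpy as np

hbar = 1.0
x = np.linspace(-10.0, 10.0, 8001)
dx = x[1] - x[0]

def X(f):
    """Position operator: multiply by x."""
    return x * f

def P(f):
    """Momentum operator: -i hbar d/dx, via central differences."""
    return -1j * hbar * np.gradient(f, dx)

# Arbitrary smooth, localized test function (an assumption).
psi = (1.0 + 0.5 * x) * np.exp(-x**2 / 2) * np.exp(1.3j * x)

commutator = X(P(psi)) - P(X(psi))
residual = np.max(np.abs(commutator - 1j * hbar * psi))
print(f"max |[X,P]psi - i*hbar*psi| = {residual:.2e}")
```

The residual is set by the grid spacing, not by the choice of `psi`, which is what "operator identity" means in practice.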

With these ingredients, the proof is two lines. Define the centered operators $A' = A - \langle A\rangle$ and $B' = B - \langle B\rangle$ (these are also Hermitian). Apply the Cauchy-Schwarz inequality, which says that for any two square-integrable functions $\varphi$ and $\chi$ ,

\biggl|\int \varphi^* \chi \, dx\biggr|^2 \;\le\; \biggl(\int |\varphi|^2 \, dx\biggr) \biggl(\int |\chi|^2 \, dx\biggr).

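Set $\varphi = A'\psi$ and $\chi = B'\psi$ . Because $A'$ and $B'$ are Hermitian, each can be moved from one factor of the integrand to the other, so the right side is $\langle A'^2\rangle \langle B'^2\rangle = \sigma_A^2 \sigma_B^2$ and the left side is $|\langle A' B'\rangle|^2$ . So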

\sigma_A^2 \, \sigma_B^2 \;\ge\; |\langle A' B'\rangle|^2.

Now decompose

A' B' \;=\; \tfrac{1}{2}\bigl(A'B' + B'A'\bigr) \;+\; \tfrac{1}{2}\bigl(A'B' - B'A'\bigr) \;=\; \tfrac{1}{2}\{A',B'\} \;+\; \tfrac{1}{2}[A',B'].

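The first piece (the anticommutator, $\{A',B'\}$ ) is Hermitian, so its expectation is real. The second piece (the commutator) is anti-Hermitian, so its expectation is purely imaginary. The squared modulus of $\langle A'B'\rangle$ is therefore the sum of the squares of these two parts, and dropping the real part can only lower it. Hence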

|\langle A' B'\rangle|^2 \;\ge\; \biggl|\tfrac{1}{2}\langle [A', B']\rangle\biggr|^2 \;=\; \tfrac{1}{4}|\langle [A, B]\rangle|^2,

using $[A', B'] = [A, B]$ , since subtracting constants doesn’t change the commutator. Combining this with the Cauchy-Schwarz bound and taking square roots:

\sigma_A \, \sigma_B \;\ge\; \tfrac{1}{2} \, \big|\langle [A, B]\rangle\big|.

This is the Robertson inequality. Plugging in $[X, P] = i\hbar$ gives $|\langle [X, P]\rangle| = \hbar$ , hence

\sigma_x \, \sigma_p \;\ge\; \tfrac{\hbar}{2}.

The right-hand side is the constant $\hbar/2$ for every state $\psi$ , because the commutator $[X, P]$ is itself a state-independent constant. That state-independence is the punchline: no matter how cleverly you prepare a particle, this product cannot drop below $\hbar/2$ .

Why the Gaussian saturates

Cauchy-Schwarz becomes an equality precisely when one vector is a scalar multiple of the other. So the uncertainty bound is hit with equality only for states satisfying

P'\psi \;=\; i\lambda \, X'\psi

for some real $\lambda$ . (The factor of $i$ comes from a second condition required for full saturation: the symmetric piece $\langle\{X', P'\}\rangle$ must also vanish.) With $P = -i\hbar \, d/dx$ , this is the first-order linear ODE

-i\hbar \, \psi'(x) \;-\; \langle p\rangle \, \psi(x) \;=\; i\lambda \, (x - \langle x\rangle) \, \psi(x).

Separating variables and integrating, $\psi(x)$ is a Gaussian envelope multiplied by a complex exponential. The minimum-uncertainty wave packet is exactly the Gaussian wave packet shown in the widget. That is why the readout never strays from $\sigma_x \sigma_p = 0.5$ no matter where you put the sliders.
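As a worked version of that integration step (a sketch that uses only the ODE above), dividing by $-i\hbar$ and collecting terms gives

\frac{\psi'(x)}{\psi(x)} \;=\; \frac{i\langle p\rangle}{\hbar} \;-\; \frac{\lambda}{\hbar}\,(x - \langle x\rangle),

and integrating once,

\psi(x) \;=\; C \, \exp\!\Bigl(-\frac{\lambda\,(x - \langle x\rangle)^2}{2\hbar}\Bigr) \, e^{i\langle p\rangle x/\hbar}.

This is normalizable only for $\lambda > 0$ . Comparing the envelope with the $e^{-x^2/(4\sigma_x^2)}$ of the Gaussian packet identifies $\sigma_x^2 = \hbar/(2\lambda)$ , and $\sigma_p = \hbar/(2\sigma_x)$ then gives $\lambda = \sigma_p/\sigma_x$ .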

The Gaussian’s recurring role across probability theory and harmonic analysis is not a coincidence: it is the simplest function that is its own Fourier transform (the lowest of the Hermite functions, which are the transform’s eigenfunctions), and it is the limit law of the central limit theorem.

A square pulse, a Lorentzian, a triangular bump: all obey $\sigma_x \sigma_p > \hbar/2$ strictly. The Gaussian sits exactly on the boundary.
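A numerical comparison makes the gap visible. The sketch below computes $\sigma_x \sigma_k$ (so $\hbar = 1$ ) for three real-valued shapes, using the position-space form $\langle k^2\rangle = \int |\psi'|^2 dx / \int |\psi|^2 dx$ that follows from $P = -i\hbar \, d/dx$ . The specific widths and the grid are assumptions, and the Lorentzian here means a Lorentzian-shaped amplitude $\psi$ . The square pulse is left out because its sharp edges make $\sigma_k$ diverge, which is the strict inequality in its most extreme form.

```python
import numpy as np

x = np.linspace(-2000.0, 2000.0, 400_001)   # wide grid for the Lorentzian's slow tails
dx = x[1] - x[0]

def spread_product(psi):
    """sigma_x * sigma_k for a real wave function sampled on the grid x."""
    norm = np.sum(psi**2) * dx
    mean_x = np.sum(x * psi**2) * dx / norm
    var_x = np.sum((x - mean_x) ** 2 * psi**2) * dx / norm
    # For real psi, <k> = 0 and <k^2> = integral |psi'|^2 dx / integral |psi|^2 dx.
    var_k = np.sum(np.gradient(psi, dx) ** 2) * dx / norm
    return np.sqrt(var_x * var_k)

shapes = {
    "gaussian": np.exp(-x**2 / 4),
    "triangle": np.clip(1.0 - np.abs(x), 0.0, None),
    "lorentzian": 1.0 / (1.0 + x**2),
}
products = {name: spread_product(psi) for name, psi in shapes.items()}
for name, value in products.items():
    print(f"{name:>10s}: sigma_x * sigma_k = {value:.4f}")
```

Only the Gaussian should land on one half; the other two sit above it by an amount that does not depend on how wide you make them.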

Implications

Some physical consequences whose status as Heisenberg corollaries is sometimes obscured:

Atomic stability. Confining an electron to a region of size $r$ forces a momentum spread of order $\hbar/r$ by the inequality, hence a kinetic energy of order $\hbar^2/(2 m r^2)$ . Coulomb attraction contributes potential energy $-e^2/r$ (in Gaussian units). Minimizing the sum

$E(r) \;=\; \frac{\hbar^2}{2 m r^2} \;-\; \frac{e^2}{r}$

in $r$ gives the Bohr radius $a_0 = \hbar^2/(m e^2) \approx 0.53\,$ Å at energy $\approx -13.6\,$ eV. The hydrogen atom’s ground state is the size it is because tightly confining the electron costs more kinetic energy than the Coulomb attraction can pay for.
Single-slit diffraction. A photon passing through a slit of width $a$ has its transverse position localized to $\sigma_x \sim a$ . Heisenberg then forces a transverse momentum spread $\sigma_p \sim \hbar/a$ . The angular spread of the resulting beam is $\sigma_p / p \sim \hbar/(a p)$ , which (using $p = h/\lambda$ ) is of order $\lambda/a$ : the standard diffraction scale.
Time-frequency duality. The same inequality applied to time and angular frequency (a Fourier-conjugate pair, with $\hbar = 1$ since they are dimensionally inverse already) says a musical note of duration $T$ has its pitch defined to about $1/T$ . Audio codecs like MP3 and AAC face this tradeoff in the window length of their transform (the MDCT, a Fourier-related transform): short windows preserve drum attacks but blur frequency, long windows resolve frequencies cleanly but smear transients into pre-echo artifacts. The codec switches between window sizes in response to the signal for exactly this reason.
Diffraction limit in microscopy. Resolving features smaller than $\lambda/(2\,\mathrm{NA})$ requires light with a correspondingly large transverse momentum spread, which is what the numerical aperture controls. This is Abbe’s diffraction limit of classical microscopy, and the obstacle that super-resolution methods like STED and PALM had to engineer around: work that won the 2014 Nobel Prize in Chemistry.
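The atomic-stability estimate in the first item can be checked with a few lines of arithmetic. This sketch works in SI units, so the $e^2$ of the Gaussian-units formula becomes $e^2/(4\pi\varepsilon_0)$ ; the constants are CODATA values, and the search grid is an arbitrary choice.

```python
import numpy as np

# CODATA values in SI units.
hbar = 1.054571817e-34      # J s
m_e = 9.1093837015e-31      # kg
e = 1.602176634e-19         # C
eps0 = 8.8541878128e-12     # F/m
coulomb = e**2 / (4 * np.pi * eps0)   # plays the role of e^2 in the Gaussian-units formula

def energy(r):
    """Confinement kinetic energy plus Coulomb potential energy, in joules."""
    return hbar**2 / (2 * m_e * r**2) - coulomb / r

# Brute-force search for the minimum over a logarithmic grid of radii.
r = np.geomspace(1e-12, 1e-9, 20_001)
E = energy(r)
i_min = np.argmin(E)
r_min, E_min_eV = r[i_min], E[i_min] / e

a0 = hbar**2 / (m_e * coulomb)        # closed-form minimizer
print(f"r_min = {r_min * 1e10:.3f} Angstrom, a0 = {a0 * 1e10:.3f} Angstrom, E_min = {E_min_eV:.2f} eV")
```

That an order-of-magnitude argument reproduces the Bohr radius and the ground-state energy exactly is partly luck in the choice of prefactors; the scaling with $\hbar$ , $m$ , and $e$ is the robust part.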

A common pitfall

Heisenberg’s gamma-ray microscope thought experiment is a consistency check: it shows that any plausible measurement scheme, taken seriously, bumps into the same bound. It is not the origin of the bound. The wave function is uncertain even when nobody is looking, because $\psi(x)$ and $\tilde\psi(p)$ are two views of the same vector, and no vector can be sharply localized in both the position basis and the momentum basis at once.

The disturbance interpretation is fine pedagogy as long as it is taken as illustration, not as derivation. The inequality lives in the structure of the wave function itself.

References

Werner Heisenberg. Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik. Zeitschrift für Physik, 43:172-198, 1927. The original (semi-rigorous) statement.
Earle Hesse Kennard. Zur Quantenmechanik einfacher Bewegungstypen. Zeitschrift für Physik, 44:326-352, 1927. The first explicit $\sigma_x \sigma_p \ge \hbar/2$ .
H. P. Robertson. The uncertainty principle. Physical Review, 34(1):163-164, 1929. The general two-operator inequality used above.
John Stewart Bell. Speakable and Unspeakable in Quantum Mechanics. Cambridge University Press, 1987. Conceptual critique of the disturbance interpretation.
David J. Griffiths and Darrell F. Schroeter. Introduction to Quantum Mechanics. 3rd ed., Cambridge University Press, 2018. Sections 1.6 and 3.5 for the textbook proof.
Leon Cohen. Time-Frequency Analysis: Theory and Applications. Prentice Hall, 1995. The signal-processing reading of the same inequality.