Skip to content

Voronoi tessellations and Lloyd's algorithm

Posted on:

Place kk points p1,,pkp_1, \ldots, p_k in the plane (called generators). The Voronoi cell of pip_i is

Vi  =  {xR2:xpixpj for all ji}.V_i \;=\; \{x \in \mathbb{R}^2 : \|x - p_i\| \le \|x - p_j\| \text{ for all } j \ne i\}.

Each cell is a convex polygon, and the cells together partition the plane. Random generators give irregular cells of varying sizes and shapes. Carefully placed generators give regular tilings. There is a procedure that takes any starting configuration and pulls it toward a maximally regular one.

Lloyd’s algorithm

At each step:

  1. Compute the Voronoi cells V1,,VkV_1, \ldots, V_k for the current generators.
  2. Replace each generator pip_i with the centroid of its cell: pi  =  1ViVixdx.p_i' \;=\; \frac{1}{|V_i|} \int_{V_i} x \, dx.
  3. Repeat.

This is Lloyd’s algorithm (Lloyd, 1957/1982). It decreases the total squared distance from each point in the plane to its nearest generator at every step (a Lyapunov / energy function), so it converges to a local minimum.

The fixed points are configurations where every generator is already at the centroid of its own cell. These are called centroidal Voronoi tessellations (CVTs). On flat domains, the cells of a CVT are nearly hexagonal: hexagons are the polygon shape that minimizes the second moment of area per unit area, so the algorithm seeks them out (see The bees for more details on this minimization).

Click on the canvas to add generators. Drag a generator to move it. Press step to run one Lloyd iteration, or run to step until convergence. Watch the cells round out and the configuration regularize.

generators: 0iteration: 0click to add · drag to move

Press random 24 generators for a fresh starting configuration, then run 30 steps to watch Lloyd converge. Press step if you want to see the iteration in single-step mode.

After enough iterations on a rectangular domain, the cells become approximately hexagonal in the interior and adapt to the boundary near the edges. The convergence is monotone in the energy

F(p1,,pk)  =  i=1kVixpi2dx,F(p_1, \ldots, p_k) \;=\; \sum_{i=1}^{k} \int_{V_i} \|x - p_i\|^2 \, dx,

the total squared distance to the nearest generator. Each step strictly decreases FF unless the configuration is already a CVT.

Where this connects

The bees

Bees build approximately hexagonal honeycombs, and they are often described as running Lloyd’s algorithm in nature. Not quite: there are two distinct mathematical optimality statements that both have the hexagonal tiling as their answer in 2D, and bees and Lloyd’s algorithm solve different ones.

Hales’s Honeycomb Theorem (Hales, 2001). Among all tilings of the plane into regions of equal area, the regular hexagonal tiling minimizes the total perimeter. This is the optimization that minimizes wax usage for a given total storage volume. The conjecture goes back at least to Pappus of Alexandria in the fourth century AD; the proof is modern.

Lloyd’s CVT in the flat-domain, uniform-density limit minimizes a different functional, the second moment of area per unit area:

1VVxp2dx,\frac{1}{|V|} \int_V \|x - p\|^2 \, dx,

where pp is the centroid of VV.

The integrand xp2\|x - p\|^2 is the squared distance from a point of the cell to its centroid. Its integral Vxp2dx\int_V \|x - p\|^2 \, dx is the second moment of area of VV about its centroid: a measure of how spread out the cell is, small for compact round cells and large for elongated or branching ones. Dividing by V|V| gives the average value of xp2\|x - p\|^2 over a uniformly random point of VV.

For shapes of fixed area, this average depends only on the shape of VV, not its size. The minimum over all shapes of a given area is achieved by the disk (round cells have the lowest spread), but disks do not tile the plane. The relevant question for Lloyd’s algorithm is therefore: among shapes that do tile the plane, which one is closest to a disk in this sense? With cells of unit area, the comparison among the three regular polygons that tile, plus the disk for reference, is

The hexagon comes within 0.7%0.7\% of the disk lower bound; the square is about 5%5\% over, the triangle about 21%21\% over. Among polygons that tile the plane (not only the regular ones), the regular hexagon minimizes this functional (L. Fejes Tóth, 1959). The analogous statement in d3d \ge 3, that the optimal tiling for moment of inertia is by truncated octahedra, is Gersho’s conjecture, still open in general dimensions.

So the perimeter minimum and the moment-of-inertia minimum are mathematically distinct objectives, but they happen to be solved by the same shape in 2D. That is the reason bees and k-means converge on hexagons through completely unrelated processes.

The bee mechanism is not direct geometry. Bees do not measure angles. The accepted picture, going back to D’Arcy Thompson (1917) and experimentally confirmed by Karihaloo, Zhang, and Wang (2013), is that bees first build approximately circular cells out of soft wax, and the wax then relaxes under the heat of the bees’ bodies and the equal mutual pressure of neighboring cells. Surface tension drives the boundaries to a configuration that minimizes wax for fixed cell area, which by the Honeycomb Theorem is the hexagonal tiling.

In this sense, bees do calculus the same way soap films do calculus: the physical system minimizes the same variational integral that the mathematical statement names. Lloyd’s algorithm solves a different but adjacent variational problem by direct iteration, and arrives at the same answer because the answer is hexagons either way.

References



Next Post
Pólya's recurrence theorem