Tag: machine learning
All the articles with the tag "machine learning".
-
Sudakov minoration, or how big a maximum must be
Averages shrink, but maxima grow. Sudakov's minoration inequality is the clean tool for the harder direction: a lower bound on the expected maximum of many Gaussians. As long as any two of them are at least ε apart in L2 distance, the expected maximum of N such Gaussians is at least a constant times ε times the square root of log N. This is the engine behind a lot of impossibility proofs.
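For reference, the finite-family form of the inequality can be written as follows. The notation (X_1, ..., X_N for the Gaussians, c for the absolute constant) is assumed here, not taken from the post.

```latex
% Assumed notation: X_1, ..., X_N are jointly Gaussian with mean zero,
% and c > 0 is an absolute constant that does not depend on N or epsilon.
\text{If } \bigl(\mathbb{E}\,(X_i - X_j)^2\bigr)^{1/2} \ge \varepsilon
\text{ for all } i \ne j, \text{ then } \quad
\mathbb{E}\,\max_{1 \le i \le N} X_i \;\ge\; c\,\varepsilon\,\sqrt{\log N}.
```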
-
Voronoi tessellations and Lloyd's algorithm
A set of generators partitions the plane into regions, each made up of the points closer to one generator than to any other. Lloyd's algorithm repeats a single step, "move each generator to the centroid of its region", and converges to a centroidal Voronoi tessellation. Run on a finite point set, the same iteration is k-means.
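The iteration can be sketched in a few lines. This is a minimal illustration, not the post's implementation: it assumes the unit square as the domain and approximates each region's centroid by averaging dense random samples, which makes it exactly k-means on those samples. The names `lloyd_step` and `energy` are hypothetical.

```python
import random

def lloyd_step(generators, samples):
    """Assign each sample to its nearest generator, then move every
    generator to the mean of the samples assigned to it."""
    sums = [[0.0, 0.0, 0] for _ in generators]
    for x, y in samples:
        nearest = min(
            range(len(generators)),
            key=lambda i: (x - generators[i][0]) ** 2 + (y - generators[i][1]) ** 2,
        )
        sums[nearest][0] += x
        sums[nearest][1] += y
        sums[nearest][2] += 1
    # A generator with an empty region stays where it is.
    return [
        (sx / n, sy / n) if n > 0 else g
        for g, (sx, sy, n) in zip(generators, sums)
    ]

def energy(generators, samples):
    """Mean squared distance from each sample to its nearest generator."""
    return sum(
        min((x - gx) ** 2 + (y - gy) ** 2 for gx, gy in generators)
        for x, y in samples
    ) / len(samples)

rng = random.Random(0)
# Dense samples stand in for the continuous unit square (an assumption).
samples = [(rng.random(), rng.random()) for _ in range(2000)]
generators = [(rng.random(), rng.random()) for _ in range(5)]

e0 = energy(generators, samples)
for _ in range(20):
    generators = lloyd_step(generators, samples)
e1 = energy(generators, samples)
print(f"energy before: {e0:.4f}, after: {e1:.4f}")
```

Each step can only lower the energy, which is why the iteration settles down; the fixed point it reaches depends on the starting generators.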
-
Optimal message passing on sparse graphs
A condensed walkthrough of our NeurIPS 2023 paper deriving the asymptotically Bayes-optimal classifier for node classification on sparse contextual stochastic block models, and what it implies for the design of graph neural networks.
-
Marchenko-Pastur and the Wigner semicircle
The eigenvalues of a large random matrix do not scatter around. They concentrate, as a histogram, on a deterministic shape. For sample covariance matrices the shape is Marchenko-Pastur; for symmetric matrices with i.i.d. entries it is the Wigner semicircle. Both shapes are computable, and they explain precisely why high-dimensional covariance estimation is biased.
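For reference, the two limiting densities in their standard form. The symbols are assumptions made for this note: γ = p/n is the ratio of dimension to sample size (taken in (0, 1]), and σ² is the entry variance, with the symmetric matrix scaled by 1/√n.

```latex
% Marchenko-Pastur density for the eigenvalues of a sample covariance matrix
% (true covariance sigma^2 I, aspect ratio gamma = p/n in (0, 1]).
\rho_{\mathrm{MP}}(x) = \frac{\sqrt{(\lambda_+ - x)(x - \lambda_-)}}{2\pi\sigma^2\gamma\,x},
\qquad \lambda_\pm = \sigma^2\bigl(1 \pm \sqrt{\gamma}\bigr)^2,
\qquad x \in [\lambda_-, \lambda_+].

% Wigner semicircle density for a symmetric matrix with entry variance sigma^2,
% scaled by 1/sqrt(n).
\rho_{\mathrm{sc}}(x) = \frac{\sqrt{4\sigma^2 - x^2}}{2\pi\sigma^2},
\qquad |x| \le 2\sigma.
```

Even when the true covariance is σ² times the identity, the sample eigenvalues spread over the whole interval from λ₋ to λ₊, so the largest are overestimated and the smallest underestimated.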
-
Stein's paradox
In three or more dimensions, the sample mean is dominated everywhere by a shrinkage estimator. The geometric reason is the Gaussian shell: noise pushes you outward, and pulling back is uniformly better. A precursor of ridge regression and most modern regularization.
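A small simulation makes the claim concrete. This is a sketch under stated assumptions: a single observation with identity covariance, shrinkage toward the origin with the classical James-Stein factor, and hypothetical helper names `james_stein` and `compare_risks`.

```python
import random

def james_stein(x):
    """Classical James-Stein estimate from one observation x ~ N(theta, I),
    shrinking toward the origin. Requires len(x) >= 3."""
    d = len(x)
    norm_sq = sum(v * v for v in x)
    shrink = 1.0 - (d - 2) / norm_sq
    return [shrink * v for v in x]

def compare_risks(theta, trials=2000, seed=0):
    """Monte Carlo estimate of the squared-error risk of the raw observation
    and of the James-Stein estimate, at the same true mean theta."""
    rng = random.Random(seed)
    raw_loss = 0.0
    js_loss = 0.0
    for _ in range(trials):
        x = [t + rng.gauss(0.0, 1.0) for t in theta]
        js = james_stein(x)
        raw_loss += sum((a - t) ** 2 for a, t in zip(x, theta))
        js_loss += sum((a - t) ** 2 for a, t in zip(js, theta))
    return raw_loss / trials, js_loss / trials

# The true mean is an arbitrary choice for illustration.
raw_risk, js_risk = compare_risks([1.0] * 10)
print(f"raw observation: {raw_risk:.2f}, James-Stein: {js_risk:.2f}")
```

The raw observation has risk equal to the dimension regardless of the true mean; the shrinkage estimate comes in below that, and the gap is largest when the true mean is near the shrinkage target.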
-
Nearest neighbor breaks in high dimensions
In high dimensions, the pairwise distances between independent random points concentrate around a common value. Nearest and farthest neighbor are no longer meaningfully different. A short geometric tour of the curse of dimensionality.
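The effect is easy to observe numerically. The sketch below assumes points drawn uniformly from the unit cube and measures the relative gap between the farthest and nearest neighbor of a random query; `relative_contrast` is a hypothetical helper name.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query point
    to n_points uniform random points in the unit cube of dimension dim."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        p = [rng.random() for _ in range(dim)]
        dists.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(query, p))))
    return (max(dists) - min(dists)) / min(dists)

for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

As the dimension grows the contrast shrinks toward zero, which is the sense in which "nearest" stops carrying information for data of this kind.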