14  Why the CLT Works: Stein’s Method

Enrichment Content

This is a post-midterm-1 “breather” lecture. The goal is not exam material—it’s to demystify the Central Limit Theorem that students have been relying on throughout Unit 1.

Overview

Throughout this unit, we’ve used the Central Limit Theorem as a black box: sample means are approximately normal when \(n\) is large enough. But why? And what does “large enough” mean?

This lecture opens up the black box. We’ll prove the CLT using Stein’s method—an approach that:

  • Shows why a particular distribution is close to normal (not just that it converges)
  • Gives actual bounds on the approximation error
  • Uses an intuitive idea: if swapping observations doesn’t change the distribution much, you’re close to normal
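The black-box behavior itself is easy to see empirically. Below is a quick illustrative simulation (not part of the lecture content; the choice of Exponential(1) draws and the helper names `standardized_mean` and `frac_within` are mine): as \(n\) grows, the standardized sample mean of a skewed distribution behaves more and more like a standard normal.

```python
import math
import random

random.seed(0)

def standardized_mean(n, reps=2000):
    """Draw `reps` standardized sample means of n Exponential(1) variables."""
    out = []
    for _ in range(reps):
        s = sum(random.expovariate(1.0) for _ in range(n))
        # Exponential(1) has mean 1 and variance 1, so standardize by sqrt(n).
        out.append((s - n) / math.sqrt(n))
    return out

def frac_within(xs, c=1.0):
    """Fraction of values in [-c, c]; for a standard normal this is ~0.683."""
    return sum(abs(x) <= c for x in xs) / len(xs)

# The fraction should approach the normal value 0.683 as n grows.
for n in (2, 10, 100):
    print(n, round(frac_within(standardized_mean(n)), 3))
```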

You won’t be tested on this. But knowing that the CLT is a real theorem with a real proof—one you could understand if you sat with it—is valuable.

Planned Content

The Setup

  • What the CLT actually says (formal statement)
  • What “convergence in distribution” means
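For reference, the formal statement the lecture will unpack is the classical (Lindeberg–Lévy) form: for i.i.d. \(X_1, X_2, \ldots\) with mean \(\mu\) and finite variance \(\sigma^2\),

```latex
\[
  \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}
  \;\xrightarrow{d}\; \mathcal{N}(0, 1)
  \quad\text{as } n \to \infty,
\]
% "Convergence in distribution" unpacks to pointwise convergence of CDFs:
\[
  \lim_{n \to \infty}
  P\!\left( \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le z \right)
  = \Phi(z)
  \quad\text{for every } z \in \mathbb{R}.
\]
```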

Stein’s Characterization of the Normal

  • A random variable \(Z\) is standard normal if and only if \(\mathbb{E}[f'(Z) - Zf(Z)] = 0\) for all smooth \(f\)
  • Why this characterization is useful
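One reason the characterization is useful: the quantity \(\mathbb{E}[f'(W) - Wf(W)]\) can be estimated for any \(W\), and its size measures distance from normality. A minimal numerical sketch (the test function \(f = \sin\) and the helper name `stein_discrepancy` are my choices; any smooth \(f\) would do):

```python
import math
import random

random.seed(1)

def stein_discrepancy(sample, f=math.sin, fprime=math.cos):
    """Monte Carlo estimate of E[f'(W) - W f(W)]; zero iff W ~ N(0,1)."""
    return sum(fprime(w) - w * f(w) for w in sample) / len(sample)

N = 100_000
normal = [random.gauss(0.0, 1.0) for _ in range(N)]
# Uniform on [-sqrt(3), sqrt(3)] also has mean 0 and variance 1,
# so any discrepancy here reflects shape, not location or scale.
uniform = [random.uniform(-math.sqrt(3), math.sqrt(3)) for _ in range(N)]

print(stein_discrepancy(normal))   # close to 0
print(stein_discrepancy(uniform))  # noticeably nonzero
```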

The Exchangeable Pairs Approach

  • If \((W, W')\) are exchangeable and \(W' - W\) is small, then \(W\) is approximately normal
  • Intuition: swapping doesn’t change much → normal
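The standard exchangeable pair for a sum can be sketched concretely: resample one uniformly chosen coordinate. This is an illustrative construction under my own choice of centered Exponential(1) summands; the function name `exchangeable_pair` is hypothetical.

```python
import math
import random

random.seed(2)

def exchangeable_pair(n):
    """One draw of (W, W') for W = (X_1 + ... + X_n)/sqrt(n).

    W' replaces a uniformly chosen coordinate with an independent copy;
    (W, W') is exchangeable because the old and new coordinates play
    symmetric roles.
    """
    xs = [random.expovariate(1.0) - 1.0 for _ in range(n)]  # mean 0, var 1
    w = sum(xs) / math.sqrt(n)
    i = random.randrange(n)
    x_new = random.expovariate(1.0) - 1.0
    w_prime = w + (x_new - xs[i]) / math.sqrt(n)
    return w, w_prime

# The swap perturbs W by O(1/sqrt(n)): the typical |W' - W| shrinks with n,
# which is exactly the "swapping doesn't change much" condition.
for n in (10, 100, 10_000):
    diffs = [abs(wp - w) for w, wp in (exchangeable_pair(n) for _ in range(2000))]
    print(n, round(sum(diffs) / len(diffs), 4))
```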

Applying to Sample Means

  • Constructing the exchangeable pair
  • Computing the bound
  • What the bound tells us about “large enough \(n\)”
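The kind of bound Stein's method delivers is of order \(1/\sqrt{n}\) on the worst-case CDF error. A sketch of what that looks like empirically, again using Exponential(1) draws as an assumed example (the helper names `phi` and `kolmogorov_error` are mine):

```python
import math
import random

random.seed(3)

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def kolmogorov_error(n, reps=5000):
    """Empirical sup-distance between the CDF of the standardized mean
    of n Exponential(1) draws and the standard normal CDF."""
    ws = sorted((sum(random.expovariate(1.0) for _ in range(n)) - n) / math.sqrt(n)
                for _ in range(reps))
    # Compare the empirical CDF to Phi at each sample point.
    return max(abs((k + 1) / reps - phi(w)) for k, w in enumerate(ws))

# A 1/sqrt(n) bound predicts the error should halve each time n quadruples:
for n in (4, 16, 64, 256):
    print(n, round(kolmogorov_error(n), 3))
```

This is the practical meaning of “large enough \(n\)”: how large depends on how skewed the summands are and how much error you can tolerate.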

Takeaways

  • The CLT isn’t magic—it’s a theorem with a proof
  • The normal approximation has quantifiable error
  • This explains why bootstrap calibration works

References

To be added: accessible references appropriate for undergraduates.