23 Why Do Models Work?
Overview
Throughout this course, we’ve written things like: \[ Y_i = \mu + \epsilon_i \qquad \text{or} \qquad Y_i = \mu(W_i) + \epsilon_i \] where \(\epsilon_i\) is “noise” with mean zero. Sometimes we assume \(\epsilon_i \sim N(0, \sigma^2)\). Sometimes we let \(\sigma\) depend on \(W\): \(\epsilon_i \sim N(0, \sigma^2(W_i))\).
But real data doesn’t come from normal distributions. The “errors” aren’t really draws from some probability distribution—they’re just the difference between what we observe and what our model predicts. So why does any of this work?
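A quick simulation makes the puzzle concrete. The sketch below (illustrative; the sample size, error distribution, and seed are arbitrary choices) draws skewed, decidedly non-normal errors and checks how often the usual normal-theory 95% interval for the mean covers the truth:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n, reps = 5.0, 100, 2000
covered = 0
for _ in range(reps):
    # skewed, non-normal "errors": a centered exponential
    eps = rng.exponential(1.0, size=n) - 1.0
    y = mu + eps
    se = y.std(ddof=1) / np.sqrt(n)
    # the usual normal-theory 95% interval for the mean
    if y.mean() - 1.96 * se <= mu <= y.mean() + 1.96 * se:
        covered += 1
coverage = covered / reps
print(coverage)
```

Coverage comes out close to 95% even though the normality assumption is false. The central limit theorem, not the normal model, is doing the work.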
Planned Content
The Two Roles of Models
- Population description: The model is the population (or describes it exactly)
- Approximation for calibration: The model is a tool for deriving standard errors, even if it’s “wrong”
Homoskedastic vs Heteroskedastic
- \(Y = m(X) + \sigma\epsilon\) vs \(Y = m(X) + \sigma(X)\epsilon\)
- When does the difference matter?
- Robust standard errors: getting calibration right without getting the model right
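One way to see these bullets concretely: simulate data where the noise scale grows with \(X\), fit a line by least squares, and compare the classical (homoskedastic) standard error with the HC0 sandwich estimator. A minimal sketch, with arbitrary data-generating choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.uniform(0.0, 2.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, x**2, n)  # noise sd grows with x

X = np.column_stack([np.ones(n), x])          # design matrix: intercept + slope
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)

# classical SE: assumes one sigma^2 for every observation
sigma2 = resid @ resid / (n - 2)
se_naive = np.sqrt(np.diag(sigma2 * XtX_inv))

# HC0 sandwich SE: each observation keeps its own squared residual
meat = X.T @ (X * resid[:, None] ** 2)
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
print(se_naive, se_robust)
```

When the variance really is constant, the two estimates agree up to noise; under heteroskedasticity like this, the sandwich slope SE is typically larger, which is exactly the correction the naive formula misses. The sandwich estimator gets calibration right without getting the variance model right.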
What the Bootstrap Is Actually Doing
- The bootstrap doesn’t assume normality
- It uses the sample as a model of the population
- Connection to “the model is an approximation”
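The resampling idea above takes only a few lines (a percentile-bootstrap sketch for the mean; the distribution, sample size, and number of resamples are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
sample = rng.exponential(1.0, 80)  # skewed data; no normality in sight

# resample the sample: treat it as our best model of the population
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(4000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(lo, hi)
```

No normal distribution appears anywhere: the empirical distribution of the sample stands in for the population.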
When Models Break Down
- Heavy tails
- Dependence
- Small samples
- What to watch out for
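The heavy-tails failure mode is easy to demonstrate (a sketch; the Cauchy is the classic extreme case, since it has no mean at all):

```python
import numpy as np

rng = np.random.default_rng(3)
sizes = (100, 10_000, 1_000_000)

# light tails: the normal sample mean settles down as n grows
normal_means = [abs(rng.normal(0.0, 1.0, n).mean()) for n in sizes]

# heavy tails: the Cauchy sample mean never settles down,
# no matter how large n gets
cauchy_means = [abs(rng.standard_cauchy(n).mean()) for n in sizes]
print(normal_means)
print(cauchy_means)
```

With lighter (but still heavy) tails the mean does converge, just slowly, and normal-theory intervals can undercover badly at moderate sample sizes.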
The Pragmatic View
- Models are tools, not truths
- “All models are wrong, but some are useful” (Box)
- What matters: does calibration work? Is coverage ~95%?
Why This Matters for Next Semester
In regression (Semester 2), you’ll write models like: \[ Y_i = \alpha + \beta X_i + \epsilon_i \] constantly. Understanding when and why these approximations work—and when they don’t—will be essential.
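As a preview, fitting such a line takes one call (a sketch with made-up values \(\alpha = 0.5\), \(\beta = 1.5\)):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(-1.0, 1.0, 200)
y = 0.5 + 1.5 * x + rng.normal(0.0, 0.3, 200)  # alpha = 0.5, beta = 1.5

# np.polyfit returns coefficients highest degree first: [slope, intercept]
beta_hat, alpha_hat = np.polyfit(x, y, 1)
print(alpha_hat, beta_hat)
```

The same questions from this chapter carry over: the fitted line can be useful, and its standard errors well calibrated, even when the error model is only an approximation.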