25 Causality Questions for Unit 2 Exam – Prediction, Inference, and Causality

Question A: Reading a Potential Outcomes Table

Consider this population of 6 people in a randomized experiment.

\(j\)	\(y_j(1)\)	\(y_j(0)\)	\(\tau_j\)
1	6	2	4
2	0	0	0
3	4	1	3
4	7	7	0
5	8	4	4
6	2	0	2

Part A.1

What is the average treatment effect \(\bar\tau = \frac{1}{6}\sum_{j=1}^6 \tau_j\)?

Part A.2

Suppose we assign treatments \(W_1 \ldots W_6 = (1, 0, 0, 1, 1, 0)\). Fill in the realized outcomes \(Y_j = y_j(W_j)\) in the table below.

\(j\)	\(W_j\)	\(Y_j\)
1	1
2	0
3	0
4	1
5	1
6	0

Part A.3

Calculate the difference-in-means estimator \(\hat\tau = \bar Y_1 - \bar Y_0\) for this treatment assignment, where \(\bar Y_1\) is the mean outcome among the treated and \(\bar Y_0\) is the mean among the untreated.
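
As a sketch of how \(Y_j = y_j(W_j)\) and \(\hat\tau\) are computed, the R snippet below uses a small hypothetical population of 4 people (not the table from Question A). The vectors y1, y0, and W and all of their values are made up for illustration.

```r
# Hypothetical potential outcomes for 4 people (illustrative values only)
y1 = c(5, 3, 9, 1)   # outcomes if treated, y_j(1)
y0 = c(2, 3, 4, 0)   # outcomes if untreated, y_j(0)
W  = c(1, 0, 1, 0)   # one possible treatment assignment

# Realized outcome: y_j(1) where W_j = 1, y_j(0) where W_j = 0
Y = ifelse(W == 1, y1, y0)                    # 5 3 9 0

# Difference in means between the treated and untreated groups
tau.hat = mean(Y[W == 1]) - mean(Y[W == 0])   # 7 - 1.5 = 5.5

# Average treatment effect, computable here only because both columns are known
tau.bar = mean(y1 - y0)                       # 2.25
```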

Part A.4

Is \(\hat\tau\) equal to \(\bar\tau\)? If not, explain in one sentence why they differ.

Question B: Unbiasedness under Randomization

Using the same population as Question A, suppose we randomize by choosing 3 people uniformly at random to treat, leaving the other 3 untreated.

Part B.1

There are \(\binom{6}{3} = 20\) possible treatment assignments. For the assignment \((1,1,1,0,0,0)\)—treating persons 1, 2, and 3—calculate \(\hat\tau\).

Part B.2

For the assignment \((0,0,0,1,1,1)\)—treating persons 4, 5, and 6—calculate \(\hat\tau\).

Part B.3

If we average \(\hat\tau\) over all 20 possible assignments (each equally likely), what do we get? You don’t need to enumerate all 20—just state what the answer must be and why.
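
One way to check an answer to Part B.3 numerically is to enumerate every assignment. The sketch below assumes the potential outcomes are stored in vectors y1 and y0. The function name all.estimates is hypothetical, and the example values are made up rather than taken from Question A.

```r
# Difference-in-means estimate for every way of treating n1 of the n units.
# all.estimates is a hypothetical helper name, not from the course materials.
all.estimates = function(y1, y0, n1) {
  n = length(y1)
  # combn(n, n1, FUN) applies FUN to each size-n1 subset of 1:n
  combn(n, n1, FUN = function(treated) {
    mean(y1[treated]) - mean(y0[-treated])
  })
}

# Illustrative population of 4 people, treating 2 at a time
y1 = c(5, 3, 9, 1)
y0 = c(2, 3, 4, 0)
est = all.estimates(y1, y0, n1 = 2)

length(est)        # choose(4, 2) = 6 assignments
mean(est)          # average of the estimator over all assignments
mean(y1 - y0)      # average treatment effect, for comparison
```

Swapping in the six-person table with n1 = 3 gives the 20 estimates the question refers to.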

Question C: The Fundamental Problem

Part C.1

In the table from Question A, why can’t we calculate \(\tau_1 = y_1(1) - y_1(0)\) for person 1 in a real experiment?

Part C.2

Given that we can’t observe individual treatment effects, explain in one or two sentences why randomization still lets us estimate the average treatment effect.

Question D: Variance of the Treatment Effect Estimator

In a randomized experiment with \(n_1\) treated and \(n_0\) untreated units, the variance of \(\hat\tau = \bar Y_1 - \bar Y_0\) is approximately
\[ \operatorname{Var}(\hat\tau) \approx \frac{\sigma^2(1)}{n_1} + \frac{\sigma^2(0)}{n_0}, \]
where \(\sigma^2(w)\) is the variance of potential outcomes under treatment \(w\).
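
As a sketch of how this formula is used, the R function below plugs sample variances into it and builds a normal-approximation 95% interval. The function name se.and.ci and the input values are hypothetical, chosen to differ from the numbers in the parts that follow.

```r
# Plug-in standard error and normal-approximation confidence interval
# for a difference in means. se.and.ci is a hypothetical helper name.
se.and.ci = function(tau.hat, var1, var0, n1, n0, level = 0.95) {
  se = sqrt(var1 / n1 + var0 / n0)     # square root of the variance formula
  z  = qnorm(1 - (1 - level) / 2)      # about 1.96 for level = 0.95
  list(se = se, ci = c(tau.hat - z * se, tau.hat + z * se))
}

# Made-up inputs: 50 units per group, sample variances 2 and 6
out = se.and.ci(tau.hat = 1, var1 = 2, var0 = 6, n1 = 50, n0 = 50)
out$se   # sqrt(2/50 + 6/50) = sqrt(0.16) = 0.4
out$ci   # 1 -/+ 1.96 * 0.4, roughly (0.22, 1.78)
```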

Part D.1

In an experiment with 100 treated and 100 untreated units, suppose \(\hat\sigma^2(1) = 4\) and \(\hat\sigma^2(0) = 9\) (the sample variances in each group). Calculate the estimated standard error of \(\hat\tau\).

Part D.2

If \(\hat\tau = 2.5\), construct a 95% confidence interval for the average treatment effect.

Question E: Two Sources of Randomness (Connects to 10aa)

In some experiments, we sample from a population and then randomize treatment within that sample.

Part E.1

Name the two sources of randomness in such an experiment.

Part E.2

If we observe the entire population (no sampling) and only randomize treatment, which source of randomness remains?

Part E.3

In a large sample from a large population, which source of variability typically dominates? Why?

Question F: Interpreting a Randomized Experiment (Michigan-style)

In a get-out-the-vote experiment, 250 people were randomly assigned to receive a mailer and 250 were assigned to a control group. The voting rates were:

Mailer group: 42% voted
Control group: 35% voted

Part F.1

What is the estimated average treatment effect of receiving the mailer?

Part F.2

The estimated standard error is 0.043. Construct a 95% confidence interval for the ATE.
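
The stated standard error is consistent with the variance formula from Question D when each group's variance is estimated by \(\hat p(1-\hat p)\), the plug-in variance of a binary outcome. That the 0.043 was computed this way is an assumption; the check below is only a sketch.

```r
# Assumed plug-in calculation for a difference in proportions
p1 = 0.42; n1 = 250    # mailer group voting rate and size
p0 = 0.35; n0 = 250    # control group voting rate and size

se = sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
round(se, 3)           # 0.043
```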

Part F.3

A colleague says: “The confidence interval includes zero, so the mailer had no effect.” Do you agree? Explain briefly.

Question G: Code Interpretation (Bootstrap for Treatment Effect)

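We run this code on data from a randomized experiment stored in a dataframe with columns w (treatment indicator) and y (outcome). Here w and y are those columns as vectors.

n = length(y)    # number of units (assumed; the original snippet leaves n undefined)
B = 1000         # number of passes through the loop
tau.star = rep(NA, B)
for (b in 1:B) {
  # draw n row indices with replacement, so each (w, y) pair stays together
  I = sample(1:n, n, replace = TRUE)
  w.star = w[I]
  y.star = y[I]
  tau.star[b] = mean(y.star[w.star == 1]) - mean(y.star[w.star == 0])
}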
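
The snippet assumes w and y already exist in the workspace. To run it end to end, one option is to simulate a small randomized experiment first. The data-generating process below (sample size, treatment probability, effect size) is entirely made up for illustration.

```r
set.seed(1)                  # for reproducibility
n = 200                      # hypothetical number of units
w = rbinom(n, 1, 0.5)        # randomized treatment indicator
y = 1 + 2 * w + rnorm(n)     # simulated outcome (made-up model)

B = 1000
tau.star = rep(NA, B)
for (b in 1:B) {
  I = sample(1:n, n, replace = TRUE)
  w.star = w[I]
  y.star = y[I]
  tau.star[b] = mean(y.star[w.star == 1]) - mean(y.star[w.star == 0])
}

length(tau.star)             # B values, one per pass through the loop
```

With tau.star in hand, functions such as hist, quantile, and sd can be applied to it when working through the parts below.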

Part G.1

What does tau.star contain after this code runs?

Part G.2

How would you use tau.star to construct a 95% confidence interval for the ATE?

Part G.3

What does sd(tau.star) estimate?
