24 Practice Midterm 2
QTM 285
\[ \DeclareMathOperator{\E}{E} \DeclareMathOperator{\Var}{V} \DeclareMathOperator{\hVar}{\widehat{V}} \DeclareMathOperator{\bias}{bias} \newcommand{\model}{\mathcal{M}} \]
Problem 1
Suppose we have a sample \(Y_1 \ldots Y_n\) drawn with replacement from a population \(y_1 \ldots y_m\) with mean \(\mu = \frac{1}{m}\sum_{j=1}^m y_j\). We have it in a vector \(Y\) of length \(n\). We run this code.
```r
B = 1000
Y.bar = rep(NA, B)
for(b in 1:B) {
  Y.bar[b] = mean(sample(Y, size=n, replace=TRUE))
}
```
Problem 2
Suppose \(Y_1 \ldots Y_n\) are sampled with replacement from a population with mean \(\mu=1\) and variance \(\sigma^2=1\). In Midterm 1, we discussed the estimator \(\tilde{Y}_k = \frac{1}{n-k} \sum_{i=1}^n Y_i\) for \(k=2\). Here, we’ll talk about that case and the case \(k=\sqrt{n}\).
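To get a feel for how \(\tilde{Y}_k\) behaves, here is a minimal simulation sketch. The problem only fixes the population's mean and variance, so drawing from a normal distribution is an assumption made for illustration, as are the sample size `n = 100` and the names `Y.tilde` and `draws`.

```r
set.seed(1)
n = 100

# The estimator from the problem: the sum of the sample divided by n-k.
Y.tilde = function(Y, k) { sum(Y) / (length(Y) - k) }

# Assumption: a normal population with mean 1 and variance 1 stands in
# for the unspecified population.
draws = replicate(10000, {
  Y = rnorm(n, mean = 1, sd = 1)
  c(k2 = Y.tilde(Y, 2), ksqrt = Y.tilde(Y, sqrt(n)))
})

rowMeans(draws)           # simulated expected values of the two estimators
n / (n - c(2, sqrt(n)))   # exact expected values, since E[sum(Y)] = n * mu and mu = 1
```

Comparing the two printed lines for a few values of `n` shows how the gap between each estimator's expected value and \(\mu\) changes as the sample grows.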
Problem 3
In the plot below, I’m showing some data from a (fake) survey of people in your city who’ve recently finished with school and begun working. For whatever reason, nobody who attends school in this city goes to college, so everyone finishes with between 8 and 12 years of education. There are 10,000 of these people, and we mailed out (and received back) 2,000 surveys, sampling from this population of 10,000 with replacement. The x-axis shows the survey recipient’s years of education, the y-axis shows their income, the points with arms are within-column means and standard deviations, and the heights of the bars are proportional to the proportion of recipients with each level of education.
These survey recipients went to two schools: School 0 (circles) and School 1 (triangles).¹
Children in your district go to School 0. Your cousin, who lives in a district that goes to School 1, encourages you to pretend that you and your family live with them so that your child can go to School 1. This is a thing that happens, at least on the TV show ‘Friday Night Lights.’ Their argument is that School 1 must be better because the average income of students who have attended it is clearly higher. “Check out this confidence interval for the difference in mean incomes of students who have attended the two schools,” they say.
```r
Delta.hat = mean(Y[W==1]) - mean(Y[W==0])
Delta.hat.star = replicate(1000, {
  J.star = sample(1:n, size=n, replace=TRUE)
  W.star = W[J.star]
  X.star = X[J.star]
  Y.star = Y[J.star]
  mean(Y.star[W.star==1]) - mean(Y.star[W.star==0])
})
interval.width = width(Delta.hat.star)
confidence.interval = c(Delta.hat - interval.width/2,
                        Delta.hat + interval.width/2)
confidence.interval
## [1] 7323 12428
```
They even based their calculation on one of your QTM 285 homework solutions, so it’s got to be legit. School 1 is at least $7323 better.
You, on the other hand, look at the data and claim that a year at School 0 is worth just as much as a year at School 1. And you’re right. You calculate school-specific versions of \(\hat\theta_{\text{years}}\), bootstrap the difference between them, and find convincing evidence that the incremental value of a year at each school is roughly the same.
```r
your.summary = function(W,X,Y) {
  muhat = Vectorize(function(w,x) { mean(Y[W==w & X==x]) })
  (mean(muhat(1,9:12)-muhat(1,8:11))) - (mean(muhat(0,9:12)-muhat(0,8:11)))
}
theta.hat.years.difference = your.summary(W,X,Y)
theta.hat.years.difference.star = replicate(100, {
  J.star = sample(1:n, size=n, replace=TRUE)
  your.summary(W[J.star], X[J.star], Y[J.star])
})
interval.width = width(theta.hat.years.difference.star)
confidence.interval = c(theta.hat.years.difference - interval.width/2,
                        theta.hat.years.difference + interval.width/2)
confidence.interval
## [1] -1518 1302
```
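Both bootstrap snippets in this problem call a `width` helper that isn't defined on this page. Here is a minimal sketch of what such a helper might look like, assuming it returns the width of the middle 95% of the bootstrap draws. The version from the homework solutions may well differ, so treat this as a hypothetical stand-in.

```r
# Hypothetical stand-in for the course's `width` helper: the distance
# between the alpha/2 and 1-alpha/2 quantiles of the draws.
width = function(draws, alpha = .05) {
  unname(diff(quantile(draws, c(alpha/2, 1 - alpha/2))))
}

# Sanity check: for standard normal draws this should be close to 2 * 1.96.
set.seed(1)
normal.width = width(rnorm(10000))
normal.width
```

With a definition like this in place, `Delta.hat - interval.width/2` and `Delta.hat + interval.width/2` give an interval of that width centered on the point estimate.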
Problem 4
Consider this population of 6 people in a randomized experiment.
| \(j\) | \(y_j(1)\) | \(y_j(0)\) | \(\tau_j\) |
|---|---|---|---|
| 1 | 6 | 2 | 4 |
| 2 | 0 | 0 | 0 |
| 3 | 4 | 1 | 3 |
| 4 | 7 | 7 | 0 |
| 5 | 8 | 4 | 4 |
| 6 | 2 | 0 | 2 |
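For checking arithmetic on this table, it can help to type it into R. This is a small sketch; the vector names `y1`, `y0`, and `tau` are my own choices, not part of the problem.

```r
# Potential outcomes copied from the table above.
y1 = c(6, 0, 4, 7, 8, 2)
y0 = c(2, 0, 1, 7, 4, 0)

# Individual treatment effects; these match the last column of the table.
tau = y1 - y0
tau

# The population average treatment effect, computed two equivalent ways.
mean(tau)
mean(y1) - mean(y0)
```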
Now suppose we randomize by choosing 3 people uniformly at random to treat (and the other 3 are untreated). There are \(\binom{6}{3} = 20\) possible treatment assignments.
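With only 20 assignments, we can list them all. The sketch below does that with `combn` and, as an assumption about which estimator is of interest, computes the difference in sample means between treated and untreated people under each assignment. The names `treated` and `Delta.hat` are made up for this illustration.

```r
# Potential outcomes copied from the table above.
y1 = c(6, 0, 4, 7, 8, 2)
y0 = c(2, 0, 1, 7, 4, 0)

# Each column of `treated` is one way to choose 3 of the 6 people to treat.
treated = combn(6, 3)
ncol(treated)

# Assumed estimator: mean of y(1) among the treated minus mean of y(0)
# among the untreated, computed for every possible assignment.
Delta.hat = apply(treated, 2, function(j) { mean(y1[j]) - mean(y0[-j]) })

# All assignments are equally likely, so a plain average over the columns
# is the estimator's expected value.
mean(Delta.hat)
```

Comparing `mean(Delta.hat)` to `mean(y1 - y0)` is a quick check on whether this estimator is unbiased under this randomization scheme, and `table(Delta.hat)` shows its whole sampling distribution.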
¹ Usually I use color instead of shape for this, but working with features we can plot in grayscale on exams will save colored ink, and that saves the department some trouble.