Homework 5
All materials can be found at alexcardazzi.github.io.
Completion Requirements: Complete the following questions in RStudio via the homework template. When you are ready, submit your rendered HTML to Canvas.
Grading Criteria: Full credit will be given to correct, well formatted, and detailed answers. Partial credit will be given if I can follow your work and/or see your thought process via code, comments, and text. Point totals are listed next to each question.
Synthetic Control represents a powerful tool for the evaluation of economic policies. In The Mixtape, Scott Cunningham writes:
The estimator has been so influential that Athey and Imbens (2017a) said it was “arguably the most important innovation in the policy evaluation literature in the last 15 years” (p.3).
This homework is based on the paper “Voting after the Bombings: A Natural Experiment on the Effect of Terrorist Attacks on Democratic Elections” by José Montalvo, published in The Review of Economics and Statistics, which analyzes the impact of the 2004 terrorist attack in Madrid, Spain on voting outcomes. The key issue in this analysis, as in all causal inference examples covered in class, is the comparison of the actual outcomes in Spain after the attack to the unobservable counterfactual outcome that would have occurred in Spain in the absence of the attack. In this homework, you will replicate the results of Montalvo (2011).
Before beginning, be sure to carefully read through the paper.
Question 1
- Summarize the main research question and how the author approached answering it. In your answer, be sure to identify the outcome variable(s). (1 Point)
- Provide a discussion about how, if at all, the author addresses backdoors, colliders, etc. (Bonus: 1 Point)
Question 2
For the following questions, use the following data.
- Read the data into an object called vab. Below you will find a list of the dataset’s variables copy-and-pasted from the author’s replication package. (0 Points)
- id: number corresponding to each province (id=53 is the aggregate at the national level)
- pp: proportion of vote for the conservative party
- psoe: proportion of vote for the socialist party
- year
- dtreat=1 for treated group
- voters: potential voters (electoral census)
- part: participation rate
- Describe the structure of the data (e.g. the unit of observation, treatment and control groups, etc.). Is this what the “ideal” dataset would look like to answer the research question? (2 Points)
- Construct the author’s main outcome variable, and then use modelsummary to generate summary statistics for the provinces. (1 Point)
- Plot a time series that shows the value of the outcome variable over time for the treated and control groups. This should be a replication of Figure 2 from the paper. (1 Point)
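As a rough illustration of the construct-then-plot workflow, here is a minimal sketch on simulated data. The data frame toy, its values, the assignment of units to groups, and the outcome definition pp / (pp + psoe) are all assumptions made for this example; use the outcome defined in the paper and the real vab object in your answer. The datasummary_skim() call comes from modelsummary and is skipped if the package is not installed.

```r
set.seed(1)

# Hypothetical toy panel: 10 units observed over 5 election years.
years <- c(1989, 1993, 1996, 2000, 2004)
toy <- expand.grid(id = 1:10, year = years)
toy$dtreat <- as.integer(toy$id <= 5)        # assumed: first 5 units are treated
toy$pp     <- runif(nrow(toy), 0.30, 0.50)   # made-up vote shares
toy$psoe   <- runif(nrow(toy), 0.30, 0.50)

# Placeholder outcome (an assumption, not the paper's definition).
toy$y <- toy$pp / (toy$pp + toy$psoe)

# Summary statistics; falls back to base R if modelsummary is missing.
if (requireNamespace("modelsummary", quietly = TRUE)) {
  print(modelsummary::datasummary_skim(toy[, c("pp", "psoe", "y")],
                                       output = "markdown"))
} else {
  print(summary(toy[, c("pp", "psoe", "y")]))
}

# Mean outcome by year and treatment group.
means <- aggregate(y ~ year + dtreat, data = toy, FUN = mean)
ctrl  <- means[means$dtreat == 0, ]
trt   <- means[means$dtreat == 1, ]

# One line per group, with a legend and axis labels.
plot(trt$year, trt$y, type = "b", pch = 16, ylim = range(means$y),
     xlab = "Election year", ylab = "Outcome (placeholder)")
lines(ctrl$year, ctrl$y, type = "b", pch = 1, lty = 2)
legend("topright", legend = c("Treated", "Control"),
       pch = c(16, 1), lty = c(1, 2))
```

Because the toy values are random draws, the two lines will not resemble Figure 2; only the steps carry over.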
Question 3
For this question, turn to the difference-in-differences analysis in Part III of the paper.
- What are the assumptions the author is making (explicitly and implicitly) in estimating the DiD, and how are they addressing them? (1 Point)
- Estimate Equation 1 in the paper. The numbers you get should be very close, but perhaps not exact.[1] (1 Point)
- Estimate and plot an event study version of the above estimates. Use heteroskedasticity-robust standard errors. (1 Point)
- Interpret the results of the DiD and the event study, and provide a short discussion of the strengths and weaknesses of this section of the paper. (1 Point)
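The mechanics of a DiD and an event study with robust standard errors can be sketched in base R as follows. This is not Equation 1 from the paper: the simulated panel toy, the effect size of -0.05, the choice of 2000 as the reference year, and the helper robust_se() are all assumptions for illustration. The helper computes HC1 standard errors by hand; packages such as sandwich or fixest provide equivalents.

```r
set.seed(2)

# Hypothetical panel: 20 units, 5 election years, treatment hits in 2004.
years <- c(1989, 1993, 1996, 2000, 2004)
toy <- expand.grid(id = 1:20, year = years)
toy$dtreat <- as.integer(toy$id <= 10)
toy$post   <- as.integer(toy$year == 2004)
unit_fe    <- rnorm(20, 0, 0.02)
# Simulated outcome with a true effect of -0.05 (an arbitrary choice).
toy$y <- 0.45 + unit_fe[toy$id] - 0.05 * toy$dtreat * toy$post +
  rnorm(nrow(toy), 0, 0.005)

# HC1 heteroskedasticity-robust standard errors, computed by hand.
robust_se <- function(fit) {
  X <- model.matrix(fit)
  e <- resid(fit)
  bread <- solve(crossprod(X))
  meat  <- crossprod(X * e)
  V <- bread %*% meat %*% bread * nrow(X) / (nrow(X) - ncol(X))
  setNames(sqrt(diag(V)), colnames(X))
}

# Two-by-two DiD: the coefficient on dtreat:post is the estimate.
did     <- lm(y ~ dtreat * post, data = toy)
did_est <- coef(did)["dtreat:post"]
did_se  <- robust_se(did)["dtreat:post"]

# Event study: treated-by-year dummies, omitting the reference year.
ref      <- 2000
ev_years <- setdiff(years, ref)
ev_names <- paste0("ev_", ev_years)
for (yr in ev_years) {
  toy[[paste0("ev_", yr)]] <- toy$dtreat * (toy$year == yr)
}
rhs <- paste(c(ev_names, "factor(id)", "factor(year)"), collapse = " + ")
es  <- lm(as.formula(paste("y ~", rhs)), data = toy)

# Collect estimates; the reference year is zero by construction.
ord <- order(c(ev_years, ref))
x   <- c(ev_years, ref)[ord]
est <- c(coef(es)[ev_names], 0)[ord]
se  <- c(robust_se(es)[ev_names], 0)[ord]

# Point estimates with 95% confidence intervals.
plot(x, est, pch = 16, ylim = range(est - 2 * se, est + 2 * se),
     xlab = "Election year", ylab = "Estimate relative to reference year")
segments(x, est - 1.96 * se, x, est + 1.96 * se)
abline(h = 0, lty = 2)
```

In this simulation the pre-period coefficients should sit near zero and the 2004 coefficient near -0.05, which is the pattern a credible parallel-trends story would produce.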
Question 4
For this question, turn to the synthetic control analysis in Part IV of the paper.
- Prepare the data and estimate the synthetic control.[2] (1 Point)
- Create a table that compares the “real” treated unit to the synthetic unit and to an equally weighted mean of the units in the donor pool. (1 Point)
- Create a table showing the units in the donor pool that were given the most weight. Provide some thoughts on this table. (1 Point)
- Plot the outcomes for the synthetic and the real unit. Be sure to include relevant information (e.g. a vertical line showing treatment, a legend, labels for axes, etc.). (1 Point)
- What makes this application different from the one about West Germany in the notes, and what do you think of this pivot? (2 Points)
- Run (and plot) placebo tests for the synthetic control displayed above. (1 Point)
- Estimate a p-value for each time period.[3] Interpret the output. (1 Point)
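To make the logic of donor weights, placebo runs, and per-period p-values concrete, here is a bare-bones sketch in base R. It is a simplified stand-in, not the Synth package's algorithm: it matches on pre-period outcomes only and finds weights with a generic optimiser, whereas the argument names in the footnote hint belong to Synth's dataprep(). The simulated matrix Y, the effect size, and the helpers sc_weights() and sc_gap() are all assumptions for illustration.

```r
set.seed(3)

# Hypothetical panel: rows are periods, columns are units; unit 1 is treated.
n_units <- 12; n_pre <- 8; n_post <- 3
n_t  <- n_pre + n_post
f    <- cumsum(rnorm(n_t, 0, 0.02))        # common factor
lam  <- runif(n_units, 0.5, 1.5)           # unit-specific loadings
Y    <- 0.4 + outer(f, lam) + matrix(rnorm(n_t * n_units, 0, 0.005), n_t)
post <- (n_pre + 1):n_t
Y[post, 1] <- Y[post, 1] - 0.05            # arbitrary true effect
colnames(Y) <- paste0("unit", 1:n_units)

# Non-negative weights summing to one (via softmax) that minimise
# the pre-period squared error between the unit and its donors.
sc_weights <- function(y1_pre, Y0_pre) {
  softmax <- function(a) { z <- exp(a - max(a)); z / sum(z) }
  loss <- function(a) sum((y1_pre - Y0_pre %*% softmax(a))^2)
  fit  <- optim(rep(0, ncol(Y0_pre)), loss, method = "BFGS")
  setNames(softmax(fit$par), colnames(Y0_pre))
}

# Gap (actual minus synthetic) for unit j; `drop` removes units from the pool.
sc_gap <- function(j, drop = integer(0)) {
  donors <- setdiff(seq_len(ncol(Y)), c(j, drop))
  w <- sc_weights(Y[1:n_pre, j], Y[1:n_pre, donors, drop = FALSE])
  as.numeric(Y[, j] - Y[, donors, drop = FALSE] %*% w)
}

# Treated unit: weights, the heaviest donors, and the gap series.
w_treated   <- sc_weights(Y[1:n_pre, 1], Y[1:n_pre, -1])
top_donors  <- head(sort(w_treated, decreasing = TRUE), 3)
gap_treated <- sc_gap(1)

# Placebos: pretend each donor is treated, excluding the real treated unit.
gap_placebo <- sapply(2:n_units, function(j) sc_gap(j, drop = 1))

# Per-period p-value: share of units (treated included) whose absolute
# gap is at least as large as the treated unit's absolute gap.
all_gaps <- cbind(gap_treated, gap_placebo)
p_vals <- sapply(post, function(s) {
  mean(abs(all_gaps[s, ]) >= abs(gap_treated[s]))
})

# Placebo gaps in grey, treated gap in black, treatment timing dashed.
matplot(gap_placebo, type = "l", lty = 1, col = "grey",
        ylim = range(all_gaps), xlab = "Period", ylab = "Actual - synthetic")
lines(gap_treated, lwd = 2)
abline(v = n_pre + 0.5, lty = 2)
```

With 12 units the smallest attainable p-value is 1/12, so the size of the donor pool caps how much significance this kind of permutation test can show.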
Question 5
- Place both the event study and synthetic control on the same set of axes, and compare them in words. (1 Point)
- Which analysis do you prefer, and why?[4] Be sure to consider the assumptions required, the statistical precision, effect magnitude, etc. (2 Points)
Footnotes
[1] Do not worry about tabulating the results; just print the summary().
[2] For a hint, create/add a character version of the ID variable to the dataset, use the entire pre-period for both time.predictors.prior and time.optimize.ssr, and do not use special.predictors. Use the outcome, pp, and psoe as the predictors.
[3] Do not worry about trimming out poorly fitting placebos, though this is something you would want to address in theory.
[4] If you do not care for either (or think they’re both good), that is okay, just explain why.