Imagination Exercise
All materials can be found at alexcardazzi.github.io.
Completion Requirements: Complete the following questions in RStudio via the imagination template. When you are ready, submit your rendered HTML file to Canvas. Be sure to prepare your pitch for an in-class discussion.
Grading Criteria: Full credit will be given to well-formatted and detailed answers. Partial credit will be given if I can follow your work and/or see your thought process via code, comments, and text. Point totals are listed next to each question.
Assignment Summary
Malcolm Gladwell, the host of Revisionist History, spends six episodes asking (social) scientists about what their “magic wand experiment” would be.
What if you could design any experiment you wanted? Without worrying about money, ethics, logistics, or even the laws of nature? Revisionist History kicks off the season by giving some of the world’s smartest scientists a magic wand to create the experiment of their dreams.
These scientists come up with a bunch of different studies and research designs that would allow them to differentiate between correlation and causality. In this assignment, students will have to imagine up their own magic wand experiments with one condition: each experiment or study must use the particular causal inference strategy being studied. Students will have to write up their research designs and pitch them to the class.
Your imagination exercise should address the following questions. Write your answers in a Quarto document and submit a properly rendered HTML file on Canvas. Your submission should contain the exact same headings shown below along with your text addressing all questions.
Note: your first imagination exercise will be a bit different than what is described in the previous paragraph. Students will be asked to present their research topic ideas from Project Checkpoint 2. Each student will get feedback and constructive criticism from the class on their ideas. This presentation can be thought of as a research project version of “Shark Tank”. Students need only to submit their presentation to Canvas for this first imagination exercise.
Research Setting
Introduce and describe the necessary background information on your research setting. What happened? Who or what was affected, and by what? What is the time frame? What is the research question you think this setting might help you answer? The setting studied should be either historic (i.e. actually happened) or at least a realistic hypothetical. (4 Points)
Setting Example
On April 15th, 1947, Jackie Robinson, an African American baseball player, played his first game with the Brooklyn Dodgers. Robinson became the first African American to play in Major League Baseball (“MLB”), acting as an important precursor to the civil rights movement in the ’50s and ’60s. Slowly, over the next decade or so, MLB teams began to integrate. At this point in history, segregation was very much alive and Robinson became a divisive figure.
The research question I am interested in exploring is: how did the timing of the local MLB team’s decision to integrate affect segregation? For example, the Brooklyn Dodgers integrated in 1947, but the Boston Braves (now the Atlanta Braves) did not integrate until 1950, the Chicago White Sox until 1951, and the Pittsburgh Pirates until 1954. Did Brooklyn see reductions in segregation, relative to Pittsburgh, Boston, and other cities with still-segregated teams, following 1947?
Application Description
Discuss how the particular method we are studying will help you answer the research question. How would the method work in the context of this setting? Discuss the assumptions needed for this method to deliver credibly causal estimates, and critically evaluate how likely the assumptions are to hold in this setting. Are there any potential backdoors or colliders you need to be wary of? (8 Points)
Data Description
Describe the ideal data set for identifying the causal effect of interest described in the application above. You don’t need to constrain yourself to known, existing variables and data sets, although identifying actual data is OK. You may assume you have unlimited ability/resources to measure variables you want. Be sure to comment on the structure of the data. In other words, identify the unit of observation, the dependent variable, and the explanatory variables. (3 Points)
Data Description Example
The ideal dataset would contain city-by-month level data with measures of segregation (dep. variable), demographic information, and an indicator for whether the local MLB team had integrated. Measuring segregation could be difficult, so I am leaning on the “unlimited ability to measure variables” part of the assignment to deal with that issue. Demographic information might include percent of the population that is white, the percent of the population that identify as anti-segregationists, percent of the population that are baseball fans, and median age of the population. City and time fixed effects would also be used in this analysis.
Empirical Model
Write down a regression model (or series of models) that corresponds to your setting and question, and identify the causal parameter of interest.¹ If the equation is too complex, you may articulate it in words, although be warned that this can sometimes be trickier. (3 Points)
Empirical Model Example
Segregation \(S\) in city \(c\) and year-month \(t\) is modeled as:
\[S_{ct} = \delta I_{ct} + X_{ct}\beta + \alpha_c + \tau_t + \epsilon_{ct}\] where \(I_{ct}\) represents an indicator for whether the local team had integrated, \(X_{ct}\) represents a vector of control variables, and \(\alpha_c\) and \(\tau_t\) represent city and time fixed effects.
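To make the estimating equation concrete, the following is a minimal sketch in Python (the same steps carry over to R) that simulates a small balanced city-by-year panel and recovers \(\delta\) by removing city and time means from both the outcome and the integration indicator. Almost everything in it is an assumption for illustration: the function names, the yearly rather than year-month periods, the simulated effect size and fixed effects, and the omission of the controls \(X_{ct}\). Only the three integration years come from the setting example.

```python
import random

# Integration years come from the setting example; every other number is simulated.
INTEGRATION_YEAR = {"Brooklyn": 1947, "Chicago": 1951, "Pittsburgh": 1954}
YEARS = list(range(1945, 1959))


def simulate_panel(delta, noise_sd=0.1, seed=0):
    """Build a balanced city-by-year panel: S = delta*I + alpha_c + tau_t + noise."""
    rng = random.Random(seed)
    alpha = {city: rng.uniform(40, 60) for city in INTEGRATION_YEAR}  # city effects
    tau = {year: rng.uniform(-3, 3) for year in YEARS}                # time effects
    treated, outcome = {}, {}
    for city, first_year in INTEGRATION_YEAR.items():
        for year in YEARS:
            key = (city, year)
            treated[key] = 1.0 if year >= first_year else 0.0
            outcome[key] = (delta * treated[key] + alpha[city] + tau[year]
                            + rng.gauss(0, noise_sd))
    return treated, outcome


def double_demean(values):
    """Subtract city and year means, add back the grand mean (balanced panel only)."""
    cities = sorted({c for c, _ in values})
    years = sorted({t for _, t in values})
    grand = sum(values.values()) / len(values)
    city_mean = {c: sum(values[c, t] for t in years) / len(years) for c in cities}
    year_mean = {t: sum(values[c, t] for c in cities) / len(cities) for t in years}
    return {(c, t): values[c, t] - city_mean[c] - year_mean[t] + grand
            for c in cities for t in years}


def twfe_delta(treated, outcome):
    """OLS slope of the demeaned outcome on the demeaned indicator."""
    i_dd = double_demean(treated)
    s_dd = double_demean(outcome)
    numerator = sum(i_dd[k] * s_dd[k] for k in i_dd)
    denominator = sum(v * v for v in i_dd.values())
    return numerator / denominator


if __name__ == "__main__":
    treated, outcome = simulate_panel(delta=-2.0)
    print(f"true delta = -2.0, estimate = {twfe_delta(treated, outcome):.3f}")
```

With `noise_sd=0.0` the estimate matches the simulated \(\delta\) up to floating-point error, because double demeaning removes additive city and time effects in a balanced panel. The sketch assumes a single constant effect; with staggered timing and effects that differ across cities or over time, this simple estimator can be misleading.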
Another specification of this model could be:
\[\text{Alternate Specification Here}\] where the indicator is interacted with the percent of the population that identify as baseball fans. I anticipate the effects of integration to be stronger in places where more people are baseball fans since they would be paying more attention to when their team integrates. Also, I omit the control variable that tells the percent of the population that identifies as anti-segregationists. Controlling for this variable might in turn shut down a path by which MLB integration affects segregation throughout the rest of the city.
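As a sketch only, one way an interacted specification of this kind might be written is given here. The symbols \(F_{ct}\) (the share of the population who are baseball fans), \(\tilde{X}_{ct}\) (the controls excluding the fan share and the anti-segregationist share), and the coefficients \(\delta_1\), \(\delta_2\), and \(\gamma\) are assumptions introduced for illustration and do not appear in the original model:
\[S_{ct} = \delta_1 I_{ct} + \delta_2 (I_{ct} \times F_{ct}) + \gamma F_{ct} + \tilde{X}_{ct}\beta + \alpha_c + \tau_t + \epsilon_{ct}\]
Under this form, the effect of integration in a city with fan share \(F\) would be \(\delta_1 + \delta_2 F\), so expecting stronger effects where more people are fans corresponds to \(\delta_2\) having the same sign as \(\delta_1\).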
More here…
(Expected) Findings
What do you expect to find? Who does the answer to this question help, and how/why? (2 Points)
(Expected) Findings Example
I expect MLB team integration will alleviate segregation as people become more accustomed to the idea by watching their team integrate. More here…
or
I do not expect MLB team integration to affect segregation in the rest of the city. This is because team owners are themselves citizens of their MLB team’s city, and thus reflect (albeit in a potentially biased way) city-wide opinions on segregation. Therefore, teams that integrate first are likely already in more liberal cities with individuals more likely to embrace ending segregation. From this point of view, we would need to be very careful about omitted variable bias! More here…
or
I believe MLB team integration will intensify segregation, as conservative fans feel their power slipping in sports so they tighten their grasp elsewhere in society. More here…
In Class Presentation
Give a short presentation of your imagination exercise to the class. Be sure to be engaged in the exercises presented by your classmates. (5 Points)
Next, prepare a presentation of a published (or working) paper using the causal inference method we are studying. A presentation template can be found here. To assist in finding papers, navigate here (summary) and/or here. Students might find it advantageous to present papers similar to their research topics. Finally, all papers need to be approved via email. Papers published in the following journals are generally unlikely to be approved for presentation (though there are exceptions on a case-by-case basis): American Economic Review, Quarterly Journal of Economics, Review of Economics and Statistics, Econometrica, Review of Economic Studies, Journal of Economic Perspectives, Journal of Economic Literature. (15 Points)
Footnotes
1. You may exclude any control variables or fixed effects that are not essential to the causal inference strategy.