Homework 2
All materials can be found at alexcardazzi.github.io.
Completion Requirements: Complete the following questions in RStudio via the homework template. When you are ready, submit your rendered HTML to Canvas.
Grading Criteria: Full credit will be given to correct, well-formatted, and detailed answers. Partial credit will be given if I can follow your work and/or see your thought process via code, comments, and text. Point totals are listed next to each question.
Besides New Hampshire and Virginia¹, car insurance is mandated in every state. The point of this policy is to protect citizens from the often large financial burden resulting from a car crash. However, forcing individuals to purchase insurance may generate negative outcomes through unintended consequences. The often-cited reason for this is moral hazard. In other words, if individuals are shielded from the negative consequences of dangerous behavior, they may be more likely to engage in such behavior. Think of this from the opposite point of view: if you knew your vehicle would blow up if you got into a fender bender, you’d probably drive much more safely (or not at all) than you do now.
In this homework, we are going to explore the question: how does auto insurance affect traffic safety?
There are many ways to think through this problem, but let’s start with the following factors that might be related to auto insurance and traffic safety:
- Auto Insurance (AI): whether an individual has auto insurance (treatment).
- Risky Driving Behavior (RDB): an individual’s behavior, such as reckless driving, texting and driving, etc.
- Vehicle Safety Features (VSF): the number of advanced vehicle safety features.
- Traffic Enforcement (TE): the degree to which traffic law enforcement pays attention to the routes the individual travels.
- Age (A): the individual’s age.
- Crash Risk (CR): the probability of being involved in an accident (outcome).
Now, consider the following relationships between the variables:
- We will test both the direct and total effect of auto insurance on crash risk.
- Auto insurance, through moral hazard, affects the risky driving behavior of individuals.
- Risky driving behavior will affect the probability of a crash.
- Driver age will affect crash risk due to age alone (i.e., inexperience) and due to different levels of risky driving behavior.
- Traffic enforcement will influence risky driving behavior.
- Traffic enforcement will also influence crash risk directly, by pulling over other reckless drivers.
- Vehicle safety features (due to moral hazard) will influence risky driving behavior.
Question 1
Using dagitty, construct and plot a DAG to depict the relationships between these variables. Be sure to indicate which variable is the outcome and which is the treatment. (6 Points)
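As a reminder of the syntax, here is a minimal sketch of building and plotting a DAG with dagitty. The variables X, Y, and Z are a toy example and are not the homework's variables; the coordinates are arbitrary and only there to make the plot layout reproducible.

```r
library(dagitty)

# Toy DAG (hypothetical variables, NOT the homework DAG):
# X is the treatment, Y is the outcome, Z is a common cause of both.
toy_dag <- dagitty("dag {
  X [exposure]
  Y [outcome]
  Z -> X
  Z -> Y
  X -> Y
}")

# Assign each node a position so the plot looks the same every time.
coordinates(toy_dag) <- list(
  x = c(X = 0, Z = 1, Y = 2),
  y = c(X = 0, Z = -1, Y = 0)
)

plot(toy_dag)

# Check which nodes are flagged as treatment and outcome.
exposures(toy_dag)
outcomes(toy_dag)
```

Marking nodes with [exposure] and [outcome] inside the DAG definition lets later dagitty functions pick up the treatment and outcome without restating them.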
Question 2
Simulated data for this setting can be found here.
- Read the data into an object called safety in R. (2 Points)
- Create summary statistics using functions from the modelsummary package. (2 Points)
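The general pattern might look like the sketch below. It uses the built-in mtcars data written to a temporary CSV as a stand-in, so the file path and the object name example_data are placeholders rather than the homework's data or the safety object.

```r
library(modelsummary)

# Stand-in file: write a built-in dataset to a temporary CSV so this
# example is self-contained. Replace `tmp` with the real data location.
tmp <- tempfile(fileext = ".csv")
write.csv(mtcars, tmp, row.names = FALSE)

# Read the CSV into a data frame.
example_data <- read.csv(tmp)

# Quick overview of every numeric column.
datasummary_skim(example_data)

# A custom table: chosen variables as rows, chosen statistics as columns.
datasummary(mpg + hp ~ Mean + SD + Min + Max, data = example_data)
```

datasummary_skim() is convenient for a first look, while datasummary() gives control over exactly which variables and statistics appear.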
Question 3
Use functions from dagitty to evaluate and explore your DAG. This will help inform your empirical specification and identification strategy. You may use code, words, or both to answer these questions.
- List each of the paths in the DAG and tell which are “open”. (2 Points)
- For each path, discuss why it is open or closed. Be sure to use terms like “confounder”, “collider”, “mediator”, etc. (2 Points)
  - AI -> CR:
  - AI -> RDB -> CR:
  - AI -> RDB <- A -> CR:
  - AI -> RDB <- TE -> CR:
- What does it mean, and why is it significant, for a path to be open/closed? (2 Points)
- Which factors do you need to control for to estimate the total causal effect of auto insurance on crash risk? (2 Points)
- Which factors do you need to control for to estimate the direct causal effect of auto insurance on crash risk? You may use dagitty to help you, but you must provide explanations for dagitty’s output. (2 Points)
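The dagitty functions that are likely to be useful here are paths() and adjustmentSets(). The sketch below applies them to a toy DAG with a confounder Z and a mediator M; these variables are hypothetical and the DAG is not the homework DAG.

```r
library(dagitty)

# Toy DAG (hypothetical variables, NOT the homework DAG):
# Z confounds X and Y, and M mediates part of the effect of X on Y.
toy_dag <- dagitty("dag {
  X [exposure]
  Y [outcome]
  X -> M
  M -> Y
  X -> Y
  Z -> X
  Z -> Y
}")

# Every path between X and Y, with a flag for whether it is open
# when nothing is conditioned on.
p_none <- paths(toy_dag, from = "X", to = "Y")
data.frame(path = p_none$paths, open = p_none$open)

# The same paths after conditioning on the confounder Z.
p_z <- paths(toy_dag, from = "X", to = "Y", Z = "Z")
data.frame(path = p_z$paths, open = p_z$open)

# Minimal adjustment sets for the total and the direct effect.
total_sets  <- adjustmentSets(toy_dag, exposure = "X", outcome = "Y",
                              effect = "total")
direct_sets <- adjustmentSets(toy_dag, exposure = "X", outcome = "Y",
                              effect = "direct")
total_sets
direct_sets
```

Comparing the two paths() calls shows how conditioning on a variable changes which paths are open, and the difference between the two adjustmentSets() calls shows why the total and direct effects call for different controls.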
Question 4
Using safety, estimate the (direct and total) causal effects of auto insurance on crash risk.
- Estimate the models below. (2 Points)
- Display the coefficient estimates in a table generated via functions in modelsummary. (2 Points)
- Interpret each coefficient, besides the intercept/constant, in each model. (2 Points)
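For the mechanics of estimating several models and putting them in one table, a minimal sketch on simulated toy data might look like this. The variables x, m, z, and y, and the coefficients used to simulate them, are assumptions made up for illustration; they are not the homework's models or data.

```r
library(modelsummary)

# Simulated toy data (hypothetical, NOT the homework data):
# z is a confounder, x a binary treatment, m a mediator, y the outcome.
set.seed(1)
n <- 5000
z <- rnorm(n)
x <- rbinom(n, 1, plogis(z))
m <- 0.5 * x + rnorm(n)
y <- 1 * x + 2 * m + 1.5 * z + rnorm(n)
toy_data <- data.frame(x, m, z, y)

# Total effect: control for the confounder only, leave the mediator out.
# Direct effect: control for the confounder and the mediator.
toy_models <- list(
  "Total"  = lm(y ~ x + z, data = toy_data),
  "Direct" = lm(y ~ x + m + z, data = toy_data)
)

# One column per model; show only N and R-squared below the coefficients.
modelsummary(toy_models, gof_map = c("nobs", "r.squared"))
```

In this simulation the direct effect of x is 1 and the total effect is 1 + 2 * 0.5 = 2, so the coefficient on x should land near those values in the respective columns.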
Question 5
Consider answering this question with real-world data rather than simulated data. What difficulties might you have in estimating these models in reality? (2 Points)
Question 6
Think of two additional variables and/or edges that may belong in this DAG. Make an argument for why they belong in the DAG. (2 Points)
Footnotes
1. Virginia allows individuals to opt out of insurance, but they then must pay a $500 uninsured fee.