Homework 4.2

Author: Alex Cardazzi

Affiliation: Old Dominion University

All materials can be found at alexcardazzi.github.io.

Question 1

In the early 2000s, two economists ran an experiment where they sent fictitious resumes in response to job ads in Chicago and Boston. The authors randomly varied the qualities of the fictitious resumes as well as the applicants’ names. Some resumes were randomly given stereotypically white-sounding names (Emily, Brendan) and others African-American-sounding names (Lakisha, Jamal).¹ Intrigued students may read a non-technical summary of the paper here.²

In this part of the homework, you are going to investigate whether employers engage in racial discrimination when sifting through resumes using data collected by these researchers (data; documentation).

Read the data into a data frame called resume. (2 Points)

Create the following binary variables (4 Points):
- A variable called chicago that is equal to one if city is equal to “chicago” and zero otherwise.
- A variable called female that is equal to one if gender is equal to “female” and zero otherwise.
- A variable called black that is equal to one if ethnicity is equal to “afam” and zero otherwise.
- A variable called callback equal to one if call is equal to “yes” and zero otherwise.
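As a minimal sketch of the pattern behind these four variables, the snippet below builds indicators on a small made-up data frame. The name `resume_toy` and its rows are hypothetical, and the values "male" and "cauc" are assumed stand-ins for the other categories; check the documentation for the actual labels.

```r
# Toy stand-in for the real data; these rows are invented for illustration.
resume_toy <- data.frame(
  city      = c("chicago", "boston", "chicago"),
  gender    = c("female", "male", "female"),   # "male" is an assumed label
  ethnicity = c("afam", "cauc", "cauc"),       # "cauc" is an assumed label
  call      = c("no", "yes", "no")
)

# A logical comparison converted to integer gives a 0/1 indicator.
resume_toy$chicago  <- as.integer(resume_toy$city == "chicago")
resume_toy$female   <- as.integer(resume_toy$gender == "female")
resume_toy$black    <- as.integer(resume_toy$ethnicity == "afam")
resume_toy$callback <- as.integer(resume_toy$call == "yes")

resume_toy
```

The same comparison-then-convert approach should carry over to the full data, provided the column values match those named above.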

Estimate and display the coefficients (using summary() is fine) of the following regression (2 Points):

\[\begin{aligned}\text{Callback}_i = \ &\beta_0 + \beta_1 \text{Jobs}_i + \beta_2 \text{Experience}_i + \beta_3 \text{Female}_i \\&+ \beta_4 \text{Chicago}_i + \beta_5 \text{Black}_i+ \epsilon_i\end{aligned}\]
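One way to fit an equation of this form is with `lm()`, which makes it a linear probability model because the outcome is binary. The sketch below uses simulated data, so its estimates mean nothing; the lowercase column names `jobs` and `experience` are assumptions about how the variables are stored.

```r
# Simulated data only; the numbers carry no information about the real study.
set.seed(1)
n <- 200
toy <- data.frame(
  jobs       = sample(1:6, n, replace = TRUE),
  experience = sample(1:20, n, replace = TRUE),
  female     = rbinom(n, 1, 0.5),
  chicago    = rbinom(n, 1, 0.5),
  black      = rbinom(n, 1, 0.5)
)
toy$callback <- rbinom(n, 1, 0.1)

# Each right-hand-side term corresponds to one beta in the equation;
# the intercept (beta_0) is included automatically.
fit <- lm(callback ~ jobs + experience + female + chicago + black, data = toy)
summary(fit)
```

With a 0/1 outcome, each slope can be read as a change in the probability of a callback associated with a one-unit change in that regressor, holding the others fixed.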

Interpret each coefficient in words. (4 Points)

Estimate and display the coefficients (using summary() is fine) of the following regression (2 Points):

Interpret the estimates for \(\beta_4\), \(\beta_5\), and \(\beta_6\) from the previous regression. (4 Points)

Re-estimate the regression twice: once using only the Chicago data and once using only the Boston data. Report the coefficients using modelsummary. Discuss any coefficients that result in different conclusions for the two cities. (4 Points)
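A rough sketch of the split-sample workflow follows, again on simulated data and with a deliberately shortened formula that is not the assigned specification. Note that `chicago` is left out of the formula here: within a single city it is constant, so it would be perfectly collinear with the intercept.

```r
# Simulated data; the formula is shortened for illustration.
set.seed(2)
n <- 300
toy <- data.frame(
  chicago = rbinom(n, 1, 0.5),
  female  = rbinom(n, 1, 0.5),
  black   = rbinom(n, 1, 0.5)
)
toy$callback <- rbinom(n, 1, 0.1)

# Fit the same formula on each city's subsample.
fit_chicago <- lm(callback ~ female + black, data = subset(toy, chicago == 1))
fit_boston  <- lm(callback ~ female + black, data = subset(toy, chicago == 0))

# A named list gives each model its own labelled column in the table.
models <- list("Chicago" = fit_chicago, "Boston" = fit_boston)
if (requireNamespace("modelsummary", quietly = TRUE)) {
  modelsummary::modelsummary(models)
}
```

Placing the two models side by side makes it easier to spot coefficients whose sign or significance differs between the cities.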

Question 2

For this question, you will explore a sample of crash records (data; documentation) reported by police across the country from 1997 to 2002. Each record in these data contains information about the individual and vehicle involved in the crash, as well as some information about the circumstances and outcomes of the crash.

As a first step, read the data into a data.frame called crash. Subset the data to include only drivers. (2 Points)

Review the data documentation, especially for the variable injSeverity. Remove observations where injSeverity is missing (NA), unknown, or prior death. Then, create a new variable called y that is equal to one if the individual sustained an incapacitating injury or worse, and zero otherwise. This variable indicates whether the crash caused a substantial injury. (2 Points)

There is another variable in the dataset called dvcat, which estimates impact speeds in km/h. Convert this to a factor variable, and re-level it such that the reference level is the slowest impact speed. (2 Points)

Re-define the seatbelt and airbag variables as binary indicators. (2 Points)

Estimate a basic regression where major injury is explained by the estimated impact speed, age of the occupant, and year of the vehicle. Display (using summary() is fine) and interpret the coefficients of the model. Note: you do not need to interpret the coefficients for impact speed. Rather, discuss the pattern of the coefficients for that variable. (4 Points)
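The data-preparation steps above can be sketched on a toy data frame. Everything about the values here is an assumption to be checked against the documentation: that injSeverity is numeric with 3 meaning incapacitating, 4 killed, 5 unknown, and 6 prior death; that the slowest dvcat level is labelled "1-9km/h"; and that the safety variables use the labels "belted" and "airbag".

```r
# Toy rows; all codes and labels are assumptions, not taken from the real file.
crash_toy <- data.frame(
  injSeverity = c(0, 3, NA, 5, 4, 6, 1),
  dvcat    = c("25-39", "1-9km/h", "10-24", "55+", "40-54", "10-24", "1-9km/h"),
  seatbelt = c("belted", "none", "belted", "belted", "none", "none", "belted"),
  airbag   = c("airbag", "none", "airbag", "none", "airbag", "none", "airbag")
)

# Drop missing severities and the assumed unknown (5) / prior death (6) codes.
keep <- !is.na(crash_toy$injSeverity) & !(crash_toy$injSeverity %in% c(5, 6))
crash_toy <- crash_toy[keep, ]

# Indicator for an incapacitating injury or worse (assumed codes 3 and 4).
crash_toy$y <- as.integer(crash_toy$injSeverity >= 3)

# Convert to a factor and set the reference level explicitly.
crash_toy$dvcat <- relevel(factor(crash_toy$dvcat), ref = "1-9km/h")

# Recode the safety variables as 0/1 indicators.
crash_toy$seatbelt <- as.integer(crash_toy$seatbelt == "belted")
crash_toy$airbag   <- as.integer(crash_toy$airbag == "airbag")

levels(crash_toy$dvcat)
```

Setting the reference level with `relevel()` means every dvcat coefficient in a later regression is measured relative to that level, which is what makes the pattern across speed categories easy to read.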

Re-estimate the model above, but include the variables for the vehicle’s safety features (seatbelt and airbag). What changes about the model? Why do you think you see these changes? (4 Points)

Add the variable deploy to the model, and output the coefficients. What does this variable measure? How does this variable change the interpretation of the model? (4 Points)

Finally, in addition to what is already in the model, incorporate an interaction between deploy and seatbelt. Again, how does the interpretation of the model change? (4 Points)
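To illustrate the mechanics of an interaction term only, the sketch below fits a stripped-down model on simulated 0/1 variables. The other regressors are omitted and the estimates are meaningless; the point is how R's formula syntax expands and how the coefficients combine.

```r
# Simulated indicators; this is not the assigned model.
set.seed(3)
n <- 500
toy <- data.frame(
  deploy   = rbinom(n, 1, 0.5),
  seatbelt = rbinom(n, 1, 0.7)
)
toy$y <- rbinom(n, 1, 0.2)

# deploy * seatbelt expands to deploy + seatbelt + deploy:seatbelt.
fit_int <- lm(y ~ deploy * seatbelt, data = toy)
b <- coef(fit_int)

# With an interaction, the deploy coefficient applies only when seatbelt = 0;
# for seatbelt = 1 the interaction coefficient is added on top.
effect_unbelted <- b[["deploy"]]
effect_belted   <- b[["deploy"]] + b[["deploy:seatbelt"]]
c(unbelted = effect_unbelted, belted = effect_belted)
```

Once the interaction is present, neither main-effect coefficient can be read on its own as an overall effect; each describes the case where the other indicator equals zero.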

What, if anything, surprised you about the results in the analysis above? (4 Points)

Footnotes

¹ The process of determining which names are stereotypically black/white is described in detail in the published draft.

² In addition, similar research on ban-the-box finds that these policies increase racial discrimination.