Local Projections Difference-in-Differences

Alexander Cardazzi

2024-05-16

Getting Started

Please see the working paper on which this package is based here (Dube et al., 2023).

To install and load this package, use the following code:

if(!"devtools" %in% installed.packages()) install.packages("devtools")
if(!"lpdid" %in% installed.packages()) devtools::install_github("alexCardazzi/lpdid")
library("lpdid")

A Simple 2x2 DiD Example

To begin, consider a simple 2x2 DiD where treatment begins at \(t = 31\). Only twenty out of fifty units get treated, and the treatment dynamics cause the outcome variable to increase linearly with time. Below, the outcome is averaged by time and whether the unit receives treatment at some point.

library("fixest")

set.seed(757)
I <- 50; T <- 50; N <- I * T
data.frame(id = rep(1:I, each = T),
           t = rep(1:T, I),
           e = rnorm(N, 0, 4)) -> df
unit_FE <- rnorm(I, 0, 4)
time_FE <- rnorm(T, 0, 4)
df$treated <- ifelse(df$id < 21, 1, 0)
df$time_til <- ifelse(df$treated == 1, df$t - 30, -1)
df$treat_status <- ifelse(df$treated == 1 & df$t == 30, 1, 0)
df$y <- df$e + unit_FE[df$id] + time_FE[df$t] + ifelse(df$id < 21 & df$t > 30, df$t - 30, 0)
aggregate(df$y, list(df$t, df$treated), mean) -> tmp
par(mar = c(4.1, 4.1, .1, 4.1))
plot(tmp$Group.1, tmp$x, col = tmp$Group.2 + 1, pch = 19,
     xlab = "Time", ylab = "Outcome")
legend("topleft", legend = c("Treated", "Control"),
       bty = "n", pch = 19, col = c("tomato", "black"))

In the most simple of cases, LP-DiD and TWFE preform very similarly. Of course, TWFE is not biased in this setting, so we observe nearly identical parameter estimates (and standard errors). The important arguments for lpdid() are the following:

lpdid(df, window = c(-20, 20), y = "y",
      unit_index = "id", time_index = "t",
      treat_status = "treat_status") -> reg
feols(y ~ i(time_til, treated, -1) | id + t, data = df) -> twfe

par(mar = c(4.1, 4.1, .1, 4.1))
plot_lpdid(reg, col = "tomato", cex = 1.3)
iplot(twfe, add = TRUE)
legend("topleft", legend = c("TWFE", "LP-DiD"),
       col = c("black", "tomato"), pch = c(20, 19), bty = "n")

Staggered (Exogenous) Treatment Timing

If you are reading this, it is likely that you are aware that TWFE estimates are potentially biased depending on treatment timing and dynamics. In Dube et al. (2023), the authors simulate two sets of data: one where treatment timing is exogenous and the other where it is endogenous. First, LP-DiD is applied to the data where treatment is exogenous.

lpdid has a built-in function to replicate (as closely as possible) the simulations in Dube et al. (2023).1 Using this, we estimate TWFE and LP-DiD models in addition to plotting the “true” average beta values.

df <- genr_sim_data(exogenous_timing = TRUE, 789)

twfe1 <- feols(y ~ i(rel_time, ever_treat, -1) | i + t, data = df)

lpdid1 <- lpdid(df, window = c(-5, 10), y = "y",
                unit_index = "i", time_index = "t",
                treat_status = "treat_status")

par(mar = c(4.1, 4.1, .1, 4.1))
plot_lpdid(lpdid1, col = "tomato", x.shift = 0.1)
iplot(twfe1, add = TRUE, x.shift = -0.1)
points(aggregate(df$beta[df$ever_treat == 1],
                 list(df$rel_time[df$ever_treat == 1]), mean))
legend("topleft", c("True Beta", "TWFE", "LP-DiD"), bty = "n",
       pch = c(1, 19, 19), col = c("black", "black", "tomato"))

Staggered (Endogenous) Treatment Timing

For endogenous treatment timing, the main idea is that treatment is more likely following a large negative shock. See page 28 of the paper for more details on this. However, given this, we can include a lag of the outcome variable into the model. Below, four event studies are plotted: TWFE and LP-DiD, both with and without a lag of the outcome.

df <- genr_sim_data(exogenous_timing = FALSE, 789)
# The returned "df" is a pdata.frame, so lag() works in a panel setting.
df$lag_y <- lag(df$y, 1)
twfe0 <- feols(y ~ i(rel_time, ever_treat, -1) | i + t, data = df)
twfe1 <- feols(y ~ lag_y + i(rel_time, ever_treat, -1) | i + t, data = df)
lpdid0 <- lpdid(df, window = c(-5, 10), y = "y", outcome_lags = 0,
                unit_index = "i", time_index = "t", treat_status = "treat_status")
lpdid1 <- lpdid(df, window = c(-5, 10), y = "y", outcome_lags = 1,
                unit_index = "i", time_index = "t", treat_status = "treat_status")
par(mar = c(4.1, 4.1, .1, 4.1))
plot_lpdid(lpdid0, col = "tomato", x.shift = 0.1)
plot_lpdid(lpdid1, col = "dodgerblue", x.shift = 0.1, add = TRUE)
iplot(twfe0, add = TRUE, x.shift = -0.1)
iplot(twfe1, add = TRUE, x.shift = -0.1, col = "gray")
points(aggregate(df$beta[df$ever_treat == 1], list(df$rel_time[df$ever_treat == 1]), mean))
legend("topleft", bty = "n",
       c("True Beta", "TWFE", "TWFE + Y_lag", "LP-DiD", "LP-DiD + Y_lag"),
       pch = c(1, 19, 19, 19, 19),
       col = c("black", "black", "gray", "tomato", "dodgerblue"))

Andrew Baker’s Simulated Dataset

Andrew Baker created a simulated data (found here) in order to demonstrate the potential pit falls of using TWFE in a setting with staggered treatment timing and temporal treatment dynamics. Below is an aggregated version of this data where average outcomes are calculated by time and treatment group.

baker <- haven::read_dta('https://github.com/scunning1975/mixtape/raw/master/baker.dta')
baker$treated <- ifelse(baker$treat_date > 0, 1, 0)
baker$treat_status <- ifelse(baker$treat_date == baker$year, 1, 0)

tmp <- aggregate(list(y = baker$y),
                 list(year = baker$year,
                      group = baker$group),
                 mean)
par(mar = c(4.1, 4.1, .1, 4.1))
plot(tmp$year, tmp$y, pch = 19,
     col = scales::alpha(tmp$group, .4),
     xlab = "Year", ylab = "Outcome")
lines(tmp$year[tmp$group == unique(tmp$group)[1]],
      tmp$y[tmp$group == unique(tmp$group)[1]],
      col = tmp$group[tmp$group == unique(tmp$group)[1]])
lines(tmp$year[tmp$group == unique(tmp$group)[2]],
      tmp$y[tmp$group == unique(tmp$group)[2]],
      col = tmp$group[tmp$group == unique(tmp$group)[2]])
lines(tmp$year[tmp$group == unique(tmp$group)[3]],
      tmp$y[tmp$group == unique(tmp$group)[3]],
      col = tmp$group[tmp$group == unique(tmp$group)[3]])
lines(tmp$year[tmp$group == unique(tmp$group)[4]],
      tmp$y[tmp$group == unique(tmp$group)[4]],
      col = tmp$group[tmp$group == unique(tmp$group)[4]])
abline(v = unique(baker$treat_date)-0.5, col = 1:4)

When estimating a simple TWFE model, the following is the estimate generated by OLS:

feols(y ~ treat | id + year, data = baker)
#> OLS estimation, Dep. Var.: y
#> Observations: 30,000 
#> Fixed-effects: id: 1,000,  year: 30
#> Standard-errors: Clustered (id) 
#>       Estimate Std. Error  t value  Pr(>|t|)    
#> treat -6.69019   0.670213 -9.98218 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 29.1     Adj. R2: 0.803436
#>              Within R2: 0.003954

Of course, the treatment effect is surely non-negative. Below, we estimate TWFE, Sun & Abraham, and LP-DiD models. Here, just to exemplify another feature of the package, one can re-weight the regression to obtain an ATT rather than a VWATT. Here, the results are nearly identical to the estimates via Sun & Abraham. Also, note that the window ((-20, 20)) is wider than what is plotted (-17, 17). This is a feature since the FEs and the treatment variable are perfectly colinear outside what is returned.

res_naive = feols(y ~ i(time_til, treated, ref = -1) | 
                    id + year,
                  baker, cluster = ~state)

res_cohort = feols(y ~ sunab(treat_date, year) | id + year, 
                   baker[baker$year<2004,],
                   cluster = ~state)

lpdid(df = baker, window = c(-20, 20), y = "y",
      unit_index = "id", time_index = "year",
      treat_status = "treat_status", reweight = TRUE) -> reg

par(mar = c(4.1, 4.1, .1, 4.1))
iplot(res_naive, col = "black", grid = F,
      xlab = "Time to Treatment", main = "")
iplot(res_cohort, col = "dodgerblue", add = TRUE)
plot_lpdid(reg, add = TRUE, col = "tomato", pch = 1, cex = 1.5)
legend("topright", col = c("black", "tomato", "dodgerblue"),
       pch = c(20, 20, 1), bty = "n",
       legend = c("TWFE", "Sun & Abraham", "LP-DiD"))

It is also possible to pool together the treatment effects and get standard errors using the lpdid function.

lpdid(df = baker, window = c(-20, 20), y = "y",
      unit_index = "id", time_index = "year",
      treat_status = "treat_status", reweight = TRUE, pooled = TRUE) -> reg
reg$coeftable
#>    Estimate Std. Error  t value Pr(>|t|)
#> 21 90.09827  0.6896807 130.6377        0

Non-Absorbing Treatment

A final feature of the package (that is still under construction), is handling non-absorbing treatment. The treatment example used in Dube et al. (2023) is changes in the minimum wage. It is impossible to find a “clean control” since all states have increased the minimum wage at one point or another. So, the authors (see also Cengiz et al., 2019) make an assumption that \(L\) time periods after treatment, the effect “stabilizes”. See the text for more details.

Below, we simulate data where there are three groups. First, there is an untreated group. Second, there is a group that gets treated only a single time. Finally, there is a group that is treated twice. The effect sizes are 15 and 35 for first and second treatments. The aggregated time series plots can be seen below.

set.seed(757)
rm(list = ls())
I <- 50; bigT <- 50; N <- I * bigT; e_var <- 4
data.frame(id = rep(1:I, each = bigT),
           t = rep(1:bigT, I),
           e = rnorm(N, 0, e_var)) -> df
unit_FE <- rnorm(I, 0, e_var)
time_FE <- rnorm(bigT, 0, e_var)
df$treated <- ifelse(df$id < 21, 1, 0)
df$treat_status <- ifelse(df$id < 21 & df$t == 16, 1, 0)
df$treat_status <- ifelse(df$id < 11 & df$t == 36, 1, df$treat_status)
ifelse(df$id < 11,
       ifelse(df$t > 35, 50,
              ifelse(df$t > 15, 15, 0)),
       ifelse(df$id < 21,
              ifelse(df$t > 15, 15, 0),
              0)) -> df$beta
df$y <- df$e + unit_FE[df$id] + time_FE[df$t] + df$beta
df$group <- ifelse(df$id < 11, 1, ifelse(df$id < 21, 2, 3))

agg <- aggregate(df$y, list(df$t, df$group), mean)
par(mar = c(4.1, 4.1, .1, 4.1))
plot(agg$Group.1, agg$x, col = agg$Group.2 + 1, pch = 19,
     xlab = "Time", ylab = "Outcome")

Notice the difference in this function call versus the previous ones. First, there is no rel_time argument needed. This is because the function will generate rel_time by itself. However, you must include nonabsorbing = TRUE in addition to a nonabsorbing_lag and nonabsorbing_treat_status. nonabsorbing_lag represents the number of periods until the dynamic treatment effects stabilize. Then, nonabsorbing_treat_status takes on values of 1 each time a unit gets treated.

lpdid(df = df, window = c(-12, 12), y = "y",
      unit_index = "id", time_index = "t",
      nonabsorbing_lag = 5,
      treat_status = "treat_status") -> reg1
lpdid(df = df, window = c(-12, 12), y = "y",
      unit_index = "id", time_index = "t",
      nonabsorbing_lag = 5,
      treat_status = "treat_status",
      reweight = TRUE) -> reg2
par(mar = c(4.1, 4.1, .1, 4.1))
plot_lpdid(reg1, x.shift = -0.05, col = "dodgerblue")
plot_lpdid(reg2, x.shift = 0.05, col = "tomato", add = T)
abline(h = mean(c(15, 15, 35)), lty = 2)
legend("topleft", legend = c("VWATT", "EWATT"), bty = "n",
       col = c("dodgerblue", "tomato"), pch = 19)

Under Construction

This is the first iteration of this package. So, of course, there are some known (and surely many unknown) bugs. Some of them are as follows:

Please direct any questions or comments to Alex and all complaints to Zach.


  1. This function is under development. Currently, the parameters are fixed, so the only options are the exogeneity of the treatment timing and the random number seed. The intention is to allow for full control over the parameters so users can experiment.↩︎