The theory behind this effort is that legal drinking at age 18 discourages binge drinking and promotes a culture of mature alcohol consumption.
The age threshold splits the population into those who can legally drink and those who can’t.
Regression Discontinuity Design (RDD)
The name “RDD” comes from a jump, a discontinuity, that occurs in a continuous variable.
In its simplest form, the design has:
An assignment variable (e.g., age),
Two groups (above and below the cutoff),
An outcome variable.
You may include nonlinearities and control variables.
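One common way to write this design as a regression is sketched below, assuming a linear trend on each side of the cutoff. The symbols are introduced here for illustration and are not from the slides: $Y_i$ is the outcome, $X_i$ the assignment variable, $c$ the cutoff, and $D_i$ an indicator for being at or above the cutoff.

```latex
Y_i = \alpha + \beta (X_i - c) + \tau D_i + \gamma (X_i - c) D_i + \varepsilon_i,
\qquad D_i = \mathbb{1}[X_i \ge c]
```

In this sketch, $\tau$ is the jump at the cutoff and $\gamma$ lets the slope differ between the two groups. Nonlinearities would enter as polynomial terms in $(X_i - c)$, and control variables as additional regressors.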
The main assumption that allows using RDD as a causal method is that, next to the cut, the participants are similar. The only difference is which of the two “sides” of the cut an individual is on.
The cutoff value occurs at 50.
What are the differences between someone who scores 49.99 and someone who scores 50.01 on the X variable?
The intuition is that these individuals are similar and comparable.
In the absence of treatment, the assumption is that the solid line would “continue” with the same slope and level.
There is a discontinuity, however. This implies that the counterfactual, the outcome in the absence of treatment, is the dashed line.
The discontinuity is the causal effect of X (at the cutoff) on Y.
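The intuition above can be illustrated with simulated data. This is a minimal sketch, not the example from the slides: the sample size, the true jump of 10, the noise level, and the bandwidth of 10 are all assumptions chosen for illustration.

```r
# Simulated data only: every number below is an assumption for illustration.
set.seed(123)
n         <- 2000
cutoff    <- 50
true_jump <- 10

x       <- runif(n, min = 0, max = 100)   # assignment variable
treated <- as.numeric(x >= cutoff)        # 1 at or above the cutoff
y       <- 20 + 0.3 * x + true_jump * treated + rnorm(n, sd = 2)

sim      <- data.frame(x = x, y = y, treated = treated)
sim$xhat <- sim$x - cutoff                # centre the assignment variable at the cutoff

# Keep only observations close to the cutoff (hypothetical bandwidth of 10)
near <- subset(sim, abs(xhat) <= 10)

# Separate linear trends on each side; the coefficient on `treated` is the jump
fit      <- lm(y ~ xhat * treated, data = near)
est_jump <- unname(coef(fit)["treated"])
print(est_jump)
```

Because the simulated individuals just below and just above 50 differ only in treatment status, the estimated jump should land close to the value used to generate the data.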
Unlike the matching and regression strategies based on treatment-control comparisons conditional on covariates, the validity of RDD is based on our willingness to extrapolate across values of the running variable, at least for values in the neighborhood of the cutoff at which treatment switches on. (MM)
library(readxl)
library(ggplot2)

data <- read_excel("files/RDD.xlsx")
data$treated <- 0
data$treated[data$x >= 101] <- 1

cut <- 100
band <- 50
xlow <- cut - band
xhigh <- cut + band
data <- subset(data, x > xlow & x <= xhigh, select = c(x, y, treated))

# Generating xhat - Now we are going to the RDD
data$xhat <- data$x - cut

# Generating xhat * treated to allow different inclinations
# (we will use the findings of the last graph, i.e. that each group has a different trend.)
data$xhat_treated <- data$xhat * data$treated

# RDD Assuming different trends
rdd <- lm(y ~ xhat + treated + xhat_treated, data = data)
summary(rdd)
Call:
lm(formula = y ~ xhat + treated + xhat_treated, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-12.9477  -3.2607   0.6875   3.2227  12.2004

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  55.83059    1.53681  36.329  < 2e-16 ***
xhat          0.29431    0.05405   5.445 3.97e-07 ***
treated      28.93921    2.20672  13.114  < 2e-16 ***
xhat_treated -0.51587    0.07644  -6.749 1.13e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.515 on 96 degrees of freedom
Multiple R-squared:  0.8942, Adjusted R-squared:  0.8909
F-statistic: 270.3 on 3 and 96 DF,  p-value: < 2.2e-16
The slope of x before the cut is 0.29 (t-stat 5.45). The interaction coefficient, -0.52 (t-stat -6.75), is the change in slope after the cut, so the slope after the cut is 0.29 - 0.52 = -0.22.
We also have the coefficient of the treatment, which measures the “jump” that occurs at the cut: an estimated coefficient of 28.9 (t-stat 13.11).
If this were a real example, this would be the causal effect of receiving the treatment (i.e., being beyond the cut).
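The arithmetic behind this reading of the table can be checked with a short sketch. The coefficients are copied from the regression output; the variable names are introduced here for illustration only.

```r
# Coefficients copied from the regression output in the text
b <- c(intercept = 55.83059, xhat = 0.29431,
       treated = 28.93921, xhat_treated = -0.51587)

# Slope of the fitted line on each side of the cut
slope_before <- b[["xhat"]]
slope_after  <- b[["xhat"]] + b[["xhat_treated"]]

# Fitted values at xhat = 0 for each group
y_control_at_cut <- b[["intercept"]]
y_treated_at_cut <- b[["intercept"]] + b[["treated"]]

result <- round(c(slope_before = slope_before,
                  slope_after  = slope_after,
                  jump         = y_treated_at_cut - y_control_at_cut), 2)
print(result)
```

The interaction term is a change in slope, so the slope after the cut is the sum of the two slope coefficients. The jump is the gap between the two fitted lines at xhat = 0.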