The Effect of 401k Eligibility on Financial Wealth

In this brief illustration, we use data from the 1991 wave of the Survey of Income and Program Participation (SIPP), originally analyzed by Poterba, Venti & Wise (1995), to study how the 401(k) retirement plan affects individuals’ net financial assets.

Data preparations

The main variables of interest are:

Variable Description
net_tfa Net total financial assets (outcome)
e401 401(k) eligibility
p401 401(k) participation
age Age of the head of the household
tw Total wealth
inc Household income
fsize Household size
db Defined benefit pension status indicator
marr Married indicator
twoearn Two-earner status indicator
pira IRA participation indicator
hown House ownership indicator

We load the data into R/Stata and prepare the data.

Code

To illustrate the implementation of DML in Stata, we use the ddml package. Note that you will also need to install pystacked, which implements a wide range of machine learners for Stata and uses Python’s sklearn internally. For pystacked to run, you will need to set up Stata’s Python integration, see instructions here.

clear all
set more off
set seed 20241111

use "https://dmlguide.github.io/assets/dta/PVW_data.dta", clear

global Y net_tfa
global X age tw inc fsize db marr twoearn pira hown
global D p401
global Z e401

To illustrate the implementation of DML in R, we use the ddml package. Another popular alternative is DoubleML, see here.

library(haven)
library(ddml)
library(ranger)
library(tidymodels)

dat <- read_dta("https://dmlguide.github.io/assets/dta/PVW_data.dta")

set.seed(20241111)

# Define control variables
control_names <- c("age", "tw", "inc", "fsize", "db", "marr",
                   "twoearn", "pira", "hown")

# Extract variables
y      <- dat$net_tfa
D      <- dat$e401     # eligibility
D_part <- dat$p401     # participation
Z      <- dat$e401     # instrument
X      <- as.matrix(dat[, control_names])

To illustrate the implementation of DML in Python, we use the DoubleML package, which integrates directly with scikit-learn learners.

import numpy as np
import pandas as pd
from doubleml import (DoubleMLData, DoubleMLPLR, DoubleMLIRM,
                      DoubleMLPLIV, DoubleMLIIVM)
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.base import clone

dat = pd.read_stata("https://dmlguide.github.io/assets/dta/PVW_data.dta")

np.random.seed(20241111)

# Define control variables
control_names = ["age", "tw", "inc", "fsize", "db", "marr",
                 "twoearn", "pira", "hown"]

Model specification

Before we can run the DML estimation, we need to select a nuisance function estimator. This could be linear regression or almost any supervised machine learner (satisfying a mild congerence rate requirement).1 While the choice of the machine learner is important and can affect the estimation results, we focus here for illustration on random forests with 1000 trees. For the outcome equation, we use a maximum tree depth of 8; for other nuisance functions we opt for a a maximum tree depth of 4.

Code

We will save the hyper-parameters in a global macro which we use below.

// Random forest with 1000 trees: depth 8 for the outcome equation,
// depth 4 for the remaining nuisance functions.
global rfoutcome n_estimators(1000) max_depth(8)
global rfother   n_estimators(1000) max_depth(4)

Learner settings

ddml::mdl_ranger() is a wrapper of ranger::ranger(). To learn which tuning parameter you can set, check the help file with ?ddml::mdl_ranger. A number of other wrapper commands are included in the ddml package and you can also write your own wrapper commands, see this guide.

# Random forest with 1000 trees: depth 8 for the outcome equation,
# depth 4 for the remaining nuisance functions.
learners <- list(
  list(fun = mdl_ranger,
       args = list(num.trees = 1000,
                   max.depth = 8))
)

learners_DX <- list(
  list(fun = mdl_ranger,
       args = list(num.trees = 1000,
                   max.depth = 4))
)

Learner settings

DoubleML accepts any scikit-learn estimator as a nuisance learner. We pass RandomForestRegressor for continuous targets and RandomForestClassifier for the binary treatment / instrument.

# Random forest with 1000 trees: depth 8 for the outcome equation,
# depth 4 for the remaining nuisance functions. We use regressors for
# continuous targets and classifiers for the binary treatment / instrument.
ml_outcome_reg = RandomForestRegressor(n_estimators=1000, max_depth=8,
                                       random_state=20241111, n_jobs=-1)
ml_other_reg   = RandomForestRegressor(n_estimators=1000, max_depth=4,
                                       random_state=20241111, n_jobs=-1)
ml_other_cls   = RandomForestClassifier(n_estimators=1000, max_depth=4,
                                        random_state=20241111, n_jobs=-1)

The effect of 401k elligibility

We replicate the DML estimate of the average treatment effects of 401k elligibility reported in Table 1, last column. Using the above random forest learners, we now estimate two different target paramters with DML: the partially linear regression coefficient and the average treatment effect.

Code
// PLR: partially linear regression coefficient
ddml init partial, kfolds(10)
ddml E[Y|X]: pystacked $Y $X , type(reg) methods(rf) cmdopt1($rfoutcome)
ddml E[D|X]: pystacked $Z $X , type(reg) methods(rf) cmdopt1($rfother)
ddml crossfit
eststo m_plr: ddml estimate, robust

// ATE: average treatment effect (interactive model)
ddml init interactive, kfolds(10)
ddml E[Y|X,D]: pystacked $Y $X , type(reg)   methods(rf) cmdopt1($rfoutcome)
ddml E[D|X]:   pystacked $Z $X , type(class) methods(rf) cmdopt1($rfother)
ddml crossfit
eststo m_ate: ddml estimate, robust trim(0.001)

esttab m_plr m_ate, mtitles("PLR" "ATE") ///
    b(%9.2f) se(%9.2f) star(* 0.05 ** 0.01 *** 0.001) ///
    stats(N, fmt(%9.0f) labels("Num. obs."))

We use ensemble_type = "average" to disable stacking.

# DML estimation of the partially linear regression (PLR) coefficient
plm_fit <- ddml_plm(y, D, X,
                    learners      = learners,
                    learners_DX   = learners_DX,
                    sample_folds  = 10,
                    ensemble_type = "average")

# DML estimation of the average treatment effect (ATE)
ate_fit <- ddml_ate(y, D, X,
                    learners      = learners,
                    learners_DX   = learners_DX,
                    sample_folds  = 10,
                    ensemble_type = "average",
                    trim          = 0.001)

# Tidy comparison of the two estimands using ddml::tidy()
elig_tbl <- bind_rows(
  PLR = tidy(plm_fit) |> filter(term == "D1"),
  ATE = tidy(ate_fit),
  .id = "model"
) |> as_tibble()
print(elig_tbl)

# Build the DoubleMLData object (treatment = e401)
dml_data_e = DoubleMLData(dat, y_col="net_tfa", d_cols="e401",
                          x_cols=control_names)

# PLR: partially linear regression coefficient
plr_fit = DoubleMLPLR(dml_data_e,
                      ml_l=clone(ml_outcome_reg),
                      ml_m=clone(ml_other_reg),
                      n_folds=10)
plr_fit.fit()

# ATE: average treatment effect (interactive regression model)
irm_fit = DoubleMLIRM(dml_data_e,
                      ml_g=clone(ml_outcome_reg),
                      ml_m=clone(ml_other_cls),
                      n_folds=10,
                      trimming_threshold=0.001)
irm_fit.fit()

# Side-by-side comparison of the two estimands
def compare(fits, labels, row_label, n_obs):
    fmt = lambda x: f"{x:>12.2f}"
    rows = [
        f"{row_label:<14}" + "".join(f"{l:>12}" for l in labels),
        f"  estimate   " + "".join(fmt(f.coef[0])    for f in fits),
        f"  (SE)       " + "".join(fmt(f.se[0])      for f in fits),
        f"  p-value    " + "".join(fmt(f.pval[0])    for f in fits),
        f"  Num. obs.  " + "".join(f"{n_obs:>12d}"   for _ in fits),
    ]
    return "\n".join(rows)

print(compare([plr_fit, irm_fit], ["PLR", "ATE"],
              "eligibility", dml_data_e.n_obs))

Output
--------------------------------------------
                      (1)             (2)   
                      PLR             ATE   
--------------------------------------------
e401              7061.52***      6099.18***
                 (857.95)        (780.94)   

_cons                1.11                   
                 (338.38)                   
--------------------------------------------
Num. obs.            9915            9915   
--------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001

# A tibble: 2 × 7
  model term  estimate std.error statistic  p.value ensemble_type
  <chr> <chr>    <dbl>     <dbl>     <dbl>    <dbl> <chr>        
1 PLR   D1       7826.      907.      8.63 6.10e-18 average      
2 ATE   ATE      6901.      771.      8.95 3.63e-19 average

eligibility            PLR         ATE
  estimate        6737.64     6344.02
  (SE)             909.32      719.96
  p-value            0.00        0.00
  Num. obs.          9915        9915

The effect of 401k participation

The data also includes information on whether individuals signed up to the 401k scheme (p401). This allows us to estimate two other parameters: the local average treatment effect and the partially linear IV regression coefficient. In both cases, we leverage e401 as an instrument and assume that e401 is exogenous conditional on the set of covariates.

Code
// PLIV: partially linear IV regression coefficient
ddml init iv, kfolds(10)
ddml E[Y|X]: pystacked $Y $X , type(reg) methods(rf) cmdopt1($rfoutcome)
ddml E[D|X]: pystacked $D $X , type(reg) methods(rf) cmdopt1($rfother)
ddml E[Z|X]: pystacked $Z $X , type(reg) methods(rf) cmdopt1($rfother)
ddml crossfit
eststo m_pliv: ddml estimate, robust

// LATE: local average treatment effect (interactive IV model)
ddml init interactiveiv, kfolds(10)
ddml E[Y|X,Z]: pystacked $Y $X , type(reg)   methods(rf) cmdopt1($rfoutcome)
ddml E[D|X,Z]: pystacked $D $X , type(class) methods(rf) cmdopt1($rfother)
ddml E[Z|X]:   pystacked $Z $X , type(class) methods(rf) cmdopt1($rfother)
ddml crossfit
eststo m_late: ddml estimate, robust trim(0.001)

esttab m_pliv m_late, mtitles("PLIV" "LATE") ///
    b(%9.2f) se(%9.2f) star(* 0.05 ** 0.01 *** 0.001) ///
    stats(N, fmt(%9.0f) labels("Num. obs."))

# DML estimation of the partially linear IV regression (PLIV) coefficient
pliv_fit <- ddml_pliv(y, D_part, Z, X,
                      learners      = learners,
                      sample_folds  = 10,
                      ensemble_type = "average")

# DML estimation of the local average treatment effect (LATE)
late_fit <- ddml_late(y, D_part, Z, X,
                      learners      = learners,
                      sample_folds  = 10,
                      ensemble_type = "average",
                      trim          = 0.001)

# Tidy comparison of the two estimands using ddml::tidy()
part_tbl <- bind_rows(
  PLIV = tidy(pliv_fit) |> filter(term == "D1"),
  LATE = tidy(late_fit),
  .id = "model"
) |> as_tibble()
print(part_tbl)

# Build the DoubleMLData object with e401 as instrument for p401
dml_data_p = DoubleMLData(dat, y_col="net_tfa", d_cols="p401",
                          z_cols="e401", x_cols=control_names)

# PLIV: partially linear IV regression coefficient
pliv_fit = DoubleMLPLIV(dml_data_p,
                        ml_l=clone(ml_outcome_reg),
                        ml_m=clone(ml_other_reg),
                        ml_r=clone(ml_other_reg),
                        n_folds=10)
pliv_fit.fit()

# LATE: local average treatment effect (interactive IV model)
iivm_fit = DoubleMLIIVM(dml_data_p,
                        ml_g=clone(ml_outcome_reg),
                        ml_m=clone(ml_other_cls),
                        ml_r=clone(ml_other_cls),
                        n_folds=10,
                        trimming_threshold=0.001)
iivm_fit.fit()

print(compare([pliv_fit, iivm_fit], ["PLIV", "LATE"],
              "participation", dml_data_p.n_obs))

Output
--------------------------------------------
                      (1)             (2)   
                     PLIV            LATE   
--------------------------------------------
p401              9896.56***      9313.53***
                (1250.05)       (1148.57)   

_cons             -123.79                   
                 (335.92)                   
--------------------------------------------
Num. obs.            9915            9915   
--------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001

# A tibble: 2 × 7
  model term  estimate std.error statistic  p.value ensemble_type
  <chr> <chr>    <dbl>     <dbl>     <dbl>    <dbl> <chr>        
1 PLIV  D1      10746.     1323.      8.12 4.59e-16 average      
2 LATE  LATE     9611.     1205.      7.98 1.51e-15 average

participation         PLIV        LATE
  estimate       10375.66     9343.24
  (SE)            1290.76     1045.90
  p-value            0.00        0.00
  Num. obs.          9915        9915

We have demonstrated that we can target a wide range of target parameters using DML estimators. However, we shouldn’t take the results too seriously at this stage as we should validate our results by considering alternative nuisance function estimators, change the number of cross-fitting folds and repeat the cross-fitting procedure. We will do this in the next application, see Understanding Cultural Persistence and Change.

  1. See Section 3 in the paper.