Quantifying Heterogeneity in the Causal Impact of Abortion Restrictions

Research Team

Elizabeth Stuart
Statistician, Hopkins

Avi Feller
Statistician, UC Berkeley

Alison Gemmill
Demographer, Hopkins

Suzanne Bell
Demographer, Hopkins

David Arbour
Statistician, Adobe

Eli Ben-Michael
Statistician, Carnegie Mellon

Texas Senate Bill 8
Effectively bans abortion
Sept. 1, 2021

Roe v. Wade overturned
June 24, 2022

Abortion Bans Across the US

US Infant Mortality Rates

Source: Washington Post, August 18, 2022

Early Evidence on Impacts

States with abortion bans experienced an average 2.3% increase in births in the first half of 2023 (Dench, Pineda-Torres, and Myers 2024)
By race/ethnicity: greater impact among non-Hispanic Black and Hispanic individuals (Dench, Pineda-Torres, and Myers 2024; Caraher 2024) and greater impact among 20-24-year-olds (Dench, Pineda-Torres, and Myers 2024)
~13% increase in infant deaths; 8% increase in the infant mortality rate (Gemmill et al. 2024)

Study Objectives

To estimate sociodemographic variation in the impact of abortion bans on subnational birth rates in the US through the end of 2023
- By age, race/ethnicity, marital status, educational attainment, insurance type

To estimate variation in the impact of abortion bans on subnational infant mortality in the US through the end of 2023
- By race/ethnicity, timing of death, cause of death

Fertility Trends

Infant Mortality Trends

Overall Analytic Approach

Today: focus methods discussion on infant mortality data
Models for the fertility data are very similar
Bayesian panel data approach

Poisson latent factor model
- Fertility: model bimonthly number of births with population offset
- Infant mortality: model biannual number of deaths with live birth offset
Model state-by-subgroup-specific impacts separately by characteristic
States without bans and pre-exposure outcomes in all states inform counterfactual

Infant Mortality Approach

Outcome: infant mortality rate (deaths per 1,000 live births)
Exposure: 6-week or complete abortion ban (14 states¹), staggered adoption
Pre-policy period: January 2012 through ~December 2022
Treated period: ~January 2023 through December 2023

Subgroups
- Race/ethnicity: non-Hispanic White, non-Hispanic Black, Hispanic, and Other
- Timing: neonatal (<28 days), non-neonatal
- Cause of death: congenital, non-congenital

Panel Data

Panel with $n$ states and $T$ time periods
Potential outcomes $Y_{i t} (0)$, $Y_{i t} (1)$ and a binary exposure indicator $W_{i t} \in \{0, 1\}$
We observe for each unit the pair $(Y_{i}, W_{i})$, where $Y_{i t} \equiv Y_{i t} (W_{i t}) = \begin{cases} Y_{i t} (0) & \text{if } W_{i t} = 0 \\ Y_{i t} (1) & \text{if } W_{i t} = 1 \end{cases}$
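
A minimal sketch of how the observed panel is assembled from the two potential outcomes. The toy values and the function name `observed_outcomes` are hypothetical and only illustrate the definition of $Y_{i t}$ under staggered exposure.

```python
def observed_outcomes(y0, y1, w):
    """Return Y_it = Y_it(W_it) for every state i and period t."""
    return [
        [y1_it if w_it == 1 else y0_it for y0_it, y1_it, w_it in zip(r0, r1, rw)]
        for r0, r1, rw in zip(y0, y1, w)
    ]

# Hypothetical toy panel: 2 states, 3 periods (all values are made up).
y0 = [[5, 6, 5], [8, 7, 9]]    # potential outcomes without a ban
y1 = [[6, 7, 7], [9, 9, 10]]   # potential outcomes with a ban
w = [[0, 0, 1], [0, 0, 0]]     # state 0 is exposed in the last period only

y_obs = observed_outcomes(y0, y1, w)
```

Only one potential outcome per cell is ever observed; the other must be inferred, which is what the panel methods in this talk aim to do.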

Causal Inference for Panel Data

Assumptions:

Well-defined exposure: {any complete or 6-week abortion ban} vs {no ban}
No anticipation: no effect of abortion restrictions prior to exposure
No spillovers across states: outcomes only depend on own state’s policy

Causal Inference for Panel Data

Some common strategies:

Interrupted Time Series (horizontal)
Synthetic Control Methods and Factor Models (vertical)
Difference-in-Differences (DID) and Two-Way Fixed Effects (TWFE)

Challenges with Infant Death Data

Infant death counts are small and discrete
Missing data: CDC Wonder excludes counts between 1 and 9
- Implications for level of temporal aggregation
States and subgroups vary in size and mortality rates
Staggered Adoptions
- Bans were imposed at different times

Temporal Aggregation

Missingness → CDC Wonder suppresses counts 1, …, 9 (but not 0!)
- e.g., annual → no missingness; daily → high missingness
- Later: imputation approach
Noise → noise for (avg) annual counts ≪ (avg) monthly counts (see Sun, Ben-Michael, and Feller 2024)
- Further complicated by seasonality
Fertility → 2-month intervals (e.g., Jan-Feb 2023)
Mortality → 6-month intervals (e.g., Jan-June 2023)
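
The trade-off between aggregation and missingness can be illustrated with a small sketch. The monthly counts below are hypothetical, and `suppress`, `aggregate`, and `share_suppressed` are illustrative helper names; only the suppression rule itself (counts 1 through 9 hidden, 0 shown) comes from the slide.

```python
def suppress(count):
    """Mimic the suppression rule: counts 1-9 are hidden, 0 and >= 10 are shown."""
    return None if 1 <= count <= 9 else count

def aggregate(counts, width):
    """Sum consecutive blocks of `width` periods."""
    return [sum(counts[i:i + width]) for i in range(0, len(counts), width)]

def share_suppressed(counts):
    """Fraction of cells that would be hidden by the suppression rule."""
    return sum(suppress(c) is None for c in counts) / len(counts)

# Hypothetical monthly infant death counts for one small state-subgroup cell.
monthly = [4, 7, 0, 12, 3, 5, 6, 2, 9, 11, 4, 8]

monthly_share = share_suppressed(monthly)    # most months are hidden
biannual = aggregate(monthly, 6)             # two 6-month totals
biannual_share = share_suppressed(biannual)  # no cell is hidden
```

In this toy series, coarser aggregation removes the suppression entirely, at the cost of fewer time points.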

Subgroup Inference

Summing infant deaths over subgroups yields total infant deaths
Inferred total infant mortality rates may differ depending on which subgroups are considered
Is it better to estimate the total effect by estimating the subgroup effects and summing, or by modeling the total effect directly?
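
As a sketch of the "estimate by subgroup, then sum" route: if each subgroup has a multiplicative effect on its own death rate, the implied total effect is a weighted average of the subgroup effects, weighted by expected deaths without a ban. All numbers and names below are hypothetical.

```python
def total_effect(effect, rate, births):
    """Multiplicative change in total deaths implied by subgroup-level effects."""
    deaths_no_ban = [r * b for r, b in zip(rate, births)]
    deaths_ban = [e * d for e, d in zip(effect, deaths_no_ban)]
    return sum(deaths_ban) / sum(deaths_no_ban)

# Hypothetical two-subgroup example.
effect = [1.05, 1.10]    # multiplicative effect on each subgroup's rate
rate = [5.0, 10.0]       # deaths per 1,000 live births without a ban
births = [100.0, 20.0]   # live births, in thousands

total = total_effect(effect, rate, births)   # about 1.064
simple_average = sum(effect) / len(effect)   # 1.075, not the same quantity
```

The implied total depends on the subgroup rates and sizes, which is one reason totals built from different subgroup breakdowns need not agree exactly.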

State Size and Sampling Variance

Subgroup Size and Variability

In these states, the White population is roughly 5-15 times the Black population

Implications

Pre-treatment balance should depend on state and subgroup size
Avoid overfitting to noise when groups are small
The difference between realized and counterfactual infant deaths, $Y_{i t} (1) - Y_{i t} (0)$ , will be more variable for smaller states and subgroups
Suggests a need to regularize causal effect estimates
Want to encourage estimated infant mortality rates to be similar for the same state or same subgroup, while still allowing for the possibility of differences
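
The link between cell size and noise can be made concrete under a Poisson assumption: if deaths are Poisson with mean rate × births, the estimated rate (deaths / births) has a coefficient of variation of 1 / sqrt(expected deaths). The sketch below uses hypothetical rates and birth counts.

```python
import math

def rate_cv(rate_per_1000, births_thousands):
    """Coefficient of variation of deaths / births when deaths ~ Poisson(rate * births)."""
    expected_deaths = rate_per_1000 * births_thousands
    return 1.0 / math.sqrt(expected_deaths)

# Hypothetical cells with the same underlying rate but a 10x difference in births.
large = rate_cv(rate_per_1000=6.0, births_thousands=150.0)  # 900 expected deaths
small = rate_cv(rate_per_1000=6.0, births_thousands=15.0)   # 90 expected deaths
ratio = small / large                                        # sqrt(10), about 3.16
```

A subgroup one tenth the size has roughly three times the relative noise, which motivates size-dependent balance and regularized effect estimates.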

A Probabilistic Bayesian Model

Explicitly incorporate a missing data model
Staggered adoption accounted for in the likelihood
Count data modeled via Poisson with offset based on state/group size
Hierarchical priors stabilize treatment effect estimates and partially pool effects by state and category
Uncertainty quantification for “free”
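
One common way to build suppression into a count likelihood is to treat a hidden cell as interval-censored, so it contributes the probability that the count lies between 1 and 9. This is an assumption made for illustration and may differ from the authors' implementation; the function names are hypothetical.

```python
import math

def poisson_logpmf(y, mu):
    """Log of the Poisson probability mass function (mu must be positive)."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def cell_loglik(y, mu):
    """Log-likelihood of one cell; y is None when the count is suppressed (1-9)."""
    if y is None:
        # Suppressed cell: sum the probabilities of all counts from 1 to 9.
        prob = sum(math.exp(poisson_logpmf(k, mu)) for k in range(1, 10))
        return math.log(prob)
    return poisson_logpmf(y, mu)

# Hypothetical cell with an expected count of 2 deaths.
loglik_zero = cell_loglik(0, 2.0)        # observed zero
loglik_hidden = cell_loglik(None, 2.0)   # suppressed count
```

In a Bayesian fit, the suppressed counts can alternatively be sampled as unknowns restricted to 1 through 9, which yields imputations as a by-product.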

Panel Model for Infant Deaths

$\begin{aligned} Y_{i j t} (1) & \sim \text{Poisson} (τ_{i j t} \cdot ρ_{i j t} \cdot B_{i j t}) \\ Y_{i j t} (0) & \sim \text{Poisson} (ρ_{i j t} \cdot B_{i j t}) \end{aligned}$ for unit $i$, subgroup $j$, time $t$

$B_{i j t}$ is births (in thousands)
- Scales mortality rate to account for variability in state size
$ρ_{i j t}$ is the infant mortality rate without bans
$τ_{i j t} ρ_{i j t}$ is the infant mortality rate with bans
$τ_{i j t}$ is the multiplicative change in infant mortality rate due to bans
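
A small worked example of how the three quantities combine into Poisson means. The numbers are hypothetical; the function only evaluates the two means in the model above.

```python
def expected_deaths(tau, rho, births_thousands):
    """Return the Poisson means (without ban, with ban) for one cell."""
    mu0 = rho * births_thousands   # mean of Y(0)
    mu1 = tau * mu0                # mean of Y(1)
    return mu0, mu1

# Hypothetical cell: rate of 6 per 1,000, 20 thousand births, 10% relative increase.
tau = 1.10
mu0, mu1 = expected_deaths(tau=tau, rho=6.0, births_thousands=20.0)
excess = mu1 - mu0                  # expected deaths attributable to the ban
percent_change = 100 * (tau - 1)    # relative change in the mortality rate
```

Here the expected count rises from 120 to 132 deaths, so $τ = 1.10$ corresponds to 12 excess deaths and a 10% increase in the rate.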

Poisson Latent Factor Model

We assume the infant mortality rate in the “no ban” condition can be expressed as

$ρ_{i j t} = α_{i j}^{state} \cdot α_{j t}^{time} \cdot (\sum_{k = 1}^{K} λ_{i j k} η_{j k t}),$

$α_{i j}^{state}$ and $α_{j t}^{time}$ are state- and time-specific intercepts
$η_{j k t} \in \mathbb{R}^{+}$ is the $k$th latent factor at time $t$, common to all states but unique to subcategory $j$
$λ_{i j \cdot} \sim \text{Dirichlet}$ are the factor loadings for state $i$ and category $j$
Model selection problem: choosing $K$ (rank)
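
A sketch of how the no-ban rate is computed for one state, subgroup, and time point with $K = 3$. All values are hypothetical. Because Dirichlet loadings are nonnegative and sum to one, the factor term is a convex combination of the latent factors.

```python
def no_ban_rate(alpha_state, alpha_time, loadings, factors):
    """rho = alpha_state * alpha_time * sum_k loadings[k] * factors[k]."""
    mixture = sum(lam * eta for lam, eta in zip(loadings, factors))
    return alpha_state * alpha_time * mixture

# Hypothetical values for one (state, subgroup, time) cell with K = 3.
loadings = [0.7, 0.2, 0.1]   # lambda: on the simplex, as a Dirichlet draw would be
factors = [1.0, 1.5, 0.5]    # eta: positive latent factors at this time point

rho = no_ban_rate(alpha_state=5.0, alpha_time=1.1, loadings=loadings, factors=factors)
mixture = sum(lam * eta for lam, eta in zip(loadings, factors))
```

The mixture always lies between the smallest and largest factor values, so each state's trend is a weighted blend of the shared latent trends.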

Hierarchical Prior on Causal Effects

Partially pool the exposure parameters $τ_{i j t}$ across states and across subcategories, with state and subcategory prior distributions centered at zero:

$\begin{aligned} \log (τ_{i j t}) & \sim N (β_{i j}^{state,sub}, σ_{τ}) \\ β_{i j}^{state,sub} & \sim N (β_{i}^{state} + β_{j}^{sub}, σ_{β}) \\ β_{i}^{state} & \sim N (0, σ_{state}) \\ β_{j}^{sub} & \sim N (0, σ_{sub}) \end{aligned}$
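
To see what this prior implies, one can simulate from it. The sketch below is a simplification under stated assumptions: it treats the second argument of $N(\cdot, \cdot)$ as a standard deviation, draws one effect per state-subgroup cell (dropping the time index), and uses hypothetical scale values.

```python
import math
import random

def draw_tau(n_states, n_subs, sd_state, sd_sub, sd_beta, sd_tau, rng):
    """Draw one tau[i][j] per state-subgroup cell from the hierarchical prior."""
    beta_state = [rng.gauss(0.0, sd_state) for _ in range(n_states)]
    beta_sub = [rng.gauss(0.0, sd_sub) for _ in range(n_subs)]
    tau = []
    for i in range(n_states):
        row = []
        for j in range(n_subs):
            # Cell-level mean is centered on the sum of state and subgroup effects.
            beta_ij = rng.gauss(beta_state[i] + beta_sub[j], sd_beta)
            log_tau = rng.gauss(beta_ij, sd_tau)
            row.append(math.exp(log_tau))
        tau.append(row)
    return tau

rng = random.Random(0)
tau = draw_tau(n_states=3, n_subs=4, sd_state=0.05, sd_sub=0.05,
               sd_beta=0.02, sd_tau=0.02, rng=rng)
```

Small scales pull every $τ$ toward 1 (no effect), while larger state or subgroup scales let whole rows or columns move together; that is the partial pooling.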

Shrinkage Across States

Shrinkage Across Subcategories

Variation Across Multiple Sources

MCMC Inference

Model implemented in the probabilistic programming library NumPyro
MCMC inference with Hamiltonian Monte Carlo
Run multiple chains; check $\hat{R}$ values and effective sample sizes
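
For intuition about the convergence check, here is a standard-library sketch of the basic (non-split) Gelman-Rubin statistic for a single scalar parameter. It is an illustration with simulated draws, not the diagnostic code used in the study.

```python
import random
import statistics

def r_hat(chains):
    """Potential scale reduction factor for equal-length chains of one scalar parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5

# Simulated draws: four chains that target the same distribution...
rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# ...and four chains stuck in different regions.
stuck = [[x + 3.0 * k for x in chain] for k, chain in enumerate(mixed)]

r_hat_mixed = r_hat(mixed)   # close to 1
r_hat_stuck = r_hat(stuck)   # well above 1
```

In practice, the summary printed by NumPyro's `MCMC.print_summary()` includes `n_eff` and `r_hat` columns, so a hand-rolled version like this is rarely needed.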

MCMC Inference

Fit models for each category
- Mortality: Total, race/ethnicity, timing of death, and cause of death
- Fertility: Total, age, race/ethnicity, education, insurance
For each, fit models for multiple latent ranks and check fit
Code available at:
- github.com/afranks86/dobbs_fertility
- github.com/afranks86/dobbs_infant_mortality

Model Selection and Checking

In-sample checks:
- Question: how well does the model fit the observed data?
- Tool: gap plots and posterior predictive comparisons
- Used to select latent factor rank
Out-of-sample checks:
- Question: how well can we forecast?
- Tool: placebo-in-time checks

Results - Texas

Posterior Predictive Checks

Posterior predictive checks are used to assess how well a Bayesian model fits observed data
Unlike classical hypothesis testing, posterior predictive checks focus on practical significance of model inadequacies
$P (T^{pred} > T^{obs} ∣ Y) = \int P (T^{pred} > T^{obs} ∣ Y, θ) P (θ ∣ Y) d θ$ should be far from 0 and 1.
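
The integral is typically approximated by Monte Carlo: for each posterior draw of $θ$, compute the test statistic on replicated data and on the observed data, then take the share of draws where the replicated statistic is larger. The values below are hypothetical.

```python
def ppc_pvalue(stat_pred, stat_obs):
    """Monte Carlo estimate of P(T_pred > T_obs | Y), one pair per posterior draw."""
    exceed = sum(tp > to for tp, to in zip(stat_pred, stat_obs))
    return exceed / len(stat_pred)

# Hypothetical test statistics from five posterior draws.
stat_pred = [2.1, 3.4, 1.8, 2.9, 3.1]   # computed on replicated data
stat_obs = [2.5, 2.5, 2.5, 2.5, 2.5]    # computed on observed data

p_value = ppc_pvalue(stat_pred, stat_obs)
```

The observed statistic is passed per draw because statistics built from residuals depend on $θ$ as well as on the data.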

Posterior Predictive Checks

Maximum absolute residual: identify outliers inconsistent with the model: $T_{i j} = \max_{t} | r_{i j t} |$
Residual autocorrelation: check for remaining autocorrelation after controlling for latent factors (and seasonal trends)
- Test statistic based on residual autocorrelation at different lags
- $T_{i j} = \text{cor} (r_{i j t}, r_{i j, t + l})$ for lag $l$
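
Both test statistics are simple functions of one residual series. A sketch with hypothetical residuals, where `pearson`, `max_abs_residual`, and `lag_autocorr` are illustrative names:

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def max_abs_residual(resid):
    """Largest absolute residual over time for one state-subgroup series."""
    return max(abs(r) for r in resid)

def lag_autocorr(resid, lag):
    """Correlation between r_t and r_{t+lag} within one residual series."""
    return pearson(resid[:-lag], resid[lag:])

# Hypothetical residual series that alternates in sign (strong seasonality left over).
resid = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

t_max = max_abs_residual(resid)
t_lag1 = lag_autocorr(resid, 1)   # strongly negative
t_lag2 = lag_autocorr(resid, 2)   # strongly positive
```

A well-fitting model should produce lag correlations near zero; a pattern like this one would suggest unmodeled periodic structure.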

PPC: Max Residual

Posterior Predictive Checks

Across-unit correlation: states should be uncorrelated after controlling for latent factors:

Test statistic based on eigenspectrum of residual correlation matrix

Let $C = (c_{i i'})$ where $c_{i i'} = \text{cor} (r_{i \cdot}, r_{i' \cdot})$
$T = σ_{\max} (C)$ where $σ_{\max} (C)$ is the largest singular value of $C$
$T$ should be small for uncorrelated state residuals
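
A standard-library sketch of this statistic. Since a correlation matrix is symmetric and positive semidefinite, its largest singular value equals its largest eigenvalue, which power iteration can approximate. The residuals and helper names are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residual_corr_matrix(resid):
    """C[i][k] = cor(residuals of state i, residuals of state k)."""
    n = len(resid)
    return [[pearson(resid[i], resid[k]) for k in range(n)] for i in range(n)]

def largest_singular_value(c, iters=500):
    """Power iteration; valid here because C is symmetric positive semidefinite."""
    n = len(c)
    v = [1.0 + 0.01 * i for i in range(n)]   # arbitrary nonzero starting vector
    for _ in range(iters):
        w = [sum(c[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    cv = [sum(c[i][k] * v[k] for k in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(v, cv))  # Rayleigh quotient

# Hypothetical residuals for two states that move together perfectly.
resid = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
t_stat = largest_singular_value(residual_corr_matrix(resid))
```

With $n$ states, $T$ ranges from 1 (exactly uncorrelated residuals) to $n$ (perfectly correlated). With NumPy available, `numpy.linalg.svd(numpy.corrcoef(resid), compute_uv=False)[0]` computes the same quantity.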

PPC: State Correlations

Placebo-in-Time

Fertility Impact by Subgroup

+1.7% overall increase

State-Specific Effects on Inf. Mortality

Across states with bans, the infant mortality rate increased by 5.6%

Kentucky: +7.5%
Texas: +8.9%

Effect on Infant Mortality by Cause

+10.9% increase in congenital deaths
+4.2% increase in non-congenital deaths

Note: majority of deaths attributable to the bans are non-congenital

Effect on Infant Mortality by Race/Ethnicity

NH White: +5.1%
NH Black: +11.0%
Hispanic: +3.3%
NH Other: +9.9%

Key Findings

Strong evidence that birth rates increased above expectation in states that banned abortion (+1.7%)
- Slightly smaller than prior studies
- Similar in magnitude to recent population-wide events
- Largest impacts among those experiencing greatest structural disadvantage (consistent across states)

Infant mortality increased in states with bans (+5.6%)
- Outsized influence of Texas
- Double the impact among non-Hispanic Black infants
- Larger relative increase among congenital deaths

Implications

Profound health, social and economic implications of being unable to obtain an abortion (Greene Foster 2020)
State-specific policies and social contexts may present additional barriers for disadvantaged women
Bans exacerbate existing health disparities
Future work: impact of abortion bans on maternal morbidity, high-risk pregnancy care, and birth outcomes (e.g., preterm birth, low birthweight)

Methodological Takeaways

Missing data and staggered adoption are easier to handle with Bayesian models
Hierarchical modeling of the treatment effect in panel data is an underexplored strategy for estimating heterogeneous treatment effects
Choice of temporal aggregation is important and tied to the amount of missingness
More work needed to understand how and when to disaggregate when inferring total effects

Publications

Papers published in JAMA. See Gemmill et al. (2025) and Bell et al. (2025). Supplementary materials contain modeling details.

Thank you!

Bell, Suzanne O, Alexander M Franks, David Arbour, Selena Anjur-Dietrich, Elizabeth A Stuart, Eli Ben-Michael, Avi Feller, and Alison Gemmill. 2025. “US Abortion Bans and Fertility.” JAMA.

Caraher, Raymond. 2024. “Do Abortion Bans Affect Reproductive and Infant Health? Evidence from Texas’s 2021 Ban and Its Impact on Health Disparities.” Political Economy Research Institute Working Paper No 606.

Dench, Daniel, Mayra Pineda-Torres, and Caitlin Myers. 2024. “The Effects of Post-Dobbs Abortion Bans on Fertility.” Journal of Public Economics 234: 105124.

Gemmill, Alison, Alexander M. Franks, Selena Anjur-Dietrich, Amy Ozinsky, David Arbour, Elizabeth A. Stuart, Eli Ben-Michael, Avi Feller, and Suzanne O. Bell. 2025. “US Abortion Bans and Infant Mortality.” JAMA 333 (15): 1315–23. https://doi.org/10.1001/jama.2024.28517.

Gemmill, Alison, Claire E Margerison, Elizabeth A Stuart, and Suzanne O Bell. 2024. “Infant Deaths After Texas’ 2021 Ban on Abortion in Early Pregnancy.” JAMA Pediatrics 178 (8): 784–91.

Sun, Liyang, Eli Ben-Michael, and Avi Feller. 2024. “Temporal Aggregation for the Synthetic Control Method.” AEA Papers and Proceedings 114: 614–17.