UC Santa Barbara
2025-04-23
Elizabeth Stuart
Statistician, Hopkins
Avi Feller
Statistician, UC Berkeley
Alison Gemmill
Demographer, Hopkins
Suzanne Bell
Demographer, Hopkins
David Arbour
Statistician, Adobe
Eli Ben-Michael
Statistician, Carnegie Mellon
Texas Senate Bill 8
Effectively bans abortion
Sept. 1, 2021
Roe v. Wade overturned
June 24, 2022
States with abortion bans experienced an average 2.3% increase in births in the first half of 2023 (Dench, Pineda-Torres, and Myers 2024)
Heterogeneity: greater impact among non-Hispanic Black and Hispanic individuals (Dench, Pineda-Torres, and Myers 2024; Caraher 2024) and among 20-24-year-olds (Dench, Pineda-Torres, and Myers 2024)
~13% increase in infant deaths; 8% increase in the infant mortality rate (Gemmill et al. 2024)
To estimate sociodemographic variation in the impact of abortion bans on subnational birth rates in the US through the end of 2023
Assumptions:
Well-defined exposure: {any complete or 6-week abortion ban} vs {no ban}
No anticipation: no effect of abortion restrictions prior to exposure
No spillovers across states: outcomes only depend on own state’s policy
Some common strategies:
Summing infant deaths over subgroups yields total infant deaths
Inferred total infant mortality rates differ depending on which subgroups are considered
Is it better to estimate the total effect by estimating the subgroup effects and summing, or by modeling the total effect directly?
In these states, the white population is roughly 5-15x the Black population
Pre-treatment balance should depend on state and subgroup size
Avoid overfitting to noise when groups are small
The difference between realized and counterfactual infant deaths, \(Y_{it}(1) - Y_{it}(0)\), will be more variable for smaller states and subgroups
Suggests a need to regularize causal effect estimates
Want to encourage estimated infant mortality rates to be similar for the same state or same subgroup, while still allowing for the possibility of differences
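A small worked example of why small cells are noisier, assuming deaths in a cell are Poisson with mean equal to rate times births (as in the model on the following slides). The rate and birth counts below are hypothetical and chosen only for illustration.

```python
import math

def relative_sd_of_rate(rate_per_1000, births):
    """Relative SD of an observed rate when deaths ~ Poisson(rate * births)."""
    expected_deaths = rate_per_1000 / 1000.0 * births
    # For a Poisson count, sd / mean = 1 / sqrt(mean); dividing the count
    # by a fixed number of births leaves this ratio unchanged.
    return 1.0 / math.sqrt(expected_deaths)

# Hypothetical cells sharing one underlying rate of 5 deaths per 1,000 births.
large_cell = relative_sd_of_rate(5.0, births=80_000)  # 400 expected deaths
small_cell = relative_sd_of_rate(5.0, births=800)     # 4 expected deaths

print(f"large cell: {large_cell:.2f}, small cell: {small_cell:.2f}")
```

With 400 expected deaths the relative SD is about 5%; with 4 expected deaths it is about 50%, so an unregularized estimate for the small cell can easily be dominated by noise.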
\[ \begin{align} Y_{ijt}(1) &\sim \text{Poisson}(\tau_{ijt} \cdot \rho_{ijt} \cdot B_{ijt})\\ Y_{ijt}(0) &\sim \text{Poisson}(\rho_{ijt} \cdot B_{ijt}) \end{align} \] for unit \(i\), subgroup \(j\), time \(t\)
We assume the infant mortality rate in the “no ban” condition can be expressed as
\[\rho_{ijt} = \alpha_{ij}^{\text{state}} \cdot \alpha_{jt}^{\text{time}} \cdot \left(\sum_{k=1}^K \lambda_{ijk}\eta_{jkt}\right),\]
Partially pool the exposure parameters \(\tau_{ijt}\) across states and across subcategories, with state and subcategory prior distributions centered at zero:
\[ \begin{align} \log(\tau_{ijt}) &\sim N\left(\beta_{ij}^{\text{state,sub}}, \sigma_\tau\right)\\ \beta_{ij}^{\text{state,sub}} &\sim N\left(\beta_i^{\text{state}} + \beta_j^{\text{sub}}, \sigma_\beta\right)\\ \beta_i^{\text{state}} &\sim N\left(0, \sigma_{\text{state}}\right)\\ \beta_j^{\text{sub}} &\sim N\left(0, \sigma_{\text{sub}}\right) \end{align} \]
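The generative structure above can be sketched in plain Python. This is an illustration only, not the NumPyro implementation: it assumes the second argument of \(N(\cdot,\cdot)\) is a standard deviation, draws a single period, and uses hypothetical values for every parameter.

```python
import math
import random

def baseline_rate(alpha_state, alpha_time, loadings, factors):
    """rho_ijt = alpha_ij^state * alpha_jt^time * sum_k lambda_ijk * eta_jkt."""
    return alpha_state * alpha_time * sum(l * e for l, e in zip(loadings, factors))

def draw_log_tau(n_states, n_subgroups, sigma_state, sigma_sub,
                 sigma_beta, sigma_tau, rng):
    """One prior draw of log(tau_ij) for a single period, state by subgroup."""
    beta_state = [rng.gauss(0.0, sigma_state) for _ in range(n_states)]
    beta_sub = [rng.gauss(0.0, sigma_sub) for _ in range(n_subgroups)]
    log_tau = []
    for i in range(n_states):
        row = []
        for j in range(n_subgroups):
            # The state-by-subgroup mean is pulled toward the additive
            # state + subgroup effect; sigma_beta controls how strongly.
            beta_ij = rng.gauss(beta_state[i] + beta_sub[j], sigma_beta)
            row.append(rng.gauss(beta_ij, sigma_tau))
        log_tau.append(row)
    return log_tau

rng = random.Random(0)
log_tau = draw_log_tau(3, 2, 0.1, 0.1, 0.05, 0.02, rng)

# Hypothetical rank-2 factor model for one state, subgroup, and period.
rho = baseline_rate(alpha_state=1.2, alpha_time=0.9,
                    loadings=[0.004, 0.001], factors=[1.0, 0.5])
births = 10_000
mean_no_ban = rho * births                          # Poisson mean of Y(0)
mean_ban = math.exp(log_tau[0][0]) * rho * births   # Poisson mean of Y(1)
```

Setting all four scale parameters to zero forces every \(\tau_{ijt}\) to 1 (no effect); larger `sigma_beta` lets state-by-subgroup effects depart further from the additive structure.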
Model implemented in NumPyro, a probabilistic programming library
MCMC inference with Hamiltonian Monte Carlo
Run multiple chains; check \(\hat{R}\) values and effective sample sizes
Fit models for each category
For each, fit models for multiple latent ranks and check fit
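To make the convergence check concrete, here is a minimal sketch of the basic (non-split) Gelman-Rubin \(\hat{R}\) for one scalar parameter. It is not the diagnostic code used for the talk; in practice the summary printed by NumPyro's `MCMC.print_summary()` reports `r_hat` and `n_eff` directly. The chains below are simulated stand-ins.

```python
import random
import statistics

def gelman_rubin_rhat(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the variance of the chain means.
    between = n * statistics.variance(chain_means)
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5

rng = random.Random(1)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
# Same draws, but the last chain is stuck around a different mode.
stuck = mixed[:3] + [[x + 3.0 for x in mixed[3]]]

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
```

Values near 1 indicate that the chains agree; the shifted chain pushes \(\hat{R}\) well above common thresholds such as 1.01 or 1.1.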
Code available at:
Posterior predictive checks are used to assess how well a Bayesian model fits observed data
Unlike classical hypothesis testing, posterior predictive checks focus on practical significance of model inadequacies
\(\mathbb{P}(T^{\text{pred}} > T^{\text{obs}} \mid Y) = \int \mathbb{P}(T^{\text{pred}} > T^{\text{obs}} \mid Y, \theta) \mathbb{P}(\theta \mid Y) d\theta\) should be far from 0 and 1.
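The integral above is usually approximated by simulation: for each posterior draw, simulate one replicated data set and record whether its test statistic exceeds the observed one. A minimal sketch, assuming a Poisson likelihood and using hypothetical posterior draws and data (the sampler is Knuth's method, adequate only for small means):

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's multiplication method for a Poisson draw (small means only)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def posterior_predictive_pvalue(posterior_means, statistic, y_obs, rng):
    """Monte Carlo estimate of P(T_pred > T_obs | Y), one replicate per draw."""
    t_obs = statistic(y_obs)
    exceed = 0
    for means in posterior_means:
        y_rep = [sample_poisson(m, rng) for m in means]
        exceed += statistic(y_rep) > t_obs
    return exceed / len(posterior_means)

rng = random.Random(0)
# Hypothetical posterior: 500 draws of a common mean near 3 for 12 periods.
posterior_means = [[3.0 * math.exp(rng.gauss(0.0, 0.1))] * 12 for _ in range(500)]

y_typical = [3, 2, 4, 1, 3, 5, 2, 3, 6, 4, 2, 3]
y_outlier = [3, 2, 4, 1, 3, 5, 2, 3, 40, 4, 2, 3]

p_typical = posterior_predictive_pvalue(posterior_means, max, y_typical, rng)
p_outlier = posterior_predictive_pvalue(posterior_means, max, y_outlier, rng)
```

With the maximum as the test statistic, the typical series gives a p-value well inside (0, 1), while the series containing a count of 40 gives a p-value at or near 0, flagging a value the model cannot reproduce.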
Maximum absolute residual: identify outliers inconsistent with the model: \(T_{ij} = \max_{t} | r_{ijt}|\)
Residual autocorrelation: check for remaining autocorrelation after controlling latent factors (and seasonal trends)
Across-unit correlation: states should be uncorrelated after controlling for latent factors:
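The three test statistics listed above can be sketched as follows. The slides do not define the residual \(r_{ijt}\); this sketch assumes Pearson residuals, \((y - \mu)/\sqrt{\mu}\), and all function names are illustrative.

```python
import math

def pearson_residuals(y, mu):
    """Assumed residual definition: (observed - expected) / sqrt(expected)."""
    return [(yi - mi) / math.sqrt(mi) for yi, mi in zip(y, mu)]

def max_abs_residual(r):
    """T_ij = max_t |r_ijt| for one state-subgroup series."""
    return max(abs(x) for x in r)

def lag1_autocorrelation(r):
    """Lag-1 sample autocorrelation of one residual series."""
    mean = sum(r) / len(r)
    dev = [x - mean for x in r]
    return sum(a * b for a, b in zip(dev[:-1], dev[1:])) / sum(d * d for d in dev)

def correlation(r1, r2):
    """Pearson correlation between the residual series of two units."""
    m1, m2 = sum(r1) / len(r1), sum(r2) / len(r2)
    d1 = [x - m1 for x in r1]
    d2 = [x - m2 for x in r2]
    num = sum(a * b for a, b in zip(d1, d2))
    return num / math.sqrt(sum(a * a for a in d1) * sum(b * b for b in d2))

# Hypothetical observed counts and fitted means for one series.
r = pearson_residuals([4, 9, 16, 9], [4.0, 4.0, 16.0, 9.0])
t_max = max_abs_residual(r)  # driven by the second period: (9 - 4) / 2
```

Each function would be applied to both the observed data and the replicated data sets, so that the observed statistic can be compared with its posterior predictive distribution.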
Across states with bans, the infant mortality rate increased by 5.6% overall
Slightly smaller than estimates from prior studies
Similar in magnitude to recent population-wide events
Largest impacts among those experiencing greatest structural disadvantage (consistent across states)
Missing data and staggered adoption are easier to handle with Bayesian models
Hierarchical modeling of the treatment effect in panel data is an underexplored strategy for estimating heterogeneous treatment effects
Choice of temporal aggregation is important and tied to the amount of missingness
More work needed to understand how and when to disaggregate when inferring total effects
Papers published in JAMA. See Gemmill et al. (2025) and Bell et al. (2025). Supplementary materials contain modeling details.
Note: missingness depends on level of temporal aggregation
Range: 0.6% - 2.1%
Overall: +1.7%
Non-Texas: +0.9%
Let \(M_{ijt}\) denote the indicator for suppressed counts, with \(M_{ijt}=1\) if \(0 < Y_{ijt}^{\text{obs}} < 10\) and \(M_{ijt}=0\) otherwise. The observed-data likelihood can then be written as
\[\begin{align} \mathbb{P}(\mathbf{Y}^{obs}, \mathbf{M} \mid \mathbf{B}^{obs}, \mathbf{D}, \rho, \tau) = \prod_{ijt}\Big[ &\left((1-P_{\text{miss}}(\rho_{ijt}B_{ijt}^{obs}))\,\text{Pois}(Y_{ijt}; \rho_{ijt}B_{ijt}^{obs})\right)^{(1-M_{ijt})(1-D_{ijt})} \\ &\times \left((1-P_{\text{miss}}(\tau_{ijt}\rho_{ijt}B_{ijt}^{obs}))\,\text{Pois}(Y_{ijt}; \tau_{ijt}\rho_{ijt}B_{ijt}^{obs})\right)^{(1-M_{ijt})D_{ijt}} \\ &\times P_{\text{miss}}(\rho_{ijt}B_{ijt}^{obs})^{M_{ijt}(1-D_{ijt})}\, P_{\text{miss}}(\tau_{ijt}\rho_{ijt}B_{ijt}^{obs})^{M_{ijt}D_{ijt}} \Big], \end{align}\]
where \(\text{Pois}(Y_{ijt}; \rho_{ijt}B_{ijt}^{obs})\) is the Poisson PMF with mean \(\rho_{ijt}B_{ijt}^{obs}\) evaluated at \(Y_{ijt}\), and
\[P_{\text{miss}}(\mu) = F(9; \mu) - F(0; \mu),\]
where \(F(a; \mu)\) is the CDF of a Poisson with mean \(\mu\) evaluated at \(a\), so that \(P_{\text{miss}}(\mu)\) is the probability that the count falls between 1 and 9, inclusive, and is therefore suppressed.
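As a sketch of how one cell contributes to this likelihood, the snippet below evaluates \(P_{\text{miss}}\) and the per-cell log-likelihood term exactly as written above, using only the standard library. Function names and the example values are hypothetical, and the direct (non-log-space) sums would underflow for very large means.

```python
import math

def poisson_pmf(k, mu):
    """Poisson PMF, computed on the log scale for stability."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def poisson_cdf(a, mu):
    """F(a; mu) = P(Y <= a) for Y ~ Poisson(mu)."""
    return sum(poisson_pmf(k, mu) for k in range(a + 1))

def p_miss(mu):
    """Probability that a count is suppressed, i.e. falls in 1..9."""
    return poisson_cdf(9, mu) - poisson_cdf(0, mu)

def cell_log_likelihood(y, suppressed, banned, rho, tau, births):
    """Log of one factor of the product over (i, j, t) in the likelihood above."""
    mean = (tau if banned else 1.0) * rho * births
    if suppressed:
        # M = 1: only the fact of suppression is observed.
        return math.log(p_miss(mean))
    # M = 0: the count itself is observed.
    return math.log(1.0 - p_miss(mean)) + math.log(poisson_pmf(y, mean))

# Hypothetical cells: a small suppressed cell under a ban, a larger observed one.
ll_suppressed = cell_log_likelihood(None, True, True, rho=0.005, tau=1.1, births=1_000)
ll_observed = cell_log_likelihood(52, False, False, rho=0.005, tau=1.1, births=10_000)
```

Suppressed cells still carry information: a mean far above 9 makes suppression very unlikely, so the likelihood pulls the fitted rate toward values consistent with a count between 1 and 9.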
RAND - Stat Group Seminar