
Causal Inference

Potential outcomes, DAGs, do-calculus, identification, and the backdoor criterion.

Evidence briefs

Reviewed claims

Claim-level summaries connect a practical takeaway to the papers that actually support it.

High confidence · Published

Randomized assignment · positive · Unbiasedness of average treatment effect estimates

Randomized assignment yields unbiased estimates of the average causal effect because treatment is independent of potential outcomes, eliminating confounding.

Population: Causal inference studies with any treatment and outcome · Comparator: Nonrandomized assignment

Primary evidence

Estimating causal effects of treatments in randomized and nonrandomized studies.

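A minimal simulation sketch of the claim above, not taken from the cited paper. It assumes a constant treatment effect of 2.0 and a hypothetical unobserved trait `u` that raises the outcome; the function name `simulate_assignment` and all parameter values are illustrative. With coin-flip assignment the difference in means should land near the true effect, while assignment that depends on `u` should not.

```python
import random
import statistics


def simulate_assignment(n=20000, true_ate=2.0, seed=0):
    """Compare difference-in-means under randomized vs. self-selected treatment."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        u = rng.gauss(0, 1)        # hypothetical trait that raises the outcome
        y0 = u + rng.gauss(0, 1)   # potential outcome without treatment
        y1 = y0 + true_ate         # potential outcome with treatment
        units.append((u, y0, y1))

    def diff_in_means(assignment):
        # Only one potential outcome per unit enters the estimate.
        treated = [y1 for (_, _, y1), t in zip(units, assignment) if t]
        control = [y0 for (_, y0, _), t in zip(units, assignment) if not t]
        return statistics.fmean(treated) - statistics.fmean(control)

    # Coin flip: assignment is independent of (y0, y1).
    randomized = [rng.random() < 0.5 for _ in units]
    # Self-selection: units with high u take the treatment.
    self_selected = [u > 0 for (u, _, _) in units]
    return diff_in_means(randomized), diff_in_means(self_selected)


randomized_est, confounded_est = simulate_assignment()
print(f"randomized:    {randomized_est:.2f}")
print(f"self-selected: {confounded_est:.2f}")
```

A single run only shows that one randomized estimate is close to the truth; unbiasedness is a statement about the average over repeated assignments, which could be checked by looping over seeds.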

High confidence · Published

Potential outcomes framework · positive · Ability to define and identify causal effects

The potential outcomes framework provides a formal mathematical definition of causal effects as Y_i(1) - Y_i(0) and clarifies the fundamental problem of causal inference: only one potential outcome is observed per unit.

Population: All causal inference problems · Comparator: No formal causal framework

Primary evidence

Estimating causal effects of treatments in randomized and nonrandomized studies.

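To make the definition concrete, here is a small illustrative sketch. The `Unit` class and the four made-up units are assumptions for illustration only; in real data the `effect` property could never be computed, because only `observed` is available.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    y0: float       # potential outcome Y_i(0)
    y1: float       # potential outcome Y_i(1)
    treated: bool   # which potential outcome is revealed

    @property
    def effect(self) -> float:
        # Unit-level causal effect Y_i(1) - Y_i(0); unobservable in practice.
        return self.y1 - self.y0

    @property
    def observed(self) -> float:
        # The fundamental problem: only one potential outcome is seen.
        return self.y1 if self.treated else self.y0


# Made-up example data.
units = [
    Unit(y0=3.0, y1=5.0, treated=True),
    Unit(y0=4.0, y1=4.5, treated=False),
    Unit(y0=2.0, y1=6.0, treated=True),
    Unit(y0=5.0, y1=5.5, treated=False),
]

true_ate = sum(u.effect for u in units) / len(units)

treated_obs = [u.observed for u in units if u.treated]
control_obs = [u.observed for u in units if not u.treated]
naive_diff = sum(treated_obs) / len(treated_obs) - sum(control_obs) / len(control_obs)

print(true_ate, naive_diff)  # prints 1.75 1.0
```

In this toy table the observed difference in means (1.0) differs from the true average effect (1.75) because the treated units happen to be the ones with the largest individual effects and the lowest untreated outcomes.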

High confidence · Published

Instrumental variables estimation · mixed · Interpretability of IV estimate

The IV estimator identifies the average treatment effect only for compliers (the local average treatment effect, LATE), not for the entire population, under the assumptions of instrument independence, exclusion restriction, a nonzero first stage, and monotonicity.

Population: Studies with treatment effect heterogeneity and a valid instrument · Comparator: Traditional interpretation as average treatment effect for entire population

Primary evidence

Identification of Causal Effects Using Instrumental Variables


High confidence · Published

Instrumental variables estimation with monotonicity assumption · positive · Identification of causal effect

Assuming no defiers (monotonicity) ensures the IV estimand equals the average treatment effect for compliers, providing a clear causal interpretation.

Population: Studies with a binary instrument and binary treatment · Comparator: IV estimation without monotonicity

Primary evidence

Identification of Causal Effects Using Instrumental Variables

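The two IV claims above can be illustrated with a simulation sketch using the Wald ratio (effect of the instrument on the outcome divided by its effect on treatment uptake) for a binary instrument and binary treatment. Everything here is an assumption for illustration: the stratum shares, the stratum-specific effects in `EFFECT`, and the function name `simulate_wald`. The population contains compliers, always-takers, and never-takers but no defiers, so monotonicity holds by construction.

```python
import random
import statistics

# Hypothetical stratum-specific treatment effects and population shares.
EFFECT = {"complier": 1.0, "always_taker": 3.0, "never_taker": 5.0}
SHARE = {"complier": 0.5, "always_taker": 0.25, "never_taker": 0.25}


def simulate_wald(n=40000, seed=1):
    rng = random.Random(seed)
    strata = list(SHARE)
    weights = [SHARE[s] for s in strata]
    rows = []
    for _ in range(n):
        s = rng.choices(strata, weights=weights)[0]
        z = rng.random() < 0.5  # randomized binary instrument
        # Monotonicity: no unit takes treatment only when z is off.
        d = z if s == "complier" else (s == "always_taker")
        # Exclusion restriction: z affects y only through d.
        y = rng.gauss(0, 1) + (EFFECT[s] if d else 0.0)
        rows.append((z, d, y))

    def arm_mean(index, z_value):
        return statistics.fmean([float(r[index]) for r in rows if r[0] == z_value])

    itt_y = arm_mean(2, True) - arm_mean(2, False)  # effect of z on y
    itt_d = arm_mean(1, True) - arm_mean(1, False)  # first stage: effect of z on d
    return itt_y / itt_d


wald = simulate_wald()
population_ate = sum(SHARE[s] * EFFECT[s] for s in SHARE)
print(f"Wald estimate:  {wald:.2f}")
print(f"population ATE: {population_ate:.2f}")
```

Under these assumptions the Wald ratio should sit near the complier effect (1.0) rather than the population average (2.5), since always-takers and never-takers do not change treatment status with the instrument and drop out of both numerator and denominator.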

High confidence · Published

Propensity score conditioning · positive · Bias removal for causal effect estimation

Conditioning on the propensity score is sufficient to remove all bias due to observed confounders under strong ignorability, reducing a high-dimensional adjustment problem to a one-dimensional one.

Population: Observational studies with binary treatment and observed confounders · Comparator: Conditioning on all individual covariates

Primary evidence

The central role of the propensity score in observational studies for causal effects


High confidence · Published

Propensity score as a balancing score · positive · Covariate balance between treated and untreated groups

At any given propensity score value, treated and untreated individuals have the same distribution of observed covariates, making the propensity score a balancing score.

Population: Observational studies with binary treatment · Comparator: No adjustment

Primary evidence

The central role of the propensity score in observational studies for causal effects

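Both propensity score claims can be checked in a small simulation sketch. The setup is an assumption for illustration: two binary covariates, a known propensity score that depends only on `x1 + x2` (the `PROPENSITY` table), and a constant treatment effect `tau`. In applied work the score would have to be estimated, which this sketch does not cover.

```python
import random
import statistics
from collections import defaultdict

# Hypothetical known propensity score, a function of x1 + x2 only.
PROPENSITY = {0: 0.2, 1: 0.5, 2: 0.8}


def simulate(n=30000, tau=1.0, seed=2):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = int(rng.random() < 0.5)
        x2 = int(rng.random() < 0.5)
        e = PROPENSITY[x1 + x2]
        t = int(rng.random() < e)  # treatment depends on covariates via e
        y = tau * t + 1.0 * x1 + 3.0 * x2 + rng.gauss(0, 1)
        rows.append({"x1": x1, "e": e, "t": t, "y": y})
    return rows


def gap(rows, key):
    """Treated-minus-control difference in the mean of rows[key]."""
    treated = [r[key] for r in rows if r["t"] == 1]
    control = [r[key] for r in rows if r["t"] == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


rows = simulate()
strata = defaultdict(list)
for r in rows:
    strata[r["e"]].append(r)  # group units by their scalar propensity score

naive = gap(rows, "y")
# Difference in means within each score stratum, weighted by stratum share.
stratified = sum(len(s) / len(rows) * gap(s, "y") for s in strata.values())

x1_gap_overall = gap(rows, "x1")         # covariate imbalance before adjustment
x1_gap_at_half = gap(strata[0.5], "x1")  # imbalance among units with e = 0.5

print(f"naive: {naive:.2f}, stratified: {stratified:.2f}")
print(f"x1 gap overall: {x1_gap_overall:.2f}, at e=0.5: {x1_gap_at_half:.2f}")
```

The stratum with score 0.5 mixes units with covariates (1, 0) and (0, 1). Conditioning on the single score rather than on both covariates still balances `x1` there, which is why the stratified estimate should recover `tau` while the naive comparison does not.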

Evidence base


50 papers

Study · Preprint · Wiki · Canonical · Moderate

Estimation and Inference of Heterogeneous Treatment Effects using Random Forests

Stefan Wager, Susan Athey · 2015

Many scientific and engineering challenges -- ranging from personalized medicine to customized marketing recommendations -- require an understanding of treatment effect heterogeneity. In this paper, we develop a non-parametric causal forest for estimating heterogeneous treatment effects that extends Breiman's widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect, and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates.

Study · Preprint · Wiki · Canonical · Moderate

Quasi-Oracle Estimation of Heterogeneous Treatment Effects

Xinkun Nie, Stefan Wager · 2017 · 840 citations

Flexible estimation of heterogeneous treatment effects lies at the heart of many statistical challenges, such as personalized medicine and optimal resource allocation. In this paper, we develop a general class of two-step algorithms for heterogeneous treatment effect estimation in observational studies. We first estimate marginal effects and treatment propensities in order to form an objective function that isolates the causal component of the signal. Then, we optimize this data-adaptive objective function. Our approach has several advantages over existing methods. From a practical perspective, our method is flexible and easy to use: In both steps, we can use any loss-minimization method, e.g., penalized regression, deep neural networks, or boosting; moreover, these methods can be fine-tuned by cross validation. Meanwhile, in the case of penalized kernel regression, we show that our method has a quasi-oracle property: Even if the pilot estimates for marginal effects and treatment propensities are not particularly accurate, we achieve the same error bounds as an oracle who has a priori knowledge of these two nuisance components. We implement variants of our approach based on penalized regression, kernel ridge regression, and boosting in a variety of simulation setups, and find promising performance relative to existing baselines.

Study · Preprint · Wiki · Moderate

Neyman Jackknife: Design-Based Variance Estimation for Causal Inference under Interference

Bryan Park, Stefan Wager · 2026 · 0 citations

We propose a framework, the Neyman Jackknife, for conservative variance estimation in finite-population causal inference under interference. Our approach provides a general, flexible blueprint that enables conservative variance estimation whenever we are able to recompute our target estimator with some treatment assignments omitted. In classical settings, our approach recovers estimators closely related to the Neyman estimator under SUTVA and the Newey-West HAC variance estimator for time series. Numerical experiments suggest that our general-purpose framework yields variance estimators that can match or even surpass the performance of baselines that were purpose-built for specific applications.

Study · Preprint · Wiki · Moderate

Causal Inference with Categorical Unobserved Confounder via Mixture Learning

Aytijhya Saha, Stephen Bates, Devavrat Shah · 2026

Unobserved confounding is a fundamental challenge for estimating causal effects. To address unobserved confounding, recent literature has turned to two different approaches -- proxy variables and the use of multiple treatments. The first approach, commonly referred to as proximal causal inference, requires proxies to be assigned to specific asymmetric roles: treatment-inducing proxies (negative control exposures), variables that act as common causes of the treatment and outcome, and outcome-inducing proxies (negative control outcomes). In practice, however, identifying variables that satisfy these asymmetric roles can be difficult depending on the application domain. The second approach, commonly referred to as the "Deconfounder," deals with multiple conditionally independent treatments. There has been limited progress towards developing a consistent estimation method for this setting. As the primary contribution of this work, we establish that causal effects are identifiable in both settings when the unobserved confounder is categorical under suitable conditions. Our approach builds on a mixture learning perspective: we show that the underlying confounding structure can be recovered by identifying the corresponding mixture distribution. We propose an estimation procedure based on tensor decomposition, which allows consistent recovery of the latent structure and comes with non-asymptotic guarantees. Simulation studies and real data experiments demonstrate that the proposed method performs well even with limited data.

Study · Preprint · Wiki · Moderate

Single World Intervention Graphs as Distributions: A Framework for Causal Identification

Christian Bartels · 2026

Causal inference seeks to estimate the effect of an intervention on an outcome using observed data, typically via Rubin's potential-outcome framework or Pearl's do-calculus. Following section 9 of Richardson and Robins (2013), this essay treats single-world intervention graphs (SWIGs) as representations of both the observed-data distribution and the interventional distribution, rather than as a bridge to potential outcomes. We demonstrate that this perspective provides a systematic way to derive identifying expressions for estimands defined by interventions on selected variables. Back-door derivations mirror those in existing literature, while front-door derivations offer a distinct pathway that extends more readily to complex settings. Conceptually, the method is simultaneously related to and distinct from Rubin's framework and Pearl's calculus.

Study · Preprint · Wiki · Moderate

Identifying Interventional Joint Distributions via Extended Bridge Functions

Constantin Schott · 2026

Existing identification results in proximal causal inference often focus on marginal interventional distributions using standard outcome or treatment bridge functions. These methods do not generally identify joint interventional distributions that contain all proxy variables that were used to define the corresponding bridge functions. In many applications, however, these joint interventional distributions are a natural target of interest. We introduce extended bridge functions and derive new identification results for joint interventional distributions that may retain all relevant proxy variables. We then apply these results to proximal identification algorithms, where interventional kernels naturally arise as intermediate objects, yielding a generalized framework based on kernel operations.

Study · Preprint · Wiki · Moderate

Local Covariate Selection for Average Causal Effect Estimation without Pretreatment and Causal Sufficiency Assumptions

Zeyu Liu, Zheng Li, Feng Xie +3 more · 2026

We study the problem of selecting covariates for unbiased estimation of the total causal effect. Existing approaches typically rely on global causal structure learning over all variables, or on strong assumptions such as causal sufficiency - where observed variables share no latent confounders - or the pretreatment assumption, which limits covariates to those unaffected by the treatment or outcome. These requirements are often unrealistic in practice, and global learning becomes computationally prohibitive in high-dimensional settings. To address these challenges, we propose a novel local learning method for covariate selection in nonparametric causal effect estimation that avoids both the pretreatment and causal sufficiency assumptions. We first characterize a local boundary that contains at least one valid adjustment set whenever one exists for identifying the causal effect, and then develop local identification procedures to efficiently search within this boundary. We prove that the proposed method is sound and complete. Experiments on multiple synthetic datasets and two real-world datasets show that our approach achieves accurate causal effect estimation while substantially improving computational efficiency.

Study · Preprint · Wiki · Moderate

Causal Bias Detection in Generative Artificial Intelligence

Drago Plecko · 2026

Automated systems built on artificial intelligence (AI) are increasingly deployed across high-stakes domains, raising critical concerns about fairness and the perpetuation of demographic disparities that exist in the world. In this context, causal inference provides a principled framework for reasoning about fairness, as it links observed disparities to underlying mechanisms and aligns naturally with human intuition and legal notions of discrimination. Prior work on causal fairness primarily focuses on the standard machine learning setting, where a decision-maker constructs a single predictive mechanism $f_{\widehat Y}$ for an outcome variable $Y$, while inheriting the causal mechanisms of all other covariates from the real world. The generative AI setting, however, is markedly more complex: generative models can sample from arbitrary conditionals over any set of variables, implicitly constructing their own beliefs about all causal mechanisms rather than learning a single predictive function. This fundamental difference requires new developments in causal fairness methodology. We formalize the problem of causal fairness in generative AI and unify it with the standard ML setting under a common theoretical framework. We then derive new causal decomposition results that enable granular quantification of fairness impacts along both (a) different causal pathways and (b) the replacement of real-world mechanisms by the generative model's mechanisms. We establish identification conditions and introduce efficient estimators for causal quantities of interest, and demonstrate the value of our methodology by analyzing race and gender bias in large language models across different datasets.

Study · Preprint · Wiki · Moderate

Causal EpiNets: Precision-corrected Bounds on Individual Treatment Effects using Epistemic Neural Networks

Gandharv Patil, Keyi Tang, Raquel Aoki +1 more · 2026

Individual treatment effects are not point-identified from data. The Probability of Necessity and Sufficiency (PNS) circumvents this limitation by characterizing individual-level causality through intersection bounds derived from combined experimental and observational data. In finite samples, however, standard plug-in estimators systematically fail: they violate structural probability constraints and suffer from extremum bias induced by max-min operators, yielding spuriously narrow intervals. We propose a neural framework for finite-sample PNS estimation that resolves both pathologies. We introduce an anchored neural architecture that guarantees structural constraint satisfaction by construction. To correct extremum bias, we employ precision-corrected intersection-bound inference, leveraging Epistemic Neural Networks for scalable, high-dimensional uncertainty quantification. Empirical evaluations confirm that this approach maintains nominal coverage and exact constraint validity in high-dimensional regimes where standard estimators systematically undercover.

Study · Preprint · Wiki · Moderate

CausalGuard: Conformal Inference under Graph Uncertainty

Vikash Singh, Weicong Chen, Debargha Ganguly +12 more · 2026

Estimating treatment effects from observational data requires choosing an adjustment set, but valid adjustment depends on an unknown causal graph. Graph misspecification can cause under-coverage, while graph-agnostic conformal wrappers may regain nominal coverage only through large padding. We introduce CausalGuard, a structure-weighted conformal framework that calibrates after aggregating graph-conditional doubly robust pseudo-outcomes. Candidate DAGs are proposed from an LLM-derived edge prior, pruned by conditional-independence tests, and reweighted by Bayesian Information Criterion. A composite nonconformity score then calibrates the posterior-weighted pseudo-outcome. CausalGuard provides distribution-free finite-sample marginal coverage for this aggregated pseudo-outcome; under causal identification, overlap, conditional-mean nuisance stability, and concentration on target-aligned valid adjustment strategies, its conditional mean converges to the true Conditional Average Treatment Effect. Across five benchmarks, CausalGuard attains mean coverage above the nominal 90% level for the directly evaluable target and reduces width when graph-agnostic conformal baselines require large padding. Stress tests show that CausalGuard suppresses invalid collider adjustment and remains stable under misspecified priors when the retained candidate set is data-supported.

Study · Preprint · Wiki · Moderate

Individualized Causal Effects under Network Interference with Combinatorial Treatments

Yunping Lu, Haoang Chi, Qirui Hu +1 more · 2026

Modern causal decision-making increasingly demands individualized treatment-effect estimation in networks where interventions are high-dimensional, combinatorial vectors. While network interference, effect heterogeneity, and multi-dimensional treatments have been studied separately, their intersection yields an exponentially large intervention space that makes standard identification tools and low-dimensional exposure mappings untenable. We bridge this gap with a unified framework that constructs a global potential-outcome emulator for unit-level inference. Our method combines (1) rooted network configurations to leverage local smoothness, (2) doubly robust orthogonalization to mitigate confounding from network position and covariates, and (3) sparse spectral learning to efficiently estimate response surfaces over the $2^p$-dimensional treatment space. We also decompose networked effects into own-treatment, structural, and interaction components, and provide finite-sample error bounds and asymptotic consistency guarantees. Overall, we show that individualized causal inference remains feasible in high-dimensional networked settings without collapsing the intervention space.

Study · Preprint · Wiki · Moderate

Linear models for causal inference under network interference

Eric Tong, Salvador V. Balkus · 2026 · 0 citations

In causal inference, interference occurs when the treatment of one unit may affect the outcomes of other units. The goal of this work is to serve as a guide to the use of linear outcome modeling for estimating causal effects in settings where interference may pose a challenge to identification and estimation, such as spatial and network data. We demonstrate that, under a linear model, causal effects of binary and continuous treatments can be identified in terms of regression coefficients under totally and partially known interference structures. Our work constructs unbiased and consistent point and variance estimators for these effects under one or more possible fixed or random interference networks. A chief advantage is that this approach can be implemented using standard linear regression software, and is easily augmented with random effects and heteroscedastic or autocorrelation consistent standard errors. Numerical experiments and an example data analysis demonstrate the efficacy of this approach in eliminating interference bias.

Study · Preprint · Wiki · Moderate

Sensitivity analysis for causal mediation: bridge score, sharp sensitivity bounds, and calibration

Yuki Ohnishi, Fan Li · 2026

Causal mediation analysis decomposes the total treatment effect into a portion operating through a hypothesized mediator and a residual direct portion. Identification of natural direct and indirect effects typically rests on the mediator stage of sequential ignorability, which cannot be empirically verified and requires explicit sensitivity analysis. We introduce the bridge score, a low-dimensional vector formed from the two treatment-specific mediator densities at a common mediator value, and show that it is a balancing score for the mediator stage of sequential ignorability. Conditional on the bridge score, we then derive a sharp pointwise envelope on the unidentified mediator-outcome confounding function in terms of two interpretable latent confounding parameters. To make the bound operational for sensitivity analysis, we further introduce two calibration approaches. The first is benchmark calibration against an observed covariate, including a rank-based version that is invariant to monotone re-expressions of the benchmark; the second is residual budget calibration based on residual outcome variation. Finally, we show how the pointwise bound can be operationalized for inference through a scalar functional reduction and a Bayesian g-computation algorithm that propagates all sources of uncertainty into posterior draws of the mediation effect estimates.

Study · Preprint · Wiki · Moderate

Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance

Gabriel Okasa · 2022 · 14 citations

Estimation of causal effects using machine learning methods has become an active research field in econometrics. In this paper, we study the finite sample performance of meta-learners for estimation of heterogeneous treatment effects under the usage of sample-splitting and cross-fitting to reduce the overfitting bias. In both synthetic and semi-synthetic simulations we find that the performance of the meta-learners in finite samples greatly depends on the estimation procedure. The results imply that sample-splitting and cross-fitting are beneficial in large samples for bias reduction and efficiency of the meta-learners, respectively, whereas full-sample estimation is preferable in small samples. Furthermore, we derive practical recommendations for application of specific meta-learners in empirical studies depending on particular data characteristics such as treatment shares and sample size.

Study · Preprint · Wiki · Moderate

Causal Discovery in Structural VAR Models Under Equal Noise Variance

SeyedSina Seyedi HasanAbadi, Fahimeh Arab, Erfan Nozari +1 more · 2026 · 5 citations

Causal discovery from multivariate time series is challenging when causal effects may occur both across time and within the same sampling interval. This issue is especially important in applications such as neuroscience, where the sampling rate may be coarse relative to the underlying dynamics and contemporaneous effects need not form an acyclic graph. We study causal discovery in linear Gaussian structural VAR models under an equal noise variance assumption, meaning that the structural noise terms have a common variance. Unlike the DAG-based cross-sectional equal noise variance setting, the time-series setting considered here does not generally yield point identification of a unique causal graph. Instead, multiple structural VAR parameterizations can induce the same stationary observed process law. We introduce a notion of observational equivalence tailored to this setting and show that the corresponding equivalence class is characterized by orthogonal transformations of the structural equations together with a global positive scale. This characterization leads to an equivalence-aware model discrepancy, the observational alignment discrepancy, which compares structural models modulo transformations that preserve the observed law. Building on this theory, we propose ENVAR, a sparsity-based procedure that searches over the induced observational equivalence class for a sparse normalized structural representative. We evaluate the proposed methodology on synthetic structural VAR data and on an fMRI dataset.

Study · Preprint · Wiki · Moderate

Stable Causal Discovery via Directed Acyclic Graph Aggregation

Yunan Wu, Yue Wang, Chunlin Li +1 more · 2026

Directed Acyclic Graphs (DAGs) are central to uncovering causal structure in complex systems, yet learning a single DAG from data is often challenging: model uncertainty, finite samples, and a combinatorially large search space frequently yield unstable estimates. We propose DAGgr, a model averaging framework that aggregates multiple candidate DAGs into a single stable representation. Candidate graphs are weighted by their out-of-sample predictive likelihood across repeated data splits, and a thresholding rule on the resulting edge-importance scores guarantees that the aggregated graph is itself acyclic. We establish a finite-sample risk bound, prove that the procedure preserves acyclicity, and show that edge selection is consistent under mild conditions on the weights. Simulations across random, hub, and chain structures, together with an analysis of the Sachs et al. (2005) protein-signaling network, show that DAGgr matches or exceeds the best individual candidate while consistently outperforming bootstrap-aggregation baselines across structural recovery metrics.

Study · Preprint · Wiki · Moderate

Causal Identification under Interference: The Role of Treatment Assignment Independence

Julius Owusu, Monika Avila Márquez · 2026 · 0 citations

Empirical researchers routinely invoke the no-interference or individualistic treatment response (ITR) assumption to identify causal effects in observational studies, despite concerns that interference across units may arise in many economic settings. This paper studies the causal content of standard ITR-based identification formulas when arbitrary interference is present. We show that, under restrictions on dependence between treatment assignments across units, conventional ITR-based identification formulas -- including those underlying selection-on-observables, instrumental variables, regression discontinuity designs, and difference-in-differences -- identify well-defined causal objects: types of average direct effects (ADEs). These results do not require knowledge of the interference structure or specification of exposure mappings. We also propose a sensitivity analysis framework that quantifies the robustness of statistical inference to violations of treatment-assignment independence under arbitrary interference.

Study · Preprint · Wiki · Moderate

Evaluating causal indirect effects when mediators are left-censored by assay limit of quantification

Cong Jiang, Michael D. Hughes, Nima S. Hejazi · 2026

Causal mediation analysis is essential for disentangling the mechanisms by which investigational therapeutic and preventive agents impact clinical outcomes. However, the measurement of biological mediators is often subject to left-censoring by technical measurement limitations, most commonly an assay's limit of quantification. This form of censoring can pose severe challenges for both identification and estimation of causal mediation estimands, particularly when the censoring mechanism is deterministic and the resulting missingness is missing not at random (MNAR) or nonignorable. Motivated by the question of assessing the role of viral RNA in the action mechanism of monoclonal antibody therapies for COVID-19 in the Accelerating COVID-19 Therapeutics and Vaccine (ACTIV)-2 platform trial, we develop a semi-parametric framework for estimation of the natural direct and indirect effects when the mediator of interest is partially subject to this form of left-censoring. Our proposed strategy combines fractional imputation with a semi-parametric EM algorithm to flexibly estimate key components of the factorized data likelihood. Applying the proposed strategy to circumvent the left-censoring, we discuss both traditional plug-in and asymptotically efficient estimators of the direct and indirect effect estimands, introducing a data-adaptive $m$-out-of-$n$ bootstrap for robust inference under the imputation procedure. We demonstrate in numerical experiments that our approach significantly reduces bias and allows for reliable inference. An application to data from the ACTIV-2 platform trial confirms that monoclonal antibody therapies reduce the risk of hospitalization and death due to COVID-19, while suggesting that changes in viral RNA mediate only a modest proportion of the overall treatment effect.

Study · Preprint · Wiki · Moderate

Disentangling spatial interference and spatial confounding biases in causal inference

Isqeel Ogunsola, Olatunji Johnson · 2026

Spatial interference and spatial confounding are two major issues inhibiting precise causal estimates when dealing with observational spatial data. Moreover, the definition and interpretation of spatial confounding remain arguable in the literature. In this paper, our goal is to provide clarity, in a novel way, on misconceptions and issues around spatial confounding from a Directed Acyclic Graph (DAG) perspective and to disentangle direct spatial confounding, indirect spatial confounding, and spatial interference based on the bias induced on causal estimates. Also, existing analyses of spatial confounding bias typically rely on Normality assumptions for treatments and confounders, assumptions that are often violated in practice. Relaxing these assumptions, we derive analytical expressions for spatial confounding bias under more general distributional settings, using the Poisson distribution as an example. We show that the choice of spatial weights, the distribution of the treatment, and the magnitude of interference critically determine the extent of bias due to spatial interference. We further demonstrate that direct and indirect spatial confounding can be disentangled, with both the weight matrix and the nature of exposure playing central roles in determining the magnitude of indirect bias. Theoretical results are supported by simulation studies and an application to real-world spatial data. In the future, parametric frameworks for concomitantly adjusting for spatial interference and for direct and indirect spatial confounding, for both direct and mediated effects estimation, will be developed.

Observational · Preprint · Wiki · Moderate

Nonparametric efficient inference for network quantile causal effects under partial interference

Chao Cheng, Fan Li · 2026

Interference arises when the treatment assigned to one individual affects the outcomes of other individuals. Commonly, individuals are naturally grouped into clusters, and interference occurs only among individuals within the same cluster, a setting referred to as partial interference. We study network causal effects on outcome quantiles in the presence of partial interference. We develop a general nonparametric efficiency theory for estimating these network quantile causal effects, which leads to a nonparametrically efficient estimator. The proposed estimator is consistent and asymptotically normal with parametric convergence rates, while allowing for flexible, data-adaptive estimation of complex nuisance functions. We leverage a three-way cross-fitting procedure that avoids direct estimation of the conditional outcome distribution. Simulations demonstrate adequate finite-sample performance of the proposed estimators, and we apply the methods to a clustered observational study.

Study · Preprint · Wiki · Moderate

Missing data and cluster graphs: cluster-level missingness vs variable-level missingness

Willow Scott, Eugenio Valdano, Charles Assaad · 2026

Missing data is pervasive in many scientific domains such as public health, environmental science, and the social sciences. Recoverability from missing data is typically studied using fully specified variable-level missingness models even though, in many applications, only coarse structural information is available, for instance when variables are grouped into clusters due to limited knowledge or interpretability reasons. In this paper, we investigate recoverability from such abstract representations. We introduce two classes of cluster-based missingness graphs: the m-C-DMG, which retains variable-specific missingness indicators, and the cm-C-DMG, which aggregates missingness mechanisms at the cluster level. We formalize the notion of compatibility between these abstract graphs and underlying variable-level missingness models, and study how this abstraction affects the recoverability of probabilistic and causal queries. In particular, we give graphical conditions for recovering the joint distribution as well as graphical conditions for recovering a macro causal effect. Overall, our results clarify when cluster-level missingness information is sufficient for valid inference, and when finer-grained modeling is necessary.

Study · Preprint · Wiki · Moderate

Causal treatment effect decompositions with time-to-event outcomes under competing events

Mikko Valtanen, Tommi Härkänen, Jenni Lehtisalo +3 more · 2026

Inference about treatment effects for time-to-event outcomes is often obscured by the presence of competing events. A particularly complex situation arises when the treatment influences the occurrence of the competing event. A comprehensive assessment should then account for different mechanisms by which the treatment and the competing event together produce the apparent treatment effect. Here, we propose a decomposition of the treatment's effect on the event of interest (target), characterising how it arises due to four distinct mechanisms involving both the target and competing events. Based on a causal model, the decomposition relies on cross-world estimands reflecting counterfactual scenarios in which the treatment affects the two events as if set to conflicting levels. We specify exchangeability and consistency assumptions under which the decomposition can be estimated from observed data. We discuss how the new decomposition reveals the role of the competing event and serves as a basis for defining causally interpretable estimands in the presence of competing events. Finally, we demonstrate the use of the four-way decomposition with datasets from two randomised trials.

Study · Preprint · Wiki · Moderate

Leveraging heterogeneity for identifiability: Bayesian order-based learning of multiple DAGs

Hyunwoong Chang, Fariha Taskin · 2026

We propose a joint order-based scoring framework for causal structure learning of directed acyclic graph (DAG) models under heterogeneous data settings. We show that leveraging heterogeneity improves the accuracy of causal ordering estimation. In the most favorable case, the causal ordering is identifiable up to two permutations. Building on this framework, we propose an order-based Bayesian method for Gaussian DAG models and establish its theoretical properties in the high-dimensional regime. For posterior inference over the space of orderings, we introduce a random-to-random (R2R) proposal neighborhood for the Metropolis-Hastings algorithm, which is theoretically motivated and exhibits efficient mixing behavior. Simulation studies confirm the strong empirical performance of the proposed method, and an application to single-nucleus RNA sequencing data from major depressive disorder demonstrates practical utility.

Study · Preprint · Wiki · Moderate

Conformal Convolution and Monte Carlo Meta-learners for Predictive Inference of Individual Treatment Effects

Jef Jonkers, Jarne Verhaeghe, Glenn Van Wallendael +2 more · 2024 · 6 citations

Generating probabilistic forecasts of potential outcomes and individual treatment effects (ITE) is essential for risk-aware decision-making in domains such as healthcare, policy, marketing, and finance. We propose two novel methods, the conformal convolution T-learner (CCT) and the conformal Monte Carlo (CMC) meta-learner, that generate full predictive distributions of both potential outcomes and ITEs. Our approaches combine weighted conformal predictive systems with either analytic convolution of potential outcome distributions or Monte Carlo sampling, addressing covariate shift through propensity score weighting. In contrast to other approaches that allow the generation of potential outcome predictive distributions, our approaches are model agnostic, universal, and come with finite-sample guarantees of probabilistic calibration under knowledge of the propensity score. Regarding estimating the ITE distribution, we formally characterize how assumptions about potential outcomes' noise dependency impact distribution validity and establish universal consistency under independence noise assumptions. Experiments on synthetic and semi-synthetic datasets demonstrate that the proposed methods achieve probabilistically calibrated predictive distributions while maintaining narrow prediction intervals and having performant continuous ranked probability scores. Besides probabilistic forecasting performance, we observe significant efficiency gains for the CCT and CMC meta-learners compared to other conformal approaches that produce prediction intervals for ITE with coverage guarantees.

Study · Preprint · Wiki · Moderate

Comparison of meta-learners for estimating multi-valued treatment heterogeneous effects

Naoufal Acharki, Ramiro Lugo, Antoine Bertoncello +1 more · 2022 · 18 citations

Conditional Average Treatment Effects (CATE) estimation is one of the main challenges in causal inference with observational data. In addition to machine learning-based models, nonparametric estimators called meta-learners have been developed to estimate the CATE with the main advantage of not restricting the estimation to a specific supervised learning method. This task becomes, however, more complicated when the treatment is not binary, as some limitations of the naive extensions emerge. This paper looks into meta-learners for estimating the heterogeneous effects of multi-valued treatments. We consider different meta-learners, and we carry out a theoretical analysis of their error upper bounds as functions of important parameters such as the number of treatment levels, showing that the naive extensions do not always provide satisfactory results. We introduce and discuss meta-learners that perform well as the number of treatments increases. We empirically confirm the strengths and weaknesses of those methods with synthetic and semi-synthetic datasets.

Study · Preprint · Wiki · Moderate

Positive-definiteness in separable priors: effects on prior interpretability and inference

Jack Storror Carter, David Rossell · 2026

A popular class of priors for symmetric positive-definite matrices assumes independent entries and adds a truncation to ensure positive-definiteness. While conceptually simple and often computationally convenient, unless done carefully this truncation can have unintended effects. If the truncated prior or its margins are significantly different from their untruncated counterpart, then its interpretability may suffer, its shrinkage properties become harder to characterise, and posterior inference may be affected in unanticipated ways. We investigate the effect of the truncation both for dense and sparse matrices, and show how to set prior parameters such as the variance of off-diagonal entries such that said effect is mitigated as the matrix dimension grows. We pay particular attention to sparse inference where, unless prior parameters are set carefully, the truncated prior and hence its corresponding posterior assign systematically higher mass to sparser structures than the untruncated prior.

Study · Preprint · Wiki · Moderate

A formal approach to variable selection in difference-in-differences

Daniela Rodrigues, Laura A. Hatfield · 2026 · 0 citations

Difference-in-differences (DiD) identification relies mainly on a parallel trends assumption about untreated potential outcomes. Researchers often relax this assumption by assuming conditional parallel trends within units with the same covariate values. However, the process of selecting which covariates to include in this assumption is often ad hoc. We propose a formal approach to select the variables that support conditional parallel trends based on graphical criteria. We show that the parallel trends assumption is rarely justified without conditioning on covariates, and that unconditional and conditional parallel trends can conflict with one another. We also demonstrate that a time-invariant covariate with a time-invariant effect on the outcome, which might not ordinarily be considered a confounder in DiD, may be a useful conditioning variable. We clarify that adjustment for a post-treatment covariate depends on what causes that covariate to change. Extending our framework to multiple time periods, we distinguish between treatment type and rollout strategy and examine the problem of treatment-confounder feedback. On the estimation side, we argue that the difficulty of incorporating covariates in DiD, often framed as an estimator problem, is more accurately understood as a misalignment between the adjustment set used by the estimator and the adjustment set required for identification. This misalignment affects several popular estimation procedures, and resolving it requires not a change of estimator, but a change in how covariates enter the estimation procedure. We show how to achieve this alignment for all estimators we evaluate.
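
To make the contrast between unconditional and conditional parallel trends concrete, here is a minimal simulated sketch. It is not the authors' graphical selection procedure: the data-generating process, the binary covariate `x`, and the helper `did` are all assumptions made for illustration. The untreated trend depends on `x`, and so does treatment uptake, so parallel trends holds only within levels of `x`.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical data-generating process (illustrative assumption, not from the paper).
x = rng.binomial(1, 0.5, size=n)                  # binary pre-treatment covariate
d = rng.binomial(1, np.where(x == 1, 0.7, 0.3))   # treatment is more likely when x = 1
y_pre = 1.0 * x + rng.normal(size=n)
trend = 0.5 + 1.0 * x                             # untreated trend differs by x
true_att = 2.0
y_post = y_pre + trend + true_att * d + rng.normal(size=n)
delta = y_post - y_pre                            # within-unit change in the outcome


def did(delta, d):
    """Two-period DiD: mean change among treated minus mean change among untreated."""
    return delta[d == 1].mean() - delta[d == 0].mean()


# Unconditional DiD ignores x and absorbs the difference in trends.
unconditional = did(delta, d)

# Conditional DiD: estimate within each stratum of x, then average the
# stratum-specific estimates over the covariate distribution of the treated.
strata = [0, 1]
weights = [np.mean(x[d == 1] == s) for s in strata]
conditional = sum(w * did(delta[x == s], d[x == s]) for s, w in zip(strata, weights))

print(f"unconditional DiD: {unconditional:.2f}")
print(f"conditional DiD:   {conditional:.2f}  (true ATT = {true_att})")
```

In this setup the unconditional estimate is biased upward because treated units are concentrated where the untreated trend is steeper, while the stratified estimate targets the effect on the treated.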

Study · Preprint · Wiki · Moderate

External Validity: From Do-Calculus to Transportability Across Populations

Judea Pearl, Elias Bareinboim · 2015 · 374 citations

The generalizability of empirical findings to new environments, settings or populations, often called "external validity," is essential in most scientific explorations. This paper treats a particular problem of generalizability, called "transportability," defined as a license to transfer causal effects learned in experimental studies to a new population, in which only observational studies can be conducted. We introduce a formal representation called "selection diagrams" for expressing knowledge about differences and commonalities between populations of interest and, using this representation, we reduce questions of transportability to symbolic derivations in the do-calculus. This reduction yields graph-based procedures for deciding, prior to observing any data, whether causal effects in the target population can be inferred from experimental findings in the study population. When the answer is affirmative, the procedures identify what experimental and observational findings need be obtained from the two populations, and how they can be combined to ensure bias-free transport.
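
As a worked illustration of what such a derivation can yield, the simplest case is often written as a reweighting formula. The notation here is an assumption for illustration: $P$ denotes the study population, $P^{*}$ the target population, and $Z$ a set of covariates that accounts for the differences between the two populations (an "S-admissible" set in the selection-diagram terminology).

```latex
% Transport formula: experimental, covariate-specific effects from the study
% population are reweighted by the covariate distribution observed in the target.
P^{*}\bigl(y \mid do(x)\bigr) \;=\; \sum_{z} P\bigl(y \mid do(x), z\bigr)\, P^{*}(z)
```

The first factor comes from the experiment in the study population, and the second requires only observational data from the target population. Other selection diagrams can lead to different formulas, or to no valid formula at all.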

Study · Preprint · Wiki · Moderate

Low-rank Covariate Balancing Estimators under Interference

Souhardya Sengupta, Kosuke Imai, Georgia Papadogeorgou · 2025 · 1 citation

A key methodological challenge in observational studies with interference between units is twofold: (1) each unit's outcome may depend on many others' treatments, and (2) treatment assignments may exhibit complex dependencies across units. We develop a general statistical framework for constructing robust causal effect estimators to address these challenges. We first show that, without restricting the patterns of interference, the standard inverse probability weighting (IPW) estimator is the only uniformly unbiased estimator when the propensity score is known. In contrast, no estimator has such a property if the propensity score is unknown. We then introduce a low-rank structure of potential outcomes as a broad class of structural assumptions about interference. This framework encompasses common assumptions such as anonymous, nearest-neighbor, and additive interference, while flexibly allowing for more complex study-specific interference assumptions. Under this low-rank assumption, we show how to construct an unbiased weighting estimator for a large class of causal estimands. The proposed weighting estimator does not require knowledge of true propensity scores and is therefore robust to unknown treatment assignment dependencies that often exist in observational studies. If the true propensity score is known, we can obtain an unbiased estimator that is more efficient than the IPW estimator by leveraging a low-rank structure. We establish the finite sample and asymptotic properties of the proposed weighting estimator, develop a data-driven procedure to select among candidate low-rank structures, and validate our approach through simulation and empirical studies.

Study · Preprint · Wiki · Moderate

Prediction-Intervention Games and Invariant Sets

Linus Kühne, Felix Schur, Jonas Peters · 2026

We consider the following two-player game: using observational data, the leader chooses a prediction function for a response variable $Y$ from given covariates. The follower then reacts with an intervention on some covariates in the underlying structural causal model to maximize their own objective. The leader knows the intervention targets, but may have limited knowledge of the follower's objective. We call this setup a prediction-intervention game, a special case of a Stackelberg game. Finding an optimal strategy for the leader is generally difficult. To avoid severe performance loss, the leader may base their prediction on the causal parents of $Y$, or more generally on an invariant subset of covariates. We prove, for two common classes of follower objectives, that predictors based on the stable blanket, a specific invariant subset, are always better or as good as those based on the causal parents. We further upper bound the leader's post-intervention risk by a worst-case risk over allowed interventions and strengthen existing distribution generalization results to analyze this bound: we give sufficient conditions under which stable-blanket predictors are worst-case optimal, and show by examples that these conditions cannot in general be dropped. Finally, we discuss practical strategies for settings with known and unknown graph, and test them on simulated and real-world data.

Study · Preprint · Wiki · Moderate

Inference for Fréchet Regression

Wookyeong Song, Paromita Dubey, Hans-Georg Müller +1 more · 2026

Linear regression is widely used to model relationships between responses and predictors. In modern applications, one encounters data where the responses are non-Euclidean random objects situated in a metric space, paired with Euclidean predictors. Global Fréchet regression generalizes linear regression to such general settings; however, statistical inference has remained largely unexplored. We develop a significance test for the null hypothesis that the Fréchet regression function does not depend on the predictors, addressing the challenge of an absence of linear operations in metric spaces. We also develop a test for the partial effect of a subset of the predictors in analogy to, but quite different from, the partial F-tests commonly used in classical linear regression under Gaussian assumptions. Key ideas are to employ random multipliers to obtain non-degenerate null distributions for the proposed test statistics and the Cauchy combination method. We obtain consistency and convergence results under the null hypothesis and contiguous alternatives and demonstrate the finite sample performance of the proposed tests through simulations on network data represented by graph Laplacians and spherical data with geodesic distances. We further illustrate our method using transport networks arising from New York City taxi trip data and U.S. energy source compositional data.

Study · Preprint · Wiki · Moderate

Selecting Informative Conformal Prediction Sets with an Optimized FCR-Controlled Approach

Israela Solomon, Etienne Roquain, Saharon Rosset +1 more · 2026

Conformal methods provide prediction sets for outcomes with confidence guarantees. We study their use in a selective inference setting, where inference is performed only when the prediction set is informative. The analyst may consider as informative, for example, cases with prediction sets that are sufficiently small, exclude null values, or satisfy other appropriate monotone constraints. Because inference is typically restricted to informative cases in practical applications, accounting for the resulting selection bias is crucial to maintaining false coverage rate (FCR) control. A general framework for constructing such informative conformal prediction sets while controlling the FCR on the selected sample was suggested in Gazin et al. (2025). In this work we focus on oracle-guided procedures. We derive the optimal decision policy under a suitable power objective in the oracle setting where the probability of belonging to each prediction set can be computed. In practice, of course, only estimated probabilities are available. We therefore introduce a calibration procedure that adjusts the oracle policy to maintain finite sample FCR control. We show that this approach can achieve substantially higher power than available alternatives. We demonstrate the effectiveness of our new methods for classification outcomes on both real and simulated data.

Study · Preprint · Wiki · Moderate

The Partial Testimony of Logs: Evaluation of Language Model Generation under Confounded Model Choice

Jikai Jin, Vasilis Syrgkanis · 2026 · 0 citations

Offline evaluation of language models from usage logs is biased when model choice is confounded: the same user-side factors that influence which model is used can also influence how its output is judged, so raw comparisons of logged scores mix self-selected populations rather than estimating a common quantity of interest. A small randomized experiment can break this bias by overriding model choice, but in practice such experiments are scarce and costly. We study a three-source design that combines a large confounded observational log (OBS) for scale, a small randomized experiment (EXP) for unconfounded scoring, and an offline simulator (SIM) that replays candidate models on cached contexts. Our main result is an identification theorem showing that the randomized experiment and the simulator are together enough to recover causal model values; the observational log enters only afterward, to reduce estimation error rather than to make the causal comparison valid. Six estimator families are evaluated in a controlled semi-synthetic validation and in two real-task cached benchmarks for summarization and coding. No family dominates every regime; relative performance depends on the amount of unbiased EXP supervision and on how closely the target reward aligns with OBS-derived structure.

Observational · Preprint · Wiki · Moderate

Targeted maximum likelihood estimation of vaccine effectiveness and immune correlates in test-negative design studies with missing data

Leah I. B. Andrews, Lars van der Laan, Peter B. Gilbert · 2026

The test-negative design (TND) is a resource-efficient observational study design that can assess vaccine effectiveness and exposure-proximal immune correlates of disease. The TND enrolls symptomatic individuals seeking diagnostic testing and compares case status by an exposure variable, such as vaccination status or immune marker level, that is measured at testing. While the TND reduces confounding by healthcare-seeking behavior, other sources of confounding may remain. TND studies may also have missing data in the exposure variable due to incomplete records or two-phase sampling designs. We present a targeted maximum likelihood estimation approach involving a semiparametric logistic regression model that targets a causal conditional risk ratio of symptomatic disease in the healthcare-seeking population. Under causal and missing at random assumptions, our method produces an efficient, asymptotically linear estimator that provides flexible, data-driven confounding control and valid causal inference when analyzing TND studies with missing exposure variable data. We evaluate our method's finite sample properties using plasmode simulations of a two-phase TND immune correlates study. We also apply our method to assess COVID-19 vaccine effectiveness and antibody marker correlates of COVID-19 from TND study cohorts derived from the Moderna Coronavirus Efficacy phase 3 trial.

Study · Preprint · Wiki · Moderate

A Unified Framework for Structure-Aware Clustering and Heterogeneous Causal Graph Learning

Honglin Du, Muxuan Liang, Xiang Zhong · 2026 · 1 citation

In complex multivariate systems, interactions among variables are defined by dependency structures, often encoded as directed acyclic graphs (DAGs). However, dependency structures can vary across subjects, and ignoring this structural heterogeneity introduces bias and obscures subpopulation-specific dependencies. To address this, we propose Directed Acyclic Graph-based Dependency Clustering via Alternating Direction Method of Multipliers (DAG-DC-ADMM), a unified framework built upon Structural Equation Modeling (SEM) that jointly learns cluster assignments and cluster-specific dependency structures. We encode acyclicity via a smooth constraint and integrate a groupwise truncated Lasso fusion penalty (gTLP) to cluster subjects based on their structural similarity. This yields a nonconvex optimization problem that incorporates sparsity, acyclicity, and structural consensus constraints. We address the nonconvexity by using the augmented Lagrangian method and solve it with an adapted version of the Alternating Direction Method of Multipliers (ADMM) for difference-of-convex programs. For certain graph structures, such as upper triangular adjacency matrices, our algorithm is guaranteed to converge to a Karush-Kuhn-Tucker (KKT) point. Experiments demonstrate that our method recovers cluster-specific causal dependency structures with a high true positive rate and a low false discovery rate. This capability enables the robust discovery of heterogeneous dependencies across subjects where the subpopulation label is unknown.

Study · Preprint · Wiki · Moderate

Causal State-Dependent Local Projections

Joel M. David, Raffaella Giacomini, Xiyu Jiao +1 more · 2026

State-dependent local projections (LPs) are widely used to estimate how impulse responses to exogenous aggregate shocks vary as a function of observable state variables, yet their causal interpretation remains unclear. We show that LPs recover causal impulse responses under the sufficient condition that the conditional mean is linear in the aggregate shock at each horizon, and that this condition holds in a broad class of canonical micro-macro environments, including first-order perturbation solutions of heterogeneous-agent macro and macro-finance models. We further show that the commonly used linear interaction LPs generally fail to recover causal objects. We therefore develop a sieve-based LP estimator that recovers the causal responses and delivers valid pointwise and uniform inference in micro-macro panels. Empirically, allowing for flexible state dependence materially changes both the pattern of heterogeneous firm investment responses and their aggregate implications for the transmission of monetary policy shocks. Our findings thus place state-dependent LPs on firmer causal footing in micro-macro settings than in purely aggregate ones, provided state dependence is estimated nonparametrically.

Study · Preprint · Wiki · Moderate

Relaxation of Projected Prior with Continuous Gap Shrinkage

Leo L Duan, Sunghyun Cho, Mingzhang Yin · 2026

Projected priors were originally introduced to accommodate parameter constraints, but have recently regained popularity due to their ability to assign probability mass to low-dimensional parameter sets, such as the spaces of sparse vectors, directed acyclic graphs, or transport plans. When employed as a transformation of random variables, projection is especially useful, since its contraction property not only preserves probability concentration, but also often preserves differentiability for gradient-based posterior computation. On the other hand, unless the projection can be obtained by some non-iterative algorithm, posterior computation can be expensive because it requires nesting an iterative optimization routine within each Markov chain Monte Carlo iteration. In this article, inspired by the success of continuous shrinkage models as replacements for discrete spike-and-slab priors, we propose a continuous relaxation of projected priors. The key idea is to quantify the duality gap between the primal projection loss and the dual objective, and impose a probabilistic prior that shrinks this gap toward zero. The resulting gap-shrinkage prior has a tractable form, does not require running an optimization subroutine inside each posterior update, and puts probability mass near the exact projection. We demonstrate useful properties of gap-shrinkage priors, including connections to global-local shrinkage priors, broad applicability to generalized projection functions, and competitive performance in posterior contraction. We apply the gap-shrinkage model to a marketing data analysis aimed at identifying important predictor effects on multivariate grocery-shopping decisions.

Study · Preprint · Wiki · Moderate

Learning Gaussian Graphical Models under Total Positivity via Spectral Graph Sparsification

Ignacio Echave-Sustaeta Rodríguez, Aida Abiad, Frank Röttger · 2026 · 35 citations

Many practical data analysis tasks reduce to learning, from observed samples, how a collection of variables depend on each other. A widely used approach is to fit a Gaussian graphical model, which represents the dependence structure as a graph connecting the variables. In a number of important applications, such as financial returns, gene co-expression, and climate or network analysis, the dependencies tend to be positive: variables move together rather than offset each other. Encoding this positivity through the constraint of multivariate total positivity of order two (MTP2) yields an attractive estimator that produces accurate fits with no tuning required. The resulting graphs are, however, typically much denser than the underlying ground-truth model, which makes them hard to interpret and slow to use in any downstream task that operates on the graph. In this work, we propose a novel highly-scalable approach for learning Gaussian graphical models from data using spectral sparsification; we call it Spectral-MTP2. Spectral graph sparsification is a fundamental method which aims to preserve meaningful properties of a dense graph with a sparser subgraph. We theoretically and empirically investigate and validate our method, and show that learning Gaussian Graphical Models under MTP2 using spectral sparsification preserves MTP2 and approximates well the original model in terms of Kullback-Leibler divergence and Gaussian log-likelihood. In simulations and applications to equity returns and gene expression, we observe that Spectral-MTP2 retains most of the fit quality of the denser MTP2 baseline, while producing substantially sparser and more interpretable graphs.

Study · Preprint · Wiki · Moderate

Application of Propensity Score Models and Causal Estimators in Observational Studies under Model Misspecification

Apu Chandra Das, Sakib Salam, Md Robiul Islam Talukder +3 more · 2026

Propensity score (PS) methods are widely used in observational studies to reduce confounding and estimate causal treatment effects. However, the validity of PS-based causal estimators depends heavily on correct model specification, and model misspecification may lead to substantial bias and instability. In this study, we systematically evaluate the performance of commonly used causal estimators, including response surface modeling (RSM), inverse probability weighting (IPW), and augmented inverse probability weighting (AIPW), under varying levels of PS and outcome model misspecification. We compare classical logistic regression with several machine learning approaches for PS estimation, including random forests (RF), support vector machines (SVM), and linear discriminant analysis (LDA). Extensive simulation studies were conducted under multiple scenarios defined by combinations of correctly specified and misspecified PS and outcome models, varying sample sizes, and different covariate correlation structures. Estimator performance was assessed using bias, absolute bias, root mean squared error, empirical standard error, and confidence interval width. Results demonstrate that AIPW consistently provides robust and stable estimates across most scenarios due to its doubly robust property, whereas IPW is highly sensitive to PS misspecification and unstable PS estimates produced by flexible machine learning methods. RSM performs well only when the outcome model is correctly specified. Real-world applications using the ACTG175 clinical trial and the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset further illustrate the practical implications of estimator choice and PS modeling strategy. Overall, our findings highlight the importance of integrating flexible machine learning approaches within doubly robust frameworks to improve causal effect estimation in observational studies.
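
The doubly robust property mentioned above can be seen in a small simulation. The following is a minimal sketch under stated assumptions, not a reproduction of the study's design: the data-generating process, the function names `ipw_ate` and `aipw_ate`, and the choice to plug in the true propensity score alongside a deliberately misspecified outcome model are all illustrative.

```python
import numpy as np


def ipw_ate(y, t, e):
    """Inverse probability weighting estimate of the ATE, given propensity scores e."""
    return np.mean(t * y / e - (1 - t) * y / (1 - e))


def aipw_ate(y, t, e, mu1, mu0):
    """Augmented IPW: outcome-model contrast plus a weighted residual correction."""
    return np.mean(
        mu1 - mu0
        + t * (y - mu1) / e
        - (1 - t) * (y - mu0) / (1 - e)
    )


# Hypothetical simulation (illustrative assumption): one confounder x,
# a logistic propensity score, and a true average treatment effect of 2.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
e_true = 1.0 / (1.0 + np.exp(-0.8 * x))
t = rng.binomial(1, e_true)
y = 2.0 * t + 1.5 * x + rng.normal(size=n)

# Deliberately misspecified outcome model: it ignores x and predicts arm means.
mu1_bad = np.full_like(y, y[t == 1].mean())
mu0_bad = np.full_like(y, y[t == 0].mean())

naive = y[t == 1].mean() - y[t == 0].mean()           # confounded comparison
ipw = ipw_ate(y, t, e_true)                           # correct PS only
aipw = aipw_ate(y, t, e_true, mu1_bad, mu0_bad)       # correct PS, wrong outcome model

print(f"naive: {naive:.2f}  IPW: {ipw:.2f}  AIPW: {aipw:.2f}  (truth: 2.00)")
```

Here the AIPW estimate stays close to the truth even though the outcome model is wrong, because the propensity score is correct. The reverse case, a correct outcome model with a wrong propensity score, is protected in the same way, while misspecifying both removes the guarantee.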

Study · Preprint · Wiki · Moderate

Controlling False Discovery in Arbitrarily Structured Hypothesis Spaces via Reproducing Kernels

Binyamin Perets, Shie Mannor · 2026

Large-scale hypothesis testing is central to modern science, where controlling the False Discovery Rate (FDR) has become the standard approach to managing false positives across many simultaneous tests. Hypotheses rarely exist in isolation; they often exhibit structure through proximity, connectivity, or hierarchy. This structure represents both a challenge and an opportunity: while classical methods treat these dependencies as obstacles requiring conservative correction, leveraging them can substantially increase discovery power. Here, we reframe structured FDR control as a regularized learning problem. By optimizing within a suitable Reproducing Kernel Hilbert Space (RKHS), we introduce a framework that unifies continuous domains, graphs, and hierarchies under a single algorithm through kernel choice alone. This formulation enables smooth solutions in place of the piecewise-constant fits of prior methods, principled likelihood-based hyperparameter selection rather than heuristic tuning, and inference at unobserved locations which in turn supports sample-efficient experimental design. Building on this estimator, we provide two decision rules which we prove to control the FDR. We validate our method on two sources: spatial locations derived from high-dimensional real-world datasets, and a differential gene expression task utilizing protein-protein interaction graphs.

RCT · Preprint · Wiki · Moderate

Budget-Constrained Causal Bandits: Bridging Uplift Modeling and Sequential Decision-Making

Abhirami Pillai · 2026

Treatment allocation under budget constraints is a central challenge in digital advertising: advertisers must decide which users to show ads to while spending a limited budget wisely. The standard approach follows a two-stage offline pipeline: first collect historical data to estimate heterogeneous treatment effects (HTE), then solve a constrained optimization to allocate the budget. This works well with abundant data, but fails in cold-start settings such as new campaigns, new markets, or new customer segments where little historical data exists. We propose Budget-Constrained Causal Bandits (BCCB), an online framework that learns which users respond to ads while simultaneously spending the budget, making treatment decisions one user at a time. BCCB unifies three components into a single sequential process: learning individual-level ad effectiveness, exploring users whose response is uncertain, and pacing the budget over time. We evaluate BCCB on the Criteo Uplift dataset, a large-scale advertising dataset from a real randomized controlled trial. Our key finding is a data-efficiency crossover: offline methods require approximately 10,000 historical observations to produce reliable results, while BCCB operates effectively from the very first user. Furthermore, BCCB exhibits 3-5x lower performance variance between runs, making it more practical for real campaign planning. Among purely online methods, BCCB consistently outperforms standard Thompson Sampling, budgeted Thompson Sampling, and greedy HTE estimation across all budget levels tested.

RCT · Preprint · Wiki · Moderate

Assessing Estimate of CATE from Observational Data via an RCT Study

Bosen Cui, Yuhong Yang · 2026

Conditional average treatment effects (CATEs) are increasingly estimated from observational data and used to guide policy and individualized treatment decisions. Before such estimates can be trusted in practice, their predictive fitness needs to be assessed, yet observational data alone offer limited opportunities for doing so. We propose CATE Assessment via Fitness Evaluation (CAFE), a formal framework for directly assessing the goodness-of-fit of a CATE estimate learned from observational data, rather than the full underlying outcome model, using evidence from a randomized trial. CAFE partitions the trial covariate space according to estimated propensity scores (or the like) and compares observationally derived conditional treatment effects with group-level experimental averages. The framework accommodates a broad class of CATE learners, including parametric models and flexible machine learning methods such as causal forest and boosting. We establish theoretical guarantees under both the null and alternative hypotheses, and introduce a maximum-type extension to improve sensitivity to localized lack of fit. When both randomized trial and observational data are available, we further develop a two-stage procedure to detect the existence of unobserved confounders. Extensive numerical studies show the utility of the CAFE approach when assessing observationally derived CATE estimates.

Study · Preprint · Wiki · Moderate

Difference-in-differences with a mediator

Yuhao Deng, Haoyu Wei, Zhongzhe Ouyang · 2026 · 0 citations

Causal mediation analysis is a powerful tool for disentangling the total effect of a treatment into its direct effect on the outcome and its indirect effect mediated through an intermediate variable. However, in observational studies, confounding between treatment and potential outcomes typically renders the total and natural effects non-identifiable. In this work, we advance mediation analysis within the difference-in-differences framework. Under a mediator-adjusted parallel trends assumption and additional conditions, we demonstrate that natural indirect, direct, and total effects are identifiable in the treated group. We further derive efficient influence functions for these estimands, enabling the construction of multiply robust and nonparametrically efficient estimators. We establish the asymptotic properties of these estimators. Applying our methodology to data from the Job Corps Study, we find that job training significantly increases both short-term and long-term earnings, after controlling for the indirect effect through the proportion of weeks employed.
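
For orientation, the canonical two-group, two-period difference-in-differences contrast that this line of work builds on can be sketched as below. This is only the textbook estimator under plain parallel trends, not the mediator-adjusted, multiply robust estimator the abstract describes; the function name `did_estimate` and the toy numbers are illustrative assumptions.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Textbook 2x2 difference-in-differences.

    Returns (change in treated mean) minus (change in control mean).
    Under parallel trends this estimates the average effect on the treated.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up outcomes, for illustration only.
effect = did_estimate(
    treated_pre=[10.0, 12.0],
    treated_post=[15.0, 17.0],
    control_pre=[9.0, 11.0],
    control_post=[11.0, 13.0],
)
print(effect)
```

The control group's change stands in for what the treated group would have experienced without treatment, which is why the estimate is only as credible as the parallel trends assumption.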

Study · Preprint · Wiki · Moderate

Conditioning Gaussian Processes on Almost Anything

Henry Moss, Lachlan Astfalck, Thomas Cowperthwaite +5 more · 2026

Gaussian processes (GPs) offer a principled probabilistic model over functions, but exact inference is restricted to the linear-Gaussian regime. We establish an explicit equivalence between GPs and a class of linear diffusion models, recasting predictive sampling as an ODE with closed-form Gaussian dynamics and a likelihood-dependent guidance term that admits a simple Monte Carlo approximation. In the linear-Gaussian setting, we recover standard GP conditioning exactly; beyond conjugacy, the same machinery handles any conditioning statement admitting point-wise likelihood evaluation -- including non-linear physics, and, for the first time, natural language via large language models. Whitening isolates the irreducible non-Gaussian dynamics, minimising Wasserstein-2 transport cost and eliminating numerical stiffness. The result is a general-purpose GP inference scheme requiring no bespoke derivations. Together, these results provide a general mechanism for incorporating the full richness of real-world knowledge as conditioning information, opening a new frontier for the probabilistic modelling of real-world problems.

Study · Preprint · Wiki · Moderate

Substantive-Model-Compatible Multiple Imputation for Cox Regression with a Diverging Number of Covariates

Zhilin Zhang, Yi Li · 2026

Modern biomedical survival studies with high-dimensional genomic and clinical predictors are challenged by missing covariates. Existing methods conduct inference through penalization and debiasing when the number of covariates diverges with sample size, but they are typically developed with fully observed covariates. Conversely, substantive-model-compatible multiple imputation methods, particularly substantive-model-compatible fully conditional specification (SMC-FCS), provide principled handling of missing covariates while preserving compatibility with the Cox model, yet current methodology and theory remain largely restricted to fixed-dimensional settings. To address these limitations, we propose a semiparametric multiple imputation framework for inference in Cox regression with missing covariates of a diverging dimension. Missing covariates are imputed through a high-dimensional SMC-FCS procedure driven by Cox-model likelihood contributions, with rejection sampling used to enforce substantive-model compatibility and ridge-regularized posterior draws used to stabilize the imputation models. The algorithm stabilizes the Cox estimator through an imputation-regularized optimization iteration and then generates multiply imputed datasets from a stabilized chain. Inference for low-dimensional linear functionals or contrasts, $c^\top \beta$, is obtained by combining debiased estimators and within-imputation variance estimates through Rubin's rules. We establish consistency and asymptotic normality of the resulting pooled estimator under a diverging-dimensional regime. Simulation studies demonstrate favorable finite-sample performance, and an application to the Boston Lung Cancer Survival Cohort illustrates the practical utility of the proposed method for high-dimensional survival studies with incomplete covariates.
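
The pooling step mentioned in the abstract, Rubin's rules, can be sketched for a single scalar quantity as follows. This shows only the standard combining formula (pooled estimate, within- and between-imputation variance); it does not reproduce the paper's imputation or debiasing procedure, and the function name `rubin_pool` and the example numbers are illustrative assumptions.

```python
from statistics import mean, variance


def rubin_pool(estimates, within_variances):
    """Combine per-imputation results for one scalar with Rubin's rules.

    estimates:        point estimate from each of the m imputed datasets
    within_variances: estimated variance of each of those point estimates
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = mean(estimates)           # average of the m point estimates
    within = mean(within_variances)    # average within-imputation variance
    between = variance(estimates)      # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up results from m = 3 imputed datasets.
pooled, total = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total)
```

The factor `1 + 1 / m` inflates the between-imputation variance to account for using a finite number of imputations.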

Study · Preprint · Wiki · Moderate

Recent Advances in Causal Analysis of the Stochastic Frontier Model

Samuele Centorrino, Christopher F. Parmeter · 2026

Causal inference methods (instrumental variables, difference-in-differences, regression discontinuity, etc.) are primary tools used across many social science milieus. One area where their application has lagged, however, is the study of productivity and efficiency. A main reason for this is that the nature of the stochastic frontier model does not immediately lend itself to a causal framework when interest hinges on an error component of the model. This paper reviews the nascent literature on attempts to merge the stochastic frontier literature with causal inference methods. We discuss modeling approaches and empirical issues that are likely to be relevant for applied researchers in this area. This review shows how this model can be easily put within the confines of causal analysis, reviews existing work that has already made inroads in this area, addresses challenges that have yet to be met and discusses core findings.

Study · Preprint · Wiki · Moderate

Chained Markov melding using divide and conquer sequential Monte Carlo

Yixuan Liu, Robert J. B. Goudie · 2026

Specifying a full Bayesian model that integrates multiple data sources can be challenging. One natural approach is to specify each individual model separately and join them afterwards. This is the approach adopted in Markov melding. However, when adjacent submodels share common quantities, as in chained Markov melding, posterior inference can be challenging for existing MCMC-based approaches. In this paper, we propose a new multi-stage sampler for chained Markov models involving an arbitrary number of submodels. The proposed sampler adopts a divide-and-conquer sequential Monte Carlo approach for the tree-structured model that fits naturally with the structure of chained Markov melding. The resulting multi-stage sampler provides a flexible alternative for sampling from complex joint models, as its separate sampling scheme for different submodels avoids the need for directly sampling from the full model. We demonstrate applications of the sampler through two examples. The first is a toy example involving 11 submodels of various types. The second example considers an ecologically integrated population model that combines multiple datasets to estimate immigration and reproduction rates.

Study · Preprint · Wiki · Moderate

Inference for Linear Systems with Unknown Coefficients

Yuehao Bai, Kirill Ponomarev, Andres Santos +3 more · 2026

This paper considers the problem of testing whether there exists a solution satisfying certain non-negativity constraints to a linear system of equations. Importantly and in contrast to some prior work, we allow all parameters in the system of equations, including the slope coefficients, to be unknown. For this reason, we describe the linear system as having unknown (as opposed to known) coefficients. This hypothesis testing problem arises naturally when constructing confidence sets for possibly partially identified parameters in the analysis of nonparametric instrumental variables models, treatment effect models, and random coefficient models, among other settings. To rule out certain instances in which the testing problem is impossible, in the sense that the power of any test will be bounded by its size, we begin our analysis by characterizing the closure of the null hypothesis with respect to the total variation distance. We then use this characterization to develop novel testing procedures based on sample-splitting. We establish the validity of our testing procedures under weak and interpretable conditions on the linear system. An important feature of these conditions is that they permit the dimensionality of the problem to grow rapidly with the sample size. A further attractive property of our tests is that they do not require simulation to compute suitable critical values. We illustrate the practical relevance of our theoretical results in a simulation study.

Study · Preprint · Wiki · Moderate

Everywhere Valid Bounds on False Discovery Proportions in Conformal Inference

Ziang Song, Ying Jin, Emmanuel J. Candès · 2026

Modern applications of conformal inference to multiple testing problems, such as outlier detection and candidate selection, often involve selecting test samples whose conformal p-values fall below a threshold. The quality of such methods is often measured by the false discovery proportion (FDP), defined as the fraction of incorrect selections. Existing approaches typically control the expected value of the FDP, using methods such as the Benjamini-Hochberg procedure. This approach fails to provide high-probability bounds on the realized false discovery proportion and invalidates statistical guarantees if the rejection threshold is selected after inspecting the data. This paper establishes finite-sample, distribution-free upper bounds on the FDP that hold simultaneously over all possible rejection thresholds, enabling arbitrary post hoc selection of the threshold. Simultaneous validity is achieved by constructing a high-probability envelope for the empirical distribution function of null conformal p-values by sampling from their joint distribution. Furthermore, our framework allows practitioners to modulate the envelope's shape, thereby producing tight bounds in rejection regions of primary interest. We use this flexible approach to derive simultaneous FDP upper bounds for both outlier detection and conformal selection. We demonstrate through synthetic and real-data experiments that the resulting bounds are both valid and substantially less conservative than those derived from existing approaches.
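
As background for the baseline named in the abstract, the Benjamini-Hochberg step-up procedure and the realized false discovery proportion can be sketched as below. This illustrates only the existing expected-FDP approach that the paper contrasts itself with, not its simultaneous bounds; the function names, the p-values, and the set of true nulls are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Benjamini-Hochberg step-up procedure.

    Rejects the k smallest p-values, where k is the largest rank with
    p_(k) <= alpha * k / m. Returns the sorted indices of rejected hypotheses.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank  # keep the largest rank that passes its threshold
    return sorted(order[:k])


def false_discovery_proportion(rejected, null_indices):
    """Fraction of rejections that are true nulls (0.0 if nothing is rejected)."""
    if not rejected:
        return 0.0
    nulls = set(null_indices)
    return sum(1 for i in rejected if i in nulls) / len(rejected)


# Made-up p-values; indices 1 and 3 are assumed to be true nulls.
p_values = [0.01, 0.04, 0.03, 0.5]
rejected = benjamini_hochberg(p_values, alpha=0.1)
fdp = false_discovery_proportion(rejected, null_indices=[1, 3])
print(rejected, fdp)
```

Benjamini-Hochberg controls the expectation of this proportion at a pre-specified `alpha`; the realized value in any one dataset can be larger, which is the gap the abstract's high-probability bounds are meant to address.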

Study · Preprint · Wiki · Moderate

A new class of functional conditional autoregressive models

Sooran Kim · 2026 · 1 citation

We introduce a new class of conditional autoregressive models for spatially dependent functional data, formulated through conditional means given neighboring functional observations and characterized by a covariance operator and a spatial dependence parameter. Our estimation strategy consists of three components: (i) estimating the covariance operator using conditionally centered data, (ii) estimating the spatial dependence parameter by maximizing the likelihood of projected observations, and (iii) applying a novel profile-based approach to obtain the final estimators. Under an expanding lattice framework, we establish two key theoretical results. First, we establish the consistency of the proposed covariance estimator, which is not attainable using naive methods based on marginally centered data. Second, we prove that the spatial dependence parameter estimator is superconsistent and asymptotically normal, where the latter property enables statistical inference for spatial dependence in functional data -- a contribution that is novel in the existing literature. Numerical studies support the theoretical results and demonstrate the computational efficiency of our method. Finally, we illustrate its practical utility by analyzing weekly PM$_{2.5}$ concentration trajectories in 2019 across counties in the Midwestern United States.
