Causal Estimation

Propensity scores, matching, IV, DID, RDD, synthetic control, and doubly-robust estimators.

Evidence briefs

Reviewed claims

Claim-level summaries connect a practical takeaway to the papers that actually support it.

High confidence · Published

Difference-in-differences (DiD) · positive · Bias in treatment effect estimates

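When treated and control groups would have followed parallel trends absent treatment, DiD removes bias from time-invariant unobserved confounders by comparing changes over time between the two groups. A simple before-after comparison confounds time trends with the treatment effect, and a cross-sectional comparison confounds fixed group differences with it.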

Population: Panel data settings with a treated and untreated group · Comparator: Simple before-after or cross-sectional comparisons

Primary evidence

Mostly Harmless Econometrics: An Empiricist's Companion
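The DiD claim can be checked on simulated two-period data. The sketch below is illustrative and not taken from the book: the data-generating process, the function names `simulate_did` and `did_estimates`, and every parameter value are assumptions made for this example. Treated units get a higher fixed level (a time-invariant confounder), every unit shares a common time trend, and parallel trends holds by construction.

```python
import numpy as np

def simulate_did(n=20000, effect=2.0, trend=1.0, seed=0):
    """Two-period panel with a time-invariant confounder (assumed DGP)."""
    rng = np.random.default_rng(seed)
    treated = rng.random(n) < 0.5
    # Unit fixed effect is higher for treated units: a time-invariant confounder.
    alpha = 3.0 * treated + rng.normal(size=n)
    y_pre = alpha + rng.normal(size=n)
    # Common trend hits everyone; the treatment effect hits treated units only.
    y_post = alpha + trend + effect * treated + rng.normal(size=n)
    return treated, y_pre, y_post

def did_estimates(treated, y_pre, y_post):
    """Return the two naive comparisons and the DiD estimate."""
    cross_section = y_post[treated].mean() - y_post[~treated].mean()
    before_after = (y_post[treated] - y_pre[treated]).mean()
    control_change = (y_post[~treated] - y_pre[~treated]).mean()
    return {
        "cross_section": float(cross_section),
        "before_after": float(before_after),
        "did": float(before_after - control_change),
    }

estimates = did_estimates(*simulate_did())
for name, value in estimates.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the cross-sectional gap absorbs the fixed group difference, the before-after change absorbs the common trend, and only the DiD estimate should land near the simulated effect of 2.0.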

High confidence · Published

Regression discontinuity (RD) · positive · Causal effect estimate validity

RD provides credible causal estimates by comparing outcomes just above and below the cutoff, mimicking a local randomized experiment, whereas naive comparisons suffer from confounding due to the assignment rule.

Population: Settings where treatment is assigned by a cutoff on a continuous variable · Comparator: Naive comparison of treated and untreated units far from cutoff

Primary evidence

Mostly Harmless Econometrics: An Empiricist's Companion
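A small simulation illustrates the contrast between a naive comparison and a comparison at the cutoff. This is a sketch under stated assumptions, not the book's procedure: the linear outcome model, the uniform-kernel local linear fit, the hand-picked bandwidth of 0.25, and the names `simulate_rd` and `local_linear_rd` are all choices made for this example.

```python
import numpy as np

def simulate_rd(n=5000, effect=1.0, slope=3.0, seed=0):
    """Sharp RD: treatment switches on when the running variable crosses 0."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)
    treated = x >= 0.0
    # The outcome depends on x directly, so treated units differ even without treatment.
    y = 0.5 + slope * x + effect * treated + rng.normal(scale=0.5, size=n)
    return x, treated, y

def local_linear_rd(x, y, cutoff=0.0, bandwidth=0.25):
    """Difference of two local linear intercepts at the cutoff (uniform kernel)."""
    def intercept_at_cutoff(mask):
        design = np.column_stack([np.ones(mask.sum()), x[mask] - cutoff])
        coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        return coef[0]

    right = (x >= cutoff) & (x < cutoff + bandwidth)
    left = (x < cutoff) & (x > cutoff - bandwidth)
    return float(intercept_at_cutoff(right) - intercept_at_cutoff(left))

x, treated, y = simulate_rd()
naive = float(y[treated].mean() - y[~treated].mean())
rd = local_linear_rd(x, y)
print(f"naive difference in means: {naive:.2f}")
print(f"local linear RD estimate:  {rd:.2f}")
```

The naive difference mixes the treatment effect with the slope of the outcome in the running variable, while the local fit should recover something close to the simulated jump of 1.0. In applied work the bandwidth would be chosen by a data-driven rule rather than fixed by hand.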

High confidence · Published

Difference-in-Differences with multiple time periods (new estimator) · positive · Bias in treatment effect estimates

The proposed estimator avoids bias from using already-treated units as controls, which occurs in standard DiD when treatment timing varies. The bias arises because treatment effects may change over time, and the new method constructs valid comparison groups of not-yet-treated units.

Population: Panel data settings with staggered treatment adoption (groups treated at different times) and multiple time periods · Comparator: Standard Difference-in-Differences (two-period, two-group) or older methods that compare treated vs untreated across all periods

Primary evidence

Difference-in-Differences with multiple time periods
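The role of the comparison group can be sketched with a simulated staggered panel. Everything here is an assumption made for illustration rather than the paper's implementation: the three cohorts, the effect that grows by one unit per period of exposure, the use of period g - 1 as the base period, and the names `simulate_staggered` and `att_gt`.

```python
import numpy as np

NEVER = np.inf  # marker for never-treated units (a convention assumed here)

def simulate_staggered(n_per_cohort=4000, n_periods=5, seed=0):
    """Balanced panel with cohorts first treated in period 2, period 3, or never."""
    rng = np.random.default_rng(seed)
    cohort = np.repeat([2.0, 3.0, NEVER], n_per_cohort)
    periods = np.arange(n_periods)
    unit_effect = rng.normal(size=cohort.size) + (cohort == 2.0)
    # Periods of exposure: 1 in the adoption period, 2 in the next one, and so on.
    exposure = periods[None, :] - cohort[:, None] + 1.0
    effect = np.where(exposure >= 1.0, exposure, 0.0)  # effect grows with exposure
    noise = rng.normal(scale=0.5, size=effect.shape)
    y = unit_effect[:, None] + 0.5 * periods[None, :] + effect + noise
    return cohort, y

def att_gt(cohort, y, g, t, controls):
    """Change from period g-1 to t for cohort g, minus the same change for controls."""
    change = y[:, t] - y[:, g - 1]
    return float(change[cohort == g].mean() - change[controls].mean())

cohort, y = simulate_staggered()
# Valid comparisons: controls are units not yet treated by period t.
att_2_2 = att_gt(cohort, y, g=2, t=2, controls=cohort > 2)
att_2_4 = att_gt(cohort, y, g=2, t=4, controls=cohort > 4)
att_3_3 = att_gt(cohort, y, g=3, t=3, controls=cohort > 3)
# Contaminated comparison: the already-treated cohort 2 serves as the control group.
contaminated_3_3 = att_gt(cohort, y, g=3, t=3, controls=cohort == 2)
print(att_2_2, att_2_4, att_3_3, contaminated_3_3)
```

In this setup the simulated effects are 1 for one period of exposure and 3 for three periods. The contaminated comparison should come out near 0 instead of 1, because the already-treated cohort's own effect is still growing and gets differenced out of the estimate.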

High confidence · Published

Two-step estimation strategy (group-time average treatment effects) · positive · Validity of parallel trends assumption and effect estimate consistency

The two-step approach allows parallel trends to hold conditional on covariates and does not impose constant treatment effects over time. Two-way fixed effects regression, by contrast, estimates a weighted average of treatment effects in which some weights can be negative, so the estimate can be negative even when every individual effect is positive.

Population: Staggered adoption designs with covariates · Comparator: Single-step regression with unit and time fixed effects (two-way fixed effects)

Primary evidence

Difference-in-Differences with multiple time periods
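The negative-weight problem in two-way fixed effects (TWFE) can be reproduced in a noise-free toy panel. The example below is an assumed illustration, not the paper's two-step estimator: it has two units, three periods, no covariates, and hand-picked effects, and the helper names `double_demean`, `twfe_coefficient`, and `twfe_weights` are hypothetical.

```python
import numpy as np

def double_demean(m):
    """Remove unit (row) and period (column) means from a balanced panel."""
    return m - m.mean(axis=1, keepdims=True) - m.mean(axis=0, keepdims=True) + m.mean()

def twfe_coefficient(y, d):
    """OLS coefficient on d after partialling out unit and period fixed effects."""
    d_tilde = double_demean(d)
    return float((d_tilde * double_demean(y)).sum() / (d_tilde ** 2).sum())

def twfe_weights(d):
    """Implicit weight TWFE places on each treated cell's effect (row-major order)."""
    d_tilde = double_demean(d)
    return d_tilde[d == 1] / (d_tilde ** 2).sum()

# Rows: early adopter, late adopter. Columns: periods 0, 1, 2.
d = np.array([[0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
# Every treated cell has a positive effect; the early adopter's effect grows.
tau = np.array([[0.0, 1.0, 4.0],
                [0.0, 0.0, 1.0]])
unit_fe = np.array([[2.0], [5.0]])
time_fe = np.array([[0.0, 0.5, 1.0]])
y = unit_fe + time_fe + tau * d

beta = twfe_coefficient(y, d)
weights = twfe_weights(d)
print("TWFE coefficient:", beta)
print("weights on treated cells:", weights)
```

Here the weights sum to one, but the early adopter's second treated period receives a negative weight, which drags the TWFE coefficient below zero although all three simulated effects are positive.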

High confidence · Published

Difference-in-differences (DiD) · positive · Causal effect estimate validity under parallel trends

DiD removes bias from time-invariant unobserved confounders by comparing changes over time between treated and untreated groups, yielding credible causal estimates when parallel trends hold.

Population: Panel data settings with treatment and control groups · Comparator: Simple before-after or cross-sectional comparisons

Primary evidence

Mostly Harmless Econometrics: An Empiricist's Companion

High confidence · Published

Regression discontinuity (RD) · positive · Causal effect estimate validity near the cutoff

RD provides unbiased causal estimates for units near the cutoff by exploiting the discontinuity in treatment assignment, mimicking a local randomized experiment.

Population: Settings where treatment is assigned by a cutoff on a continuous variable · Comparator: Global regression or naive comparison of means

Primary evidence

Mostly Harmless Econometrics: An Empiricist's Companion

Evidence base

50 papers

Book · Wiki · Canonical · High evidence score

Causal Inference: What If

Miguel A. Hernan, James M. Robins · Chapman & Hall/CRC · 2020

A book-length applied introduction to causal questions, target trials, time-varying treatment, confounding, selection, and potential-outcomes estimands.

Study · Preprint · Wiki · Canonical · Moderate

Estimation and Inference of Heterogeneous Treatment Effects using Random Forests

Stefan Wager, Susan Athey · 2015

Many scientific and engineering challenges -- ranging from personalized medicine to customized marketing recommendations -- require an understanding of treatment effect heterogeneity. In this paper, we develop a non-parametric causal forest for estimating heterogeneous treatment effects that extends Breiman's widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect, and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates.

Study · Preprint · Wiki · Canonical · Moderate

Double/Debiased Machine Learning for Treatment and Causal Parameters

Victor Chernozhukov, Denis Chetverikov, Mert Demirer +4 more · 2016

Most modern supervised statistical/machine learning (ML) methods are explicitly designed to solve prediction problems very well. Achieving this goal does not imply that these methods automatically deliver good estimators of causal parameters. Examples of such parameters include individual regression coefficients, average treatment effects, average lifts, and demand or supply elasticities. In fact, estimates of such causal parameters obtained via naively plugging ML estimators into estimating equations for such parameters can behave very poorly due to the regularization bias. Fortunately, this regularization bias can be removed by solving auxiliary prediction problems via ML tools. Specifically, we can form an orthogonal score for the target low-dimensional parameter by combining auxiliary and main ML predictions. The score is then used to build a de-biased estimator of the target parameter which typically will converge at the fastest possible 1/root(n) rate and be approximately unbiased and normal, and from which valid confidence intervals for these parameters of interest may be constructed. The resulting method thus could be called a "double ML" method because it relies on estimating primary and auxiliary predictive models. In order to avoid overfitting, our construction also makes use of the K-fold sample splitting, which we call cross-fitting. This allows us to use a very broad set of ML predictive methods in solving the auxiliary and main prediction problems, such as random forest, lasso, ridge, deep neural nets, boosted trees, as well as various hybrids and aggregators of these methods.
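The partialling-out recipe described in this abstract can be sketched for a partially linear model. The code is a minimal illustration under loud assumptions: cubic polynomial least squares stands in for the ML learners, the data-generating process and the true coefficient of 1.5 are invented for the example, and the names `fit_predict` and `dml_partially_linear` are hypothetical rather than any library's API.

```python
import numpy as np

def fit_predict(x_train, target_train, x_test, degree=3):
    """Stand-in nuisance learner: polynomial least squares (assumed, not ML)."""
    def features(x):
        return np.column_stack([x ** k for k in range(degree + 1)])
    coef, *_ = np.linalg.lstsq(features(x_train), target_train, rcond=None)
    return features(x_test) @ coef

def dml_partially_linear(y, d, x, n_folds=5, seed=0):
    """Cross-fitted residual-on-residual estimate of theta in y = theta*d + g(x) + e."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % n_folds
    y_res = np.empty_like(y)
    d_res = np.empty_like(d)
    for k in range(n_folds):
        test = folds == k
        train = ~test
        # Auxiliary predictions are fit on the other folds only (cross-fitting).
        y_res[test] = y[test] - fit_predict(x[train], y[train], x[test])
        d_res[test] = d[test] - fit_predict(x[train], d[train], x[test])
    # Orthogonal score for this model: regress outcome residuals on treatment residuals.
    return float((d_res @ y_res) / (d_res @ d_res))

rng = np.random.default_rng(1)
n, theta = 4000, 1.5
x = rng.uniform(-2.0, 2.0, size=n)
d = np.sin(x) + rng.normal(size=n)                      # treatment depends on x
y = theta * d + 2.0 * x + x ** 2 + rng.normal(size=n)   # x also drives the outcome

naive = float(np.cov(d, y)[0, 1] / np.var(d, ddof=1))   # ignores x entirely
theta_hat = dml_partially_linear(y, d, x)
print(f"naive slope: {naive:.2f}, cross-fitted estimate: {theta_hat:.2f}")
```

In this simulation the naive slope is pulled away from 1.5 because `x` drives both treatment and outcome, while the cross-fitted estimate should sit close to it. Swapping in a flexible learner for `fit_predict` is where the method's appeal lies, and that choice is left open here.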

RCT · High evidence score

A Survey on Causal Inference

Liuyi Yao, Zhixuan Chu, Sheng Li +3 more · ACM Transactions on Knowledge Discovery from Data · 2021 · 437 citations

Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy, and economics, for decades. Nowadays, estimating causal effect from observational data has become an appealing research direction owing to the large amount of available data and low budget requirement, compared with randomized controlled trials. Embraced with the rapidly developed machine learning area, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including the applications in advertising, recommendation, medicine, and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which facilitate researchers and practitioners to explore, evaluate and apply the causal inference methods.

Meta-analysis · High evidence score

On the Nuisance of Control Variables in Causal Regression Analysis

Paul Hünermund, Beyers Louw · Organizational Research Methods · 2023 · 98 citations

Control variables are included in regression analyses to estimate the causal effect of a treatment on an outcome. In this article, we argue that the estimated effect sizes of controls are unlikely to have a causal interpretation themselves, though. This is because even valid controls are possibly endogenous and represent a combination of several different causal mechanisms operating jointly on the outcome, which is hard to interpret theoretically. Therefore, we recommend refraining from interpreting the marginal effects of controls and focusing on the main variables of interest, for which a plausible identification argument can be established. To prevent erroneous managerial or policy implications, coefficients of control variables should be clearly marked as not having a causal interpretation or omitted from regression tables altogether. Moreover, we advise against using control variable estimates for subsequent theory building and meta-analyses.

RCT · High evidence score

Bayesian Regression Tree Models for Causal Inference: Regularization, Confounding, and Heterogeneous Effects (with Discussion)

P. Richard Hahn, Jared S. Murray, Carlos M. Carvalho · Bayesian Analysis · 2020 · 303 citations

This paper presents a novel nonlinear regression model for estimating heterogeneous treatment effects, geared specifically towards situations with small effect sizes, heterogeneous effects, and strong confounding by observables. Standard nonlinear regression models, which may work quite well for prediction, have two notable weaknesses when used to estimate heterogeneous treatment effects. First, they can yield badly biased estimates of treatment effects when fit to data with strong confounding. The Bayesian causal forest model presented in this paper avoids this problem by directly incorporating an estimate of the propensity function in the specification of the response model, implicitly inducing a covariate-dependent prior on the regression function. Second, standard approaches to response surface modeling do not provide adequate control over the strength of regularization over effect heterogeneity. The Bayesian causal forest model permits treatment effect heterogeneity to be regularized separately from the prognostic effect of control variables, making it possible to informatively “shrink to homogeneity”. While we focus on observational data, our methods are equally useful for inferring heterogeneous treatment effects from randomized controlled experiments where careful regularization is somewhat less complicated but no less important. We illustrate these benefits via the reanalysis of an observational study assessing the causal effects of smoking on medical expenditures as well as extensive simulation studies.

Study · Moderate

Revisiting Event-Study Designs: Robust and Efficient Estimation

Kirill Borusyak, Xavier Jaravel, Jann Spiess · The Review of Economic Studies · 2024 · 1,685 citations

We develop a framework for difference-in-differences designs with staggered treatment adoption and heterogeneous causal effects. We show that conventional regression-based estimators fail to provide unbiased estimates of relevant estimands absent strong restrictions on treatment-effect homogeneity. We then derive the efficient estimator addressing this challenge, which takes an intuitive “imputation” form when treatment-effect heterogeneity is unrestricted. We characterize the asymptotic behaviour of the estimator, propose tools for inference, and develop tests for identifying assumptions. Our method applies with time-varying controls, in triple-difference designs, and with certain non-binary treatments. We show the practical relevance of our results in a simulation study and an application. Studying the consumption response to tax rebates in the U.S., we find that the notional marginal propensity to consume is between 8 and 11% in the first quarter, about half as large as benchmark estimates used to calibrate macroeconomic models, and predominantly occurs in the first month after the rebate.

Observational · Moderate

A tutorial on propensity score estimation for multiple treatments using generalized boosted models

Daniel F. McCaffrey, Beth Ann Griffin, Daniel Almirall +3 more · Statistics in Medicine · 2013 · 1,479 citations

The use of propensity scores to control for pretreatment imbalances on observed variables in non-randomized or observational studies examining the causal effects of treatments or interventions has become widespread over the past decade. For settings with two conditions of interest such as a treatment and a control, inverse probability of treatment weighted estimation with propensity scores estimated via boosted models has been shown in simulation studies to yield causal effect estimates with desirable properties. There are tools (e.g., the twang package in R) and guidance for implementing this method with two treatments. However, there is not such guidance for analyses of three or more treatments. The goals of this paper are twofold: (1) to provide step-by-step guidance for researchers who want to implement propensity score weighting for multiple treatments and (2) to propose the use of generalized boosted models (GBM) for estimation of the necessary propensity score weights. We define the causal quantities that may be of interest to studies of multiple treatments and derive weighted estimators of those quantities. We present a detailed plan for using GBM to estimate propensity scores and using those scores to estimate weights and causal effects. We also provide tools for assessing balance and overlap of pretreatment variables among treatment groups in the context of multiple treatments. A case study examining the effects of three treatment programs for adolescent substance abuse demonstrates the methods.

Study · Moderate

Double/debiased machine learning for treatment and structural parameters

Victor Chernozhukov, Denis Chetverikov, Mert Demirer +4 more · Econometrics Journal · 2017 · 2,378 citations

We revisit the classic semi-parametric problem of inference on a low-dimensional parameter θ_0 in the presence of high-dimensional nuisance parameters η_0. We depart from the classical setting by allowing for η_0 to be so high-dimensional that the traditional assumptions (e.g. Donsker properties) that limit complexity of the parameter space for this object break down. To estimate η_0, we consider the use of statistical or machine learning (ML) methods, which are particularly well suited to estimation in modern, very high-dimensional cases. ML methods perform well by employing regularization to reduce variance and trading off regularization bias with overfitting in practice. However, both regularization bias and overfitting in estimating η_0 cause a heavy bias in estimators of θ_0 that are obtained by naively plugging ML estimators of η_0 into estimating equations for θ_0. This bias results in the naive estimator failing to be N^{-1/2}-consistent, where N is the sample size. We show that the impact of regularization bias and overfitting on estimation of the parameter of interest θ_0 can be removed by using two simple, yet critical, ingredients: (1) using Neyman-orthogonal moments/scores that have reduced sensitivity with respect to nuisance parameters to estimate θ_0; (2) making use of cross-fitting, which provides an efficient form of data-splitting. We call the resulting set of methods double or debiased ML (DML). We verify that DML delivers point estimators that concentrate in an N^{-1/2}-neighbourhood of the true parameter values and are approximately unbiased and normally distributed, which allows construction of valid confidence statements.
The generic statistical theory of DML is elementary and simultaneously relies on only weak theoretical requirements, which will admit the use of a broad array of modern ML methods for estimating the nuisance parameters, such as random forests, lasso, ridge, deep neural nets, boosted trees, and various hybrids and ensembles of these methods. We illustrate the general theory by applying it to provide theoretical properties of the following: DML applied to learn the main regression parameter in a partially linear regression model; DML applied to learn the coefficient on an endogenous variable in a partially linear instrumental variables model; DML applied to learn the average treatment effect and the average treatment effect on the treated under unconfoundedness; DML applied to learn the local average treatment effect in an instrumental variables setting. In addition to these theoretical applications, we also illustrate the use of DML in three empirical examples.

Study · Moderate

Mendelian randomisation for mediation analysis: current methods and challenges for implementation

Alice R Carter, Eleanor Sanderson, Gemma Hammerton +6 more · European Journal of Epidemiology · 2021 · 1,195 citations

Mediation analysis seeks to explain the pathway(s) through which an exposure affects an outcome. Traditional, non-instrumental variable methods for mediation analysis experience a number of methodological difficulties, including bias due to confounding between an exposure, mediator and outcome and measurement error. Mendelian randomisation (MR) can be used to improve causal inference for mediation analysis. We describe two approaches that can be used for estimating mediation analysis with MR: multivariable MR (MVMR) and two-step MR. We outline the approaches and provide code to demonstrate how they can be used in mediation analysis. We review issues that can affect analyses, including confounding, measurement error, weak instrument bias, interactions between exposures and mediators and analysis of multiple mediators. Description of the methods is supplemented by simulated and real data examples. Although MR relies on large sample sizes and strong assumptions, such as having strong instruments and no horizontally pleiotropic pathways, our simulations demonstrate that these methods are unaffected by confounders of the exposure or mediator and the outcome and non-differential measurement error of the exposure or mediator. Both MVMR and two-step MR can be implemented in both individual-level MR and summary data MR. MR mediation methods require different assumptions to be made, compared with non-instrumental variable mediation methods. Where these assumptions are more plausible, MR can be used to improve causal inference in mediation analysis.

Observational · Top journal · Moderate

Recursive partitioning for heterogeneous causal effects

Susan Athey, Guido W. Imbens · Proceedings of the National Academy of Sciences · 2016 · 1,523 citations

In this paper we propose methods for estimating heterogeneity in causal effects in experimental and observational studies and for conducting hypothesis tests about the magnitude of differences in treatment effects across subsets of the population. We provide a data-driven approach to partition the data into subpopulations that differ in the magnitude of their treatment effects. The approach enables the construction of valid confidence intervals for treatment effects, even with many covariates relative to the sample size, and without "sparsity" assumptions. We propose an "honest" approach to estimation, whereby one sample is used to construct the partition and another to estimate treatment effects for each subpopulation. Our approach builds on regression tree methods, modified to optimize for goodness of fit in treatment effects and to account for honest estimation. Our model selection criterion anticipates that bias will be eliminated by honest estimation and also accounts for the effect of making additional splits on the variance of treatment effect estimates within each subpopulation. We address the challenge that the "ground truth" for a causal effect is not observed for any individual unit, so that standard approaches to cross-validation must be modified. Through a simulation study, we show that for our preferred method honest estimation results in nominal coverage for 90% confidence intervals, whereas coverage ranges between 74% and 84% for nonhonest approaches. Honest estimation requires estimating the model with a smaller sample size; the cost in terms of mean squared error of treatment effects for our preferred method ranges between 7-22%.

Study · Moderate

A More Credible Approach to Parallel Trends

Ashesh Rambachan, Jonathan Roth · The Review of Economic Studies · 2023 · 1,117 citations

This paper proposes tools for robust inference in difference-in-differences and event-study designs where the parallel trends assumption may be violated. Instead of requiring that parallel trends holds exactly, we impose restrictions on how different the post-treatment violations of parallel trends can be from the pre-treatment differences in trends (“pre-trends”). The causal parameter of interest is partially identified under these restrictions. We introduce two approaches that guarantee uniformly valid inference under the imposed restrictions, and we derive novel results showing that they have desirable power properties in our context. We illustrate how economic knowledge can inform the restrictions on the possible violations of parallel trends in two economic applications. We also highlight how our approach can be used to conduct sensitivity analyses showing what causal conclusions can be drawn under various restrictions on the possible violations of the parallel trends assumption.

Study · Moderate

Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs

Sebastián Calónico, Matias D. Cattaneo, Rocío Titiunik · Econometrica · 2014 · 2,987 citations

In the regression-discontinuity (RD) design, units are assigned to treatment based on whether their value of an observed covariate exceeds a known cutoff. In this design, local polynomial estimators are now routinely employed to construct confidence intervals for treatment effects. The performance of these confidence intervals in applications, however, may be seriously hampered by their sensitivity to the specific bandwidth employed. Available bandwidth selectors typically yield a “large” bandwidth, leading to data-driven confidence intervals that may be biased, with empirical coverage well below their nominal target. We propose new theory-based, more robust confidence interval estimators for average treatment effects at the cutoff in sharp RD, sharp kink RD, fuzzy RD, and fuzzy kink RD designs. Our proposed confidence intervals are constructed using a bias-corrected RD estimator together with a novel standard error estimator. For practical implementation, we discuss mean squared error optimal bandwidths, which are by construction not valid for conventional confidence intervals but are valid with our robust approach, and consistent standard error estimators based on our new variance formulas. In a special case of practical interest, our procedure amounts to running a quadratic instead of a linear local regression. More generally, our results give a formal justification to simple inference procedures based on increasing the order of the local polynomial estimator employed. We find in a simulation study that our confidence intervals exhibit close-to-correct empirical coverage and good empirical interval length on average, remarkably improving upon the alternatives available in the literature. All results are readily available in R and STATA using our companion software packages described in Calonico, Cattaneo, and Titiunik (2014d, 2014b).

Study · Moderate

A tutorial on regularized partial correlation networks.

Sacha Epskamp, Eiko I. Fried · Psychological Methods · 2018 · 2,665 citations

Recent years have seen an emergence of network modeling applied to moods, attitudes, and problems in the realm of psychology. In this framework, psychological variables are understood to directly affect each other rather than being caused by an unobserved latent entity. In this tutorial, we introduce the reader to estimating the most popular network model for psychological data: the partial correlation network. We describe how regularization techniques can be used to efficiently estimate a parsimonious and interpretable network structure in psychological data. We show how to perform these analyses in R and demonstrate the method in an empirical example on posttraumatic stress disorder data. In addition, we discuss the effect of the hyperparameter that needs to be manually set by the researcher, how to handle non-normal data, how to determine the required sample size for a network analysis, and provide a checklist with potential solutions for problems that can arise when estimating regularized partial correlation networks.

Systematic Review · Top journal · High evidence score

Causal Inference: A Missing Data Perspective

Peng Ding, Fan Li · Statistical Science · 2018 · 124 citations

Inferring causal effects of treatments is a central goal in many disciplines. The potential outcomes framework is a main statistical approach to causal inference, in which a causal effect is defined as a comparison of the potential outcomes of the same units under different treatment conditions. Because for each unit at most one of the potential outcomes is observed and the rest are missing, causal inference is inherently a missing data problem. Indeed, there is a close analogy in the terminology and the inferential framework between causal inference and missing data. Despite the intrinsic connection between the two subjects, statistical analyses of causal inference and missing data also have marked differences in aims, settings and methods. This article provides a systematic review of causal inference from the missing data perspective. Focusing on ignorable treatment assignment mechanisms, we discuss a wide range of causal inference methods that have analogues in missing data analysis, such as imputation, inverse probability weighting and doubly robust methods. Under each of the three modes of inference—Frequentist, Bayesian and Fisherian randomization—we present the general structure of inference for both finite-sample and super-population estimands, and illustrate via specific examples. We identify open questions to motivate more research to bridge the two fields.

Observational · Moderate

Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models

Yiqing Xu · Political Analysis · 2017 · 893 citations

Difference-in-differences (DID) is commonly used for causal inference in time-series cross-sectional data. It requires the assumption that the average outcomes of treated and control units would have followed parallel paths in the absence of treatment. In this paper, we propose a method that not only relaxes this often-violated assumption, but also unifies the synthetic control method (Abadie, Diamond, and Hainmueller 2010) with linear fixed effects models under a simple framework, of which DID is a special case. It imputes counterfactuals for each treated unit using control group information based on a linear interactive fixed effects model that incorporates unit-specific intercepts interacted with time-varying coefficients. This method has several advantages. First, it allows the treatment to be correlated with unobserved unit and time heterogeneities under reasonable modeling assumptions. Second, it generalizes the synthetic control method to the case of multiple treated units and variable treatment periods, and improves efficiency and interpretability. Third, with a built-in cross-validation procedure, it avoids specification searches and thus is easy to implement. An empirical example of Election Day Registration and voter turnout in the United States is provided.

RCT · High evidence score

Using Machine Learning to Individualize Treatment Effect Estimation: Challenges and Opportunities

Alicia Curth, Richard Peck, Eoin McKinney +2 more · Clinical Pharmacology & Therapeutics · 2023 · 34 citations

The use of data from randomized clinical trials to justify treatment decisions for real-world patients is the current state of the art. It relies on the assumption that average treatment effects from the trial can be extrapolated to patients with personal and/or disease characteristics different from those treated in the trial. Yet, because of heterogeneity of treatment effects between patients and between the trial population and real-world patients, this assumption may not be correct for many patients. Using machine learning to estimate the expected conditional average treatment effect (CATE) in individual patients from observational data offers the potential for more accurate estimation of the expected treatment effects in each patient based on their observed characteristics. In this review, we discuss some of the challenges and opportunities for machine learning to estimate CATE, including ensuring identification assumptions are met, managing covariate shift, and learning without access to the true label of interest. We also discuss the potential applications as well as future work and collaborations needed to further improve identification and utilization of CATE estimates to increase patient benefit.

Observational · High evidence score

An adversarial training framework for mitigating algorithmic biases in clinical machine learning

Jenny Yang, Andrew A. S. Soltan, David W. Eyre +2 more · npj Digital Medicine · 2023 · 140 citations

Machine learning is becoming increasingly prominent in healthcare. Although its benefits are clear, growing attention is being given to how these tools may exacerbate existing biases and disparities. In this study, we introduce an adversarial training framework that is capable of mitigating biases that may have been acquired through data collection. We demonstrate this proposed framework on the real-world task of rapidly predicting COVID-19, and focus on mitigating site-specific (hospital) and demographic (ethnicity) biases. Using the statistical definition of equalized odds, we show that adversarial training improves outcome fairness, while still achieving clinically-effective screening performances (negative predictive values >0.98). We compare our method to previous benchmarks, and perform prospective and external validation across four independent hospital cohorts. Our method can be generalized to any outcomes, models, and definitions of fairness.

RCT · High evidence score

Machine learning approaches to evaluate heterogeneous treatment effects in randomized controlled trials: a scoping review

Kosuke Inoue, Motohiko Adomi, Orestis Efthimiou +7 more · Journal of Clinical Epidemiology · 2024 · 28 citations

Study · Moderate

Recent Developments in the Econometrics of Program Evaluation

Guido W. Imbens, Jeffrey M. Wooldridge · Journal of Economic Literature · 2009 · 4,835 citations

Many empirical questions in economics and other social sciences depend on causal effects of programs or policies. In the last two decades, much research has been done on the econometric and statistical analysis of such causal effects. This recent theoretical literature has built on, and combined features of, earlier work in both the statistics and econometrics literatures. It has by now reached a level of maturity that makes it an important tool in many areas of empirical research in economics, including labor economics, public finance, development economics, industrial organization, and other areas of empirical microeconomics. In this review, we discuss some of the recent developments. We focus primarily on practical issues for empirical researchers, as well as provide a historical overview of the area and give references to more technical research.

Observational · Moderate

Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies

Jens Hainmueller · SSRN Electronic Journal · 2011 · 1,121 citations

RCT · High evidence score

Balancing Unobserved Confounding with a Few Unbiased Ratings in Debiased Recommendations

Haoxuan Li, Yanghao Xiao, Chunyuan Zheng +1 more · 2023 · 33 citations

Recommender systems are seen as an effective tool to address information overload, but it is widely known that the presence of various biases makes direct training on large-scale observational data result in sub-optimal prediction performance. In contrast, unbiased ratings obtained from randomized controlled trials or A/B tests are considered to be the golden standard, but are costly and small in scale in reality. To exploit both types of data, recent works proposed to use unbiased ratings to correct the parameters of the propensity or imputation models trained on the biased dataset. However, the existing methods fail to obtain accurate predictions in the presence of unobserved confounding or model misspecification. In this paper, we propose a theoretically guaranteed model-agnostic balancing approach that can be applied to any existing debiasing method with the aim of combating unobserved confounding and model misspecification. The proposed approach makes full use of unbiased data by alternatively correcting model parameters learned with biased data, and adaptively learning balance coefficients of biased samples for further debiasing. Extensive real-world experiments are conducted along with the deployment of our proposal on four representative debiasing methods to demonstrate the effectiveness.

Study · Moderate

Doubly robust difference-in-differences estimators

Pedro H. C. Sant’Anna, Jun B. Zhao · Journal of Econometrics · 2020 · 924 citations

Study · Moderate

Smoothing Parameter and Model Selection for General Smooth Models

Simon N. Wood, Natalya Pya, Benjamin Säfken · Journal of the American Statistical Association · 2016 · 1,286 citations

This article discusses a general framework for smoothing parameter estimation for models with regular likelihoods constructed in terms of unknown smooth functions of covariates. Gaussian random effects and parametric terms may also be present. By construction the method is numerically stable and convergent, and enables smoothing parameter uncertainty to be quantified. The latter enables us to fix a well known problem with AIC for such models, thereby improving the range of model selection tools available. The smooth functions are represented by reduced rank spline like smoothers, with associated quadratic penalties measuring function smoothness. Model estimation is by penalized likelihood maximization, where the smoothing parameters controlling the extent of penalization are estimated by Laplace approximate marginal likelihood. The methods cover, for example, generalized additive models for nonexponential family responses (e.g., beta, ordered categorical, scaled t distribution, negative binomial and Tweedie distributions), generalized additive models for location scale and shape (e.g., two stage zero inflation models, and Gaussian location-scale models), Cox proportional hazards models and multivariate additive models. The framework reduces the implementation of new model classes to the coding of some standard derivatives of the log-likelihood. Supplementary materials for this article are available online.

Study · Top journal · Moderate

Causal machine learning for predicting treatment outcomes

Stefan Feuerriegel, Dennis Frauen, Valentyn Melnychuk +7 more · Nature Medicine · 2024 · 248 citations

Study · Moderate

Doubly Robust Estimation of Causal Effects

Michele Jönsson Funk, Daniel Westreich, Chris Wiesen +3 more · American Journal of Epidemiology · 2011 · 1,126 citations

Doubly robust estimation combines a form of outcome regression with a model for the exposure (i.e., the propensity score) to estimate the causal effect of an exposure on an outcome. When used individually to estimate a causal effect, both outcome regression and propensity score methods are unbiased only if the statistical model is correctly specified. The doubly robust estimator combines these 2 approaches such that only 1 of the 2 models need be correctly specified to obtain an unbiased effect estimator. In this introduction to doubly robust estimators, the authors present a conceptual overview of doubly robust estimation, a simple worked example, results from a simulation study examining performance of estimated and bootstrapped standard errors, and a discussion of the potential advantages and limitations of this method. The supplementary material for this paper, which is posted on the Journal's Web site (http://aje.oupjournals.org/), includes a demonstration of the doubly robust property (Web Appendix 1) and a description of a SAS macro (SAS Institute, Inc., Cary, North Carolina) for doubly robust estimation, available for download at http://www.unc.edu/~mfunk/dr/.
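
To make the "only 1 of the 2 models need be correctly specified" property concrete, the following is a minimal sketch of the augmented inverse probability weighted (AIPW) estimator, one common doubly robust estimator. It is not the SAS macro described in the abstract. The function name `aipw_ate`, the toy population, and the hand-supplied (rather than fitted) nuisance predictions are all illustrative assumptions.

```python
def aipw_ate(y, a, ps, mu1, mu0):
    """Augmented IPW estimate of the average treatment effect.

    y   : observed outcomes
    a   : treatment indicators (1 = treated, 0 = untreated)
    ps  : propensity scores P(A=1 | X), strictly between 0 and 1
    mu1 : outcome-model predictions of E[Y | A=1, X]
    mu0 : outcome-model predictions of E[Y | A=0, X]
    """
    total = 0.0
    for yi, ai, ei, m1, m0 in zip(y, a, ps, mu1, mu0):
        # Outcome-model prediction plus an inverse-probability-weighted
        # correction based on the residual of the observed outcome.
        treated_part = m1 + ai * (yi - m1) / ei
        control_part = m0 + (1 - ai) * (yi - m0) / (1 - ei)
        total += treated_part - control_part
    return total / len(y)


# Hypothetical toy population with one binary covariate x and a true effect of 2:
# y = 1 + 2*a + 3*x, with P(A=1 | x=0) = 0.25 and P(A=1 | x=1) = 0.75.
x = [0, 0, 0, 0, 1, 1, 1, 1]
a = [1, 0, 0, 0, 1, 1, 1, 0]
y = [1 + 2 * ai + 3 * xi for ai, xi in zip(a, x)]

true_ps = [0.25 if xi == 0 else 0.75 for xi in x]
wrong_ps = [0.5] * len(x)            # ignores x entirely
true_mu1 = [3 + 3 * xi for xi in x]
true_mu0 = [1 + 3 * xi for xi in x]
wrong_mu = [0.0] * len(x)            # a deliberately useless outcome model

outcome_model_right = aipw_ate(y, a, wrong_ps, true_mu1, true_mu0)
propensity_right = aipw_ate(y, a, true_ps, wrong_mu, wrong_mu)
both_wrong = aipw_ate(y, a, wrong_ps, wrong_mu, wrong_mu)
```

In this toy population the first two calls recover the true effect of 2 (up to floating-point rounding) even though one nuisance model is wrong in each, while the third, with both models wrong, returns 3.5.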

Study · Moderate

Variable Selection for Propensity Score Models

M. Alan Brookhart, Sebastian Schneeweiß, Kenneth J. Rothman +3 more · American Journal of Epidemiology · 2006 · 2,306 citations

Despite the growing popularity of propensity score (PS) methods in epidemiology, relatively little has been written in the epidemiologic literature about the problem of variable selection for PS models. The authors present the results of two simulation studies designed to help epidemiologists gain insight into the variable selection problem in a PS analysis. The simulation studies illustrate how the choice of variables that are included in a PS model can affect the bias, variance, and mean squared error of an estimated exposure effect. The results suggest that variables that are unrelated to the exposure but related to the outcome should always be included in a PS model. The inclusion of these variables will decrease the variance of an estimated exposure effect without increasing bias. In contrast, including variables that are related to the exposure but not to the outcome will increase the variance of the estimated exposure effect without decreasing bias. In very small studies, the inclusion of variables that are strongly related to the exposure but only weakly related to the outcome can be detrimental to an estimate in a mean squared error sense. The addition of these variables removes only a small amount of bias but can increase the variance of the estimated exposure effect. These simulation studies and other analytical results suggest that standard model-building tools designed to create good predictive models of the exposure will not always lead to optimal PS models, particularly in small studies.

Study · Moderate

Multivariate Adaptive Regression Splines

Jerome H. Friedman · The Annals of Statistics · 1991 · 8,065 citations

A new method is presented for flexible regression modeling of high dimensional data. The model takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations) are automatically determined by the data. This procedure is motivated by the recursive partitioning approach to regression and shares its attractive properties. Unlike recursive partitioning, however, this method produces continuous models with continuous derivatives. It has more power and flexibility to model relationships that are nearly additive or involve interactions in at most a few variables. In addition, the model can be represented in a form that separately identifies the additive contributions and those associated with the different multivariable interactions.

Study · Moderate

Resting state network estimation in individual subjects

Carl D. Hacker, Timothy O. Laumann, Nicholas Szrama +4 more · NeuroImage · 2013 · 257 citations

Observational · Moderate

Causal Inference for Social Network Data

Elizabeth L. Ogburn, Oleg Sofrygin, Iván Díaz +1 more · Journal of the American Statistical Association · 2022 · 107 citations

We describe semiparametric estimation and inference for causal effects using observational data from a single social network. Our asymptotic results are the first to allow for dependence of each observation on a growing number of other units as sample size increases. In addition, while previous methods have implicitly permitted only one of two possible sources of dependence among social network observations, we allow for both dependence due to transmission of information across network ties and for dependence due to latent similarities among nodes sharing ties. We propose new causal effects that are specifically of interest in social network settings, such as interventions on network ties and network structure. We use our methods to reanalyze an influential and controversial study that estimated causal peer effects of obesity using social network data from the Framingham Heart Study; after accounting for network structure we find no evidence for causal peer effects.

Observational · Moderate

Generalizing Causal Inferences from Individuals in Randomized Trials to All Trial-Eligible Individuals

Issa J. Dahabreh, Sarah E. Robertson, Eric J. Tchetgen +2 more · Biometrics · 2018 · 149 citations

We consider methods for causal inference in randomized trials nested within cohorts of trial-eligible individuals, including those who are not randomized. We show how baseline covariate data from the entire cohort, and treatment and outcome data only from randomized individuals, can be used to identify potential (counterfactual) outcome means and average treatment effects in the target population of all eligible individuals. We review identifiability conditions, propose estimators, and assess the estimators' finite-sample performance in simulation studies. As an illustration, we apply the estimators in a trial nested within a cohort of trial-eligible individuals to compare coronary artery bypass grafting surgery plus medical therapy vs. medical therapy alone for chronic coronary artery disease.
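
The idea of combining covariate data from the whole cohort with treatment and outcome data from randomized individuals can be sketched with a simple plug-in (standardization) calculation. This is an illustration in the spirit of the abstract, not necessarily one of the estimators the authors propose. It assumes a single discrete covariate, that trial participation is as good as random within covariate strata, and that every stratum in the cohort appears in both trial arms. The function name `standardized_mean` and the data are hypothetical.

```python
from collections import defaultdict


def standardized_mean(trial, cohort_x, arm):
    """Mean potential outcome under `arm` in the full trial-eligible cohort.

    trial    : list of (x, a, y) tuples for randomized individuals only
    cohort_x : covariate value of every eligible individual, randomized or not
    arm      : treatment arm of interest (0 or 1)
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, a, y in trial:
        if a == arm:
            sums[x] += y
            counts[x] += 1
    # Stratum-specific outcome means, learned from the trial arm.
    stratum_mean = {x: sums[x] / counts[x] for x in counts}
    # Average those means over the covariate distribution of the whole cohort.
    return sum(stratum_mean[x] for x in cohort_x) / len(cohort_x)


# Hypothetical data: the effect is 1 when x = 0 and 3 when x = 1,
# and the trial over-represents x = 1 relative to the cohort.
trial = [(0, 1, 1.0), (0, 0, 0.0), (1, 1, 5.0), (1, 0, 2.0), (1, 1, 5.0), (1, 0, 2.0)]
cohort_x = [x for x, _, _ in trial] + [0] * 6   # six non-randomized individuals

ate_cohort = standardized_mean(trial, cohort_x, 1) - standardized_mean(trial, cohort_x, 0)

treated = [y for _, a, y in trial if a == 1]
control = [y for _, a, y in trial if a == 0]
ate_trial_only = sum(treated) / len(treated) - sum(control) / len(control)
```

Here the trial-only contrast is 7/3, while standardizing to the cohort (eight individuals with x = 0, four with x = 1) gives 5/3, because the cohort contains more of the low-effect stratum.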

Study · Moderate

Constructing Inverse Probability Weights for Marginal Structural Models

Stephen R. Cole, Miguel A. Hernán · American Journal of Epidemiology · 2008 · 2,673 citations

The method of inverse probability weighting (henceforth, weighting) can be used to adjust for measured confounding and selection bias under the four assumptions of consistency, exchangeability, positivity, and no misspecification of the model used to estimate weights. In recent years, several published estimates of the effect of time-varying exposures have been based on weighted estimation of the parameters of marginal structural models because, unlike standard statistical methods, weighting can appropriately adjust for measured time-varying confounders affected by prior exposure. As an example, the authors describe the last three assumptions using the change in viral load due to initiation of antiretroviral therapy among 918 human immunodeficiency virus-infected US men and women followed for a median of 5.8 years between 1996 and 2005. The authors describe possible tradeoffs that an epidemiologist may encounter when attempting to make inferences. For instance, a tradeoff between bias and precision is illustrated as a function of the extent to which confounding is controlled. Weight truncation is presented as an informal and easily implemented method to deal with these tradeoffs. Inverse probability weighting provides a powerful methodological tool that may uncover causal effects of exposures that are otherwise obscured. However, as with all methods, diagnostics and sensitivity analyses are essential for proper use.
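
As a rough illustration of weight construction and truncation, the sketch below computes stabilized inverse probability weights for a binary exposure measured at a single time point, then truncates them. This is a simplification of the time-varying setting in the abstract, where weights are products over visits. The function names, the example propensity scores, and the truncation bounds are assumptions made for illustration; in practice the bounds are often taken from percentiles of the estimated weights.

```python
def stabilized_weights(a, ps):
    """Stabilized inverse probability weights for a binary point exposure.

    a  : exposure indicators (1 = exposed, 0 = unexposed)
    ps : estimated P(A=1 | X) for each individual, strictly between 0 and 1
    """
    p_exposed = sum(a) / len(a)   # marginal P(A=1), used as the numerator
    weights = []
    for ai, ei in zip(a, ps):
        if ai == 1:
            weights.append(p_exposed / ei)
        else:
            weights.append((1 - p_exposed) / (1 - ei))
    return weights


def truncate(weights, lower, upper):
    """Reset weights below `lower` or above `upper` to those bounds."""
    return [min(max(w, lower), upper) for w in weights]


# Hypothetical example: the first individual was exposed despite a very low
# estimated probability of exposure, which produces a large weight.
a = [1, 0, 0, 0]
ps = [0.05, 0.5, 0.5, 0.5]

raw = stabilized_weights(a, ps)
truncated = truncate(raw, lower=0.1, upper=3.0)
```

The first raw weight is about 5 and is reset to 3.0 by truncation, while the other three weights (1.5 each) are unchanged. Truncation trades some bias for reduced variance, which is the tradeoff the abstract describes.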

Study · Moderate

Cluster-robust inference: A guide to empirical practice

James G. MacKinnon, Morten Ørregaard Nielsen, Matthew D. Webb · Journal of Econometrics · 2022 · 278 citations

Methods for cluster-robust inference are routinely used in economics and many other disciplines. However, it is only recently that theoretical foundations for the use of these methods in many empirically relevant situations have been developed. In this paper, we use these theoretical results to provide a guide to empirical practice. We do not attempt to present a comprehensive survey of the (very large) literature. Instead, we bridge theory and practice by providing a thorough guide on what to do and why, based on recently available econometric theory and simulation evidence. To practice what we preach, we include an empirical analysis of the effects of the minimum wage on labor supply of teenagers using individual data.

Study · Moderate

Deep Neural Networks for Estimation and Inference

Max H. Farrell, Tengyuan Liang, Sanjog Misra · Econometrica · 2021 · 301 citations

We study deep neural networks and their use in semiparametric inference. We establish novel nonasymptotic high probability bounds for deep feedforward neural nets. These deliver rates of convergence that are sufficiently fast (in some cases minimax optimal) to allow us to establish valid second‐step inference after first‐step estimation with deep learning, a result also new to the literature. Our nonasymptotic high probability bounds, and the subsequent semiparametric inference, treat the current standard architecture: fully connected feedforward neural networks (multilayer perceptrons), with the now‐common rectified linear unit activation function, unbounded weights, and a depth explicitly diverging with the sample size. We discuss other architectures as well, including fixed‐width, very deep networks. We establish the nonasymptotic bounds for these deep nets for a general class of nonparametric regression‐type loss functions, which includes as special cases least squares, logistic regression, and other generalized linear models. We then apply our theory to develop semiparametric inference, focusing on causal parameters for concreteness, and demonstrate the effectiveness of deep learning with an empirical application to direct mail marketing.

Study · Moderate

High-Dimensional Methods and Inference on Structural and Treatment Effects

Alexandre Belloni, Victor Chernozhukov, Christian Hansen · The Journal of Economic Perspectives · 2014 · 706 citations

Data with a large number of variables relative to the sample size (“high-dimensional data”) are readily available and increasingly common in empirical economics. High-dimensional data arise through a combination of two phenomena. First, the data may be inherently high dimensional in that many different characteristics per observation are available. For example, the US Census collects information on hundreds of individual characteristics and scanner datasets record transaction-level data for households across a wide range of products. Second, even when the number of available variables is relatively small, researchers rarely know the exact functional form with which the small number of variables enter the model of interest. Researchers are thus faced with a large set of potential variables formed by different ways of interacting and transforming the underlying variables. This paper provides an overview of how innovations in “data mining” can be adapted and modified to provide high-quality inference about model parameters. Note that we use the term “data mining” in a modern sense which denotes a principled search for “true” predictive power that guards against false discovery and overfitting, does not erroneously equate in-sample fit to out-of-sample predictive ability, and accurately accounts for using the same data to examine many different hypotheses or models.

Study · Moderate

Smoothing Parameter and Model Selection for General Smooth Models

Simon N. Wood, Natalya Pya, Benjamin Säfken · Bristol Research (University of Bristol) · 2017 · 534 citations

This article discusses a general framework for smoothing parameter estimation for models with regular likelihoods constructed in terms of unknown smooth functions of covariates. Gaussian random effects and parametric terms may also be present. By construction the method is numerically stable and convergent, and enables smoothing parameter uncertainty to be quantified. The latter enables us to fix a well known problem with AIC for such models, thereby improving the range of model selection tools available. The smooth functions are represented by reduced rank spline like smoothers, with associated quadratic penalties measuring function smoothness. Model estimation is by penalized likelihood maximization, where the smoothing parameters controlling the extent of penalization are estimated by Laplace approximate marginal likelihood. The methods cover, for example, generalized additive models for nonexponential family responses (e.g., beta, ordered categorical, scaled t distribution, negative binomial and Tweedie distributions), generalized additive models for location scale and shape (e.g., two stage zero inflation models, and Gaussian location-scale models), Cox proportional hazards models and multivariate additive models. The framework reduces the implementation of new model classes to the coding of some standard derivatives of the log-likelihood. Supplementary materials for this article are available online.

Observational · Moderate

Semiparametric Proximal Causal Inference

Yifan Cui, Hongming Pu, Xu Shi +2 more · Journal of the American Statistical Association · 2023 · 46 citations

Skepticism about the assumption of no unmeasured confounding, also known as exchangeability, is often warranted in making causal inferences from observational data; because exchangeability hinges on an investigator’s ability to accurately measure covariates that capture all potential sources of confounding. In practice, the most one can hope for is that covariate measurements are at best proxies of the true underlying confounding mechanism operating in a given observational study. In this paper, we consider the framework of proximal causal inference introduced by Miao et al. (2018); Tchetgen Tchetgen et al. (2020), which while explicitly acknowledging covariate measurements as imperfect proxies of confounding mechanisms, offers an opportunity to learn about causal effects in settings where exchangeability on the basis of measured covariates fails. We make a number of contributions to proximal inference including (i) an alternative set of conditions for nonparametric proximal identification of the average treatment effect; (ii) general semiparametric theory for proximal estimation of the average treatment effect including efficiency bounds for key semiparametric models of interest; (iii) a characterization of proximal doubly robust and locally efficient estimators of the average treatment effect. Moreover, we provide analogous identification and efficiency results for the average treatment effect on the treated. Our approach is illustrated via simulation studies and a data application on evaluating the effectiveness of right heart catheterization in the intensive care unit of critically ill patients.

Study · Moderate

Using Stacking to Average Bayesian Predictive Distributions (with Discussion)

Yuling Yao, Aki Vehtari, Daniel Simpson +1 more · Bayesian Analysis · 2018 · 473 citations

Bayesian model averaging is flawed in the M-open setting in which the true data-generating process is not one of the candidate models being fit. We take the idea of stacking from the point estimation literature and generalize to the combination of predictive distributions. We extend the utility function to any proper scoring rule and use Pareto smoothed importance sampling to efficiently compute the required leave-one-out posterior distributions. We compare stacking of predictive distributions to several alternatives: stacking of means, Bayesian model averaging (BMA), Pseudo-BMA, and a variant of Pseudo-BMA that is stabilized using the Bayesian bootstrap. Based on simulations and real-data applications, we recommend stacking of predictive distributions, with bootstrapped-Pseudo-BMA as an approximate alternative when computation cost is an issue.

Observational · Moderate

Introduction to computational causal inference using reproducible Stata, R, and Python code: A tutorial

Matthew J. Smith, Mohammad Alì Mansournia, Camille Maringe +6 more · Statistics in Medicine · 2021 · 72 citations

The main purpose of many medical studies is to estimate the effects of a treatment or exposure on an outcome. However, it is not always possible to randomize the study participants to a particular treatment, therefore observational study designs may be used. There are major challenges with observational studies; one of which is confounding. Controlling for confounding is commonly performed by direct adjustment of measured confounders; although, sometimes this approach is suboptimal due to modeling assumptions and misspecification. Recent advances in the field of causal inference have dealt with confounding by building on classical standardization methods. However, these recent advances have progressed quickly with a relative paucity of computational-oriented applied tutorials contributing to some confusion in the use of these methods among applied researchers. In this tutorial, we show the computational implementation of different causal inference estimators from a historical perspective where new estimators were developed to overcome the limitations of the previous estimators (ie, nonparametric and parametric g-formula, inverse probability weighting, double-robust, and data-adaptive estimators). We illustrate the implementation of different methods using an empirical example from the Connors study based on intensive care medicine, and most importantly, we provide reproducible and commented code in Stata, R, and Python for researchers to adapt in their own observational study. The code can be accessed at https://github.com/migariane/Tutorial_Computational_Causal_Inference_Estimators.

Observational · Moderate

Propensity score weighting for causal inference with multiple treatments

Fan Li, Li Fan · The Annals of Applied Statistics · 2019 · 153 citations

Causal or unconfounded descriptive comparisons between multiple groups are common in observational studies. Motivated from a racial disparity study in health services research, we propose a unified propensity score weighting framework, the balancing weights, for estimating causal effects with multiple treatments. These weights incorporate the generalized propensity scores to balance the weighted covariate distribution of each treatment group, all weighted toward a common prespecified target population. The class of balancing weights include several existing approaches such as the inverse probability weights and trimming weights as special cases. Within this framework, we propose a set of target estimands based on linear contrasts. We further develop the generalized overlap weights, constructed as the product of the inverse probability weights and the harmonic mean of the generalized propensity scores. The generalized overlap weighting scheme corresponds to the target population with the most overlap in covariates across the multiple treatments. These weights are bounded and thus bypass the problem of extreme propensities. We show that the generalized overlap weights minimize the total asymptotic variance of the moment weighting estimators for the pairwise contrasts within the class of balancing weights. We consider two balance check criteria and propose a new sandwich variance estimator for estimating the causal effects with generalized overlap weights. We apply these methods to study the racial disparities in medical expenditure between several racial groups using the 2009 Medical Expenditure Panel Survey (MEPS) data. Simulations were carried out to compare with existing methods.
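
The construction of the generalized overlap weights described above (inverse probability weight times a harmonic-mean-type function of the generalized propensity scores) can be sketched as follows. The function name and the example scores are hypothetical, the generalized propensity scores are supplied by hand rather than estimated, and the tilting function is written only up to a constant factor, which cancels once weights are normalized within each group.

```python
def generalized_overlap_weights(gps, group):
    """Generalized overlap weight for each unit.

    gps   : per-unit lists of generalized propensity scores, one score per
            treatment group, each strictly positive and summing to 1
    group : index of the treatment group each unit actually belongs to
    """
    weights = []
    for scores, j in zip(gps, group):
        # Harmonic-mean-type tilting function: 1 / sum_k (1 / e_k(x)).
        tilt = 1.0 / sum(1.0 / e for e in scores)
        # Inverse probability weight for the observed group, times the tilt.
        weights.append(tilt / scores[j])
    return weights


# Two groups: a unit with e = 0.2 of being in group 1 that is in group 1.
binary = generalized_overlap_weights([[0.8, 0.2]], [1])

# Three groups: a unit in a group it had only a 1% chance of joining.
three_way = generalized_overlap_weights([[0.98, 0.01, 0.01]], [1])
inverse_probability_weight = 1.0 / 0.01
```

With two groups the weight reduces to the probability of the other group (0.8 here), which is the familiar binary overlap weight. In the three-group example the inverse probability weight is 100, whereas the overlap weight stays below 1, illustrating the boundedness noted in the abstract.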

Observational · Moderate

Multiply Robust Causal Inference with Double-Negative Control Adjustment for Categorical Unmeasured Confounding

Xu Shi, Wang Miao, Jennifer C. Nelson +1 more · Journal of the Royal Statistical Society Series B (Statistical Methodology) · 2020 · 101 citations

Unmeasured confounding is a threat to causal inference in observational studies. In recent years, the use of negative controls to mitigate unmeasured confounding has gained increasing recognition and popularity. Negative controls have a long-standing tradition in laboratory sciences and epidemiology to rule out non-causal explanations, although they have been used primarily for bias detection. Recently, Miao and colleagues have described sufficient conditions under which a pair of negative control exposure and outcome variables can be used to identify non-parametrically the average treatment effect (ATE) from observational data subject to uncontrolled confounding. We establish non-parametric identification of the ATE under weaker conditions in the case of categorical unmeasured confounding and negative control variables. We also provide a general semiparametric framework for obtaining inferences about the ATE while leveraging information about a possibly large number of measured covariates. In particular, we derive the semiparametric efficiency bound in the non-parametric model, and we propose multiply robust and locally efficient estimators when non-parametric estimation may not be feasible. We assess the finite sample performance of our methods in extensive simulation studies. Finally, we illustrate our methods with an application to the post-licensure surveillance of vaccine safety among children.

RCT · High evidence score

Estimation and Validation of Ratio-based Conditional Average Treatment Effects Using Observational Data

Steve Yadlowsky, Fabio Pellegrini, Federica Lionetto +2 more · Journal of the American Statistical Association · 2020 · 15 citations

While sample sizes in randomized clinical trials are large enough to estimate the average treatment effect well, they are often insufficient for estimation of treatment-covariate interactions critical to studying data-driven precision medicine. Observational data from real world practice may play an important role in alleviating this problem. One common approach in trials is to predict the outcome of interest with separate regression models in each treatment arm, and estimate the treatment effect based on the contrast of the predictions. Unfortunately, this simple approach may induce spurious treatment-covariate interaction in observational studies when the regression model is misspecified. Motivated by the need of modeling the number of relapses in multiple sclerosis patients, where the ratio of relapse rates is a natural choice of the treatment effect, we propose to estimate the conditional average treatment effect (CATE) as the ratio of expected potential outcomes, and derive a doubly robust estimator of this CATE in a semiparametric model of treatment-covariate interactions. We also provide a validation procedure to check the quality of the estimator on an independent sample. We conduct simulations to demonstrate the finite sample performance of the proposed methods, and illustrate their advantages on real data by examining the treatment effect of dimethyl fumarate compared to teriflunomide in multiple sclerosis patients.

Study · Leading journal · Moderate

The Decoding Toolbox (TDT): a versatile software package for multivariate analyses of functional imaging data

Martin N. Hebart, Kai Görgen, John-Dylan Haynes · Frontiers in Neuroinformatics · 2015 · 508 citations

The multivariate analysis of brain signals has recently sparked a great amount of interest, yet accessible and versatile tools to carry out decoding analyses are scarce. Here we introduce The Decoding Toolbox (TDT) which represents a user-friendly, powerful and flexible package for multivariate analysis of functional brain imaging data. TDT is written in Matlab and equipped with an interface to the widely used brain data analysis package SPM. The toolbox allows running fast whole-brain analyses, region-of-interest analyses and searchlight analyses, using machine learning classifiers, pattern correlation analysis, or representational similarity analysis. It offers automatic creation and visualization of diverse cross-validation schemes, feature scaling, nested parameter selection, a variety of feature selection methods, multiclass capabilities, and pattern reconstruction from classifier weights. While basic users can implement a generic analysis in one line of code, advanced users can extend the toolbox to their needs or exploit the structure to combine it with external high-performance classification toolboxes. The toolbox comes with an example data set which can be used to try out the various analysis methods. Taken together, TDT offers a promising option for researchers who want to employ multivariate analyses of brain activity patterns.

Observational · Moderate

Maximum Likelihood, Profile Likelihood, and Penalized Likelihood: A Primer

Stephen R. Cole, Haitao Chu, Sander Greenland · American Journal of Epidemiology · 2013 · 224 citations

The method of maximum likelihood is widely used in epidemiology, yet many epidemiologists receive little or no education in the conceptual underpinnings of the approach. Here we provide a primer on maximum likelihood and some important extensions which have proven useful in epidemiologic research, and which reveal connections between maximum likelihood and Bayesian methods. For a given data set and probability model, maximum likelihood finds values of the model parameters that give the observed data the highest probability. As with all inferential statistical methods, maximum likelihood is based on an assumed model and cannot account for bias sources that are not controlled by the model or the study design. Maximum likelihood is nonetheless popular, because it is computationally straightforward and intuitive and because maximum likelihood estimators have desirable large-sample properties in the (largely fictitious) case in which the model has been correctly specified. Here, we work through an example to illustrate the mechanics of maximum likelihood estimation and indicate how improvements can be made easily with commercial software. We then describe recent extensions and generalizations which are better suited to observational health research and which should arguably replace standard maximum likelihood as the default method.
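
The statement that maximum likelihood "finds values of the model parameters that give the observed data the highest probability" can be illustrated with a minimal sketch for a binomial model. The data (7 events among 20 people) and the grid search are assumptions chosen for transparency. This is not the worked example from the paper, and real software would use a numerical optimizer rather than a grid.

```python
import math


def binomial_log_likelihood(p, events, n):
    """Log-likelihood of risk p given `events` events among `n` people.

    The binomial coefficient is omitted because it does not depend on p
    and so does not change where the maximum occurs.
    """
    return events * math.log(p) + (n - events) * math.log(1 - p)


events, n = 7, 20   # hypothetical data

# Evaluate the log-likelihood on a fine grid of candidate risks in (0, 1)
# and keep the candidate under which the observed data are most probable.
grid = [i / 1000 for i in range(1, 1000)]
mle = max(grid, key=lambda p: binomial_log_likelihood(p, events, n))
```

The grid maximum lands on 0.35, which matches the closed-form binomial maximum likelihood estimate, the observed proportion 7/20.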

Study · Leading journal · Moderate

Causal Inference in Multisensory Perception

Konrad P. Körding, Ulrik Beierholm, Wei Ji +3 more · PLoS ONE · 2007 · 1,131 citations

Perceptual events derive their significance to an animal from their meaning about the world, that is from the information they carry about their causes. The brain should thus be able to efficiently infer the causes underlying our sensory events. Here we use multisensory cue combination to study causal inference in perception. We formulate an ideal-observer model that infers whether two sensory cues originate from the same location and that also estimates their location(s). This model accurately predicts the nonlinear integration of cues by human subjects in two auditory-visual localization tasks. The results show that indeed humans can efficiently infer the causal structure as well as the location of causes. By combining insights from the study of causal inference with the ideal-observer approach to sensory cue combination, we show that the capacity to infer causal structure is not limited to conscious, high-level cognition; it is also performed continually and effortlessly in perception.

Study · Leading journal · Moderate

Applying the Model-Comparison Approach to Test Specific Research Hypotheses in Psychophysical Research Using the Palamedes Toolbox

Nicolaas Prins, Frederick A. A. Kingdom · Frontiers in Psychology · 2018 · 476 citations

In the social sciences it is common practice to test specific theoretically motivated research hypotheses using formal statistical procedures. Typically, students in these disciplines are trained in such methods starting at an early stage in their academic tenure. On the other hand, in psychophysical research, where parameter estimates are generally obtained using a maximum-likelihood (ML) criterion and data do not lend themselves well to the least-squares methods taught in introductory courses, it is relatively uncommon to see formal model comparisons performed. Rather, it is common practice to estimate the parameters of interest (e.g., detection thresholds) and their standard errors individually across the different experimental conditions and to 'eyeball' whether the observed pattern of parameter estimates supports or contradicts some proposed hypothesis. We believe that this is at least in part due to a lack of training in the proper methodology as well as a lack of available software to perform such model comparisons when ML estimators are used. We introduce here a relatively new toolbox of Matlab routines called Palamedes which allows users to perform sophisticated model comparisons. In Palamedes, we implement the model-comparison approach to hypothesis testing. This approach allows researchers considerable flexibility in targeting specific research hypotheses. We discuss in a non-technical manner how this method can be used to perform statistical model comparisons when ML estimators are used. With Palamedes we hope to make sophisticated statistical model comparisons available to researchers who may not have the statistical background or the programming skills to perform such model comparisons from scratch. Note that while Palamedes is specifically geared toward psychophysical data, the core ideas behind the model-comparison approach that our paper discusses generalize to any field in which statistical hypotheses are tested.
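To illustrate the model-comparison idea in general terms, the sketch below compares a "lesser" model (one shared proportion correct) against a "fuller" model (a separate proportion per condition) using a likelihood-ratio test. It is plain Python and does not use or imitate the Palamedes API. The trial counts are hypothetical, and the chi-square reference distribution is a large-sample approximation chosen for brevity.

```python
import math

def binomial_log_likelihood(p, n_correct, n_trials):
    """Binomial log-likelihood, dropping the constant that does not depend on p."""
    return n_correct * math.log(p) + (n_trials - n_correct) * math.log(1.0 - p)

# Hypothetical data: (number correct, number of trials) in two conditions.
conditions = [(70, 100), (85, 100)]

# Fuller model: each condition has its own ML estimate of proportion correct.
ll_fuller = sum(binomial_log_likelihood(k / n, k, n) for k, n in conditions)

# Lesser model: a single proportion shared by both conditions.
total_correct = sum(k for k, _ in conditions)
total_trials = sum(n for _, n in conditions)
p_shared = total_correct / total_trials
ll_lesser = sum(binomial_log_likelihood(p_shared, k, n) for k, n in conditions)

# Likelihood-ratio statistic. The models differ by one free parameter, so the
# large-sample reference distribution is chi-square with 1 degree of freedom.
lr_stat = 2.0 * (ll_fuller - ll_lesser)

# For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

print(f"LR statistic: {lr_stat:.3f}, approximate p-value: {p_value:.4f}")
```

The same pattern extends to richer models such as psychometric functions with shared or separate thresholds. Only the log-likelihood function and the count of free parameters change.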

Observational · Moderate

Semiparametric theory for causal mediation analysis: Efficiency bounds, multiple robustness and sensitivity analysis

Eric J. Tchetgen Tchetgen, Ilya Shpitser · The Annals of Statistics · 2012 · 288 citations

Whilst estimation of the marginal (total) causal effect of a point exposure on an outcome is arguably the most common objective of experimental and observational studies in the health and social sciences, in recent years, investigators have also become increasingly interested in mediation analysis. Specifically, upon evaluating the total effect of the exposure, investigators routinely wish to make inferences about the direct or indirect pathways of the effect of the exposure not through or through a mediator variable that occurs subsequently to the exposure and prior to the outcome. Although powerful semiparametric methodologies have been developed to analyze observational studies, that produce double robust and highly efficient estimates of the marginal total causal effect, similar methods for mediation analysis are currently lacking. Thus, this paper develops a general semiparametric framework for obtaining inferences about so-called marginal natural direct and indirect causal effects, while appropriately accounting for a large number of pre-exposure confounding factors for the exposure and the mediator variables. Our analytic framework is particularly appealing, because it gives new insights on issues of efficiency and robustness in the context of mediation analysis. In particular, we propose new multiply robust locally efficient estimators of the marginal natural indirect and direct causal effects, and develop a novel double robust sensitivity analysis framework for the assumption of ignorability of the mediator variable.

Study · Moderate

Locally Robust Semiparametric Estimation

Victor Chernozhukov, Juan Carlos Escanciano, Hidehiko Ichimura +2 more · Econometrica · 2022 · 122 citations

Many economic and causal parameters depend on nonparametric or high dimensional first steps. We give a general construction of locally robust/orthogonal moment functions for GMM, where first steps have no effect, locally, on average moment functions. Using these orthogonal moments reduces model selection and regularization bias, as is important in many applications, especially for machine learning first steps. Also, associated standard errors are robust to misspecification when there is the same number of moment functions as parameters of interest. We use these orthogonal moments and cross‐fitting to construct debiased machine learning estimators of functions of high dimensional conditional quantiles and of dynamic discrete choice parameters with high dimensional state variables. We show that additional first steps needed for the orthogonal moment functions have no effect, globally, on average orthogonal moment functions. We give a general approach to estimating those additional first steps. We characterize double robustness and give a variety of new doubly robust moment functions. We give general and simple regularity conditions for asymptotic theory.
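One familiar instance of a doubly robust, orthogonal moment combined with cross-fitting is the augmented inverse-probability-weighted (AIPW) score for an average treatment effect. The sketch below is not the paper's general construction. It assumes a single binary covariate, uses simple cell means as the "first step" estimators, and runs on simulated data with a true effect of 2.0.

```python
import random

def cell_mean(values, default):
    """Mean of a list, falling back to `default` when the cell is empty."""
    return sum(values) / len(values) if values else default

def fit_nuisances(rows):
    """First steps: e(x) = P(D=1 | X=x) and m_d(x) = E[Y | D=d, X=x], by cell means."""
    e, m = {}, {}
    for x in (0, 1):
        cell = [r for r in rows if r[0] == x]
        e[x] = cell_mean([r[1] for r in cell], 0.5)
        for d in (0, 1):
            m[(d, x)] = cell_mean([r[2] for r in cell if r[1] == d], 0.0)
    return e, m

def aipw_score(row, e, m):
    """Doubly robust score: outcome-model contrast plus weighted residual corrections."""
    x, d, y = row
    m1, m0 = m[(1, x)], m[(0, x)]
    return (m1 - m0
            + d * (y - m1) / e[x]
            - (1 - d) * (y - m0) / (1.0 - e[x]))

def cross_fit_ate(rows, n_folds=2):
    """Fit nuisances on the other folds, evaluate scores on the held-out fold."""
    folds = [rows[k::n_folds] for k in range(n_folds)]
    scores = []
    for k, held_out in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != k for r in fold]
        e, m = fit_nuisances(train)
        scores.extend(aipw_score(r, e, m) for r in held_out)
    n = len(scores)
    ate = sum(scores) / n
    variance = sum((s - ate) ** 2 for s in scores) / (n - 1)
    return ate, (variance / n) ** 0.5

# Simulated data (hypothetical): X confounds treatment D and outcome Y.
random.seed(0)
rows = []
for _ in range(4000):
    x = 1 if random.random() < 0.5 else 0
    d = 1 if random.random() < (0.7 if x else 0.3) else 0
    y = 2.0 * d + 1.5 * x + random.gauss(0.0, 1.0)
    rows.append((x, d, y))

ate, se = cross_fit_ate(rows)
naive = (cell_mean([y for _, d, y in rows if d == 1], 0.0)
         - cell_mean([y for _, d, y in rows if d == 0], 0.0))
print(f"cross-fitted AIPW: {ate:.3f} (SE {se:.3f}), naive difference: {naive:.3f}")
```

In applied work the cell means would be replaced by machine-learning first steps. Cross-fitting matters there because it keeps each observation's score from depending on nuisance estimates fitted to that same observation.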

Study · Moderate

Machine learning in agricultural and applied economics

Hugo Storm, Kathy Baylis, Thomas Heckelei · European Review of Agricultural Economics · 2019 · 211 citations

This review presents machine learning (ML) approaches from an applied economist’s perspective. We first introduce the key ML methods drawing connections to econometric practice. We then identify current limitations of the econometric and simulation model toolbox in applied economics and explore potential solutions afforded by ML. We dive into cases such as inflexible functional forms, unstructured data sources and large numbers of explanatory variables in both prediction and causal analysis, and highlight the challenges of complex simulation models. Finally, we argue that economists have a vital role in addressing the shortcomings of ML when used for quantitative economic analysis.

RCT · High evidence score

Generalizing Trial Evidence to Target Populations in Non-Nested Designs: Applications to AIDS Clinical Trials

Fan Li, Ashley Buchanan, Stephen R. Cole · Journal of the Royal Statistical Society Series C (Applied Statistics) · 2022 · 12 citations

Comparative effectiveness evidence from randomized trials may not be directly generalizable to a target population of substantive interest when, as in most cases, trial participants are not randomly sampled from the target population. Motivated by the need to generalize evidence from two trials conducted in the AIDS Clinical Trials Group (ACTG), we consider weighting, regression and doubly robust estimators to estimate the causal effects of HIV interventions in a specified population of people living with HIV in the USA. We focus on a non-nested trial design and discuss strategies for both point and variance estimation of the target population average treatment effect. Specifically in the generalizability context, we demonstrate both analytically and empirically that estimating the known propensity score in trials does not increase the variance for each of the weighting, regression and doubly robust estimators. We apply these methods to generalize the average treatment effects from two ACTG trials to specified target populations and operationalize key practical considerations. Finally, we report on a simulation study that investigates the finite-sample operating characteristics of the generalizability estimators and their sandwich variance estimators.
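The weighting idea for a non-nested design can be sketched with inverse odds of trial participation: each trial participant is weighted by the odds of belonging to the target sample given covariates, so the reweighted trial resembles the target population. The example below is a toy illustration with one binary covariate and noiseless, hypothetical data. It is not the authors' implementation and omits their regression, doubly robust, and sandwich-variance components.

```python
def inverse_odds_weights(trial_x, target_x):
    """Odds of target-sample membership given X, estimated from stacked cell counts.

    In the stacked data, P(S=0 | X=x) / P(S=1 | X=x) reduces to
    n_target(x) / n_trial(x), where S=1 marks trial participants.
    """
    return {x: target_x.count(x) / trial_x.count(x) for x in set(trial_x)}

def weighted_mean(pairs):
    """Weighted mean of (value, weight) pairs (a Hajek-type normalized estimator)."""
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)

def difference_in_means(rows, weight_of):
    """Treated-minus-control difference in weighted mean outcomes."""
    treated = [(y, weight_of(x)) for x, a, y in rows if a == 1]
    control = [(y, weight_of(x)) for x, a, y in rows if a == 0]
    return weighted_mean(treated) - weighted_mean(control)

# Hypothetical trial rows (x, treatment a, outcome y): the effect is 1.0 when
# x = 0 and 3.0 when x = 1, and the trial mostly enrolls x = 0.
trial = ([(0, 1, 1.0)] * 40 + [(0, 0, 0.0)] * 40
         + [(1, 1, 3.5)] * 10 + [(1, 0, 0.5)] * 10)

# Hypothetical target sample: covariates only, mostly x = 1.
target_x = [0] * 20 + [1] * 80

odds = inverse_odds_weights([x for x, _, _ in trial], target_x)

trial_ate = difference_in_means(trial, lambda x: 1.0)       # unweighted
target_ate = difference_in_means(trial, lambda x: odds[x])  # reweighted to target

print(f"trial-sample ATE: {trial_ate:.2f}, target-population ATE: {target_ate:.2f}")
```

Because the effect differs by covariate level and the two samples have different covariate mixes, the unweighted trial estimate and the reweighted estimate diverge. That gap is the generalizability problem the abstract describes.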