Study · Preprint · Wiki · Canonical · Moderate
Double/Debiased Machine Learning for Treatment and Causal Parameters
Victor Chernozhukov, Denis Chetverikov, Mert Demirer +4 more · 2016 · 154 citations
Most modern supervised statistical/machine learning (ML) methods are explicitly designed to solve prediction problems very well. Achieving this goal does not imply that these methods automatically deliver good estimators of causal parameters. Examples of such parameters include individual regression coefficients, average treatment effects, average lifts, and demand or supply elasticities. In fact, estimates of such causal parameters obtained by naively plugging ML estimators into the estimating equations for these parameters can behave very poorly because of regularization bias. Fortunately, this regularization bias can be removed by solving auxiliary prediction problems via ML tools. Specifically, we can form an orthogonal score for the target low-dimensional parameter by combining auxiliary and main ML predictions. The score is then used to build a de-biased estimator of the target parameter, which typically converges at the fastest possible $1/\sqrt{n}$ rate and is approximately unbiased and normal, and from which valid confidence intervals for the parameters of interest may be constructed. The resulting method could thus be called a "double ML" method because it relies on estimating primary and auxiliary predictive models. To avoid overfitting, our construction also uses K-fold sample splitting, which we call cross-fitting. This allows us to use a very broad set of ML predictive methods in solving the auxiliary and main prediction problems, such as random forest, lasso, ridge, deep neural nets, boosted trees, as well as various hybrids and aggregators of these methods.
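The cross-fitting idea in this abstract can be illustrated with a minimal sketch. It assumes a partially linear model and the residual-on-residual (partialling-out) form of the orthogonal score, neither of which is spelled out in the abstract. Ridge regression stands in for an arbitrary ML learner, and the function names, the simulated data, and the true effect of 2.0 are all hypothetical.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Fit ridge regression with an intercept and predict on X_test.

    Stand-in (assumption) for any ML learner such as lasso or a forest.
    """
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    gram = Xc.T @ Xc + alpha * np.eye(Xc.shape[1])
    beta = np.linalg.solve(gram, Xc.T @ (y_train - y_mean))
    return y_mean + (X_test - x_mean) @ beta


def dml_partially_linear(y, d, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + e."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Main prediction problem: E[y | X]; auxiliary problem: E[d | X].
        # Both are fitted on the other folds and evaluated on the held-out fold.
        y_res[test_idx] = y[test_idx] - ridge_fit_predict(X[train], y[train], X[test_idx])
        d_res[test_idx] = d[test_idx] - ridge_fit_predict(X[train], d[train], X[test_idx])
    # Solve the orthogonal moment condition: regress y-residuals on d-residuals.
    theta = np.sum(d_res * y_res) / np.sum(d_res ** 2)
    # Plug-in standard error from the same score.
    eps = y_res - theta * d_res
    se = np.sqrt(np.mean((d_res * eps) ** 2) / np.mean(d_res ** 2) ** 2 / n)
    return theta, se


# Simulated data (hypothetical): the true effect of d on y is 2.0.
rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 5))
d = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
y = 2.0 * d + X[:, 0] - X[:, 2] + rng.normal(size=n)

theta_hat, se_hat = dml_partially_linear(y, d, X)
print(f"theta_hat = {theta_hat:.3f}, 95% CI half-width = {1.96 * se_hat:.3f}")
```

Each observation's residuals come from models that never saw that observation, which is the role cross-fitting plays in guarding against overfitting.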
Study · Preprint · Moderate
Sparse Latent Class Analysis: Post-Estimation Refinement via Item-level Pseudo-Likelihood
Yuxuan Xu, Lea Kaufmann, Yunxiao Chen +2 more · 2026
Latent Class Analysis (LCA) is widely used to identify unobserved subgroups in social and behavioural sciences. A long-standing challenge for LCA is the interpretability of the latent classes, due to the high complexity of the estimated item response probability matrix. To address this, we propose a computationally efficient post-estimation refinement procedure that enhances model interpretability by a sparse model estimate. The method begins by estimating a classical, unrestricted, latent class model and determining the number of classes using the Bayesian information criterion (BIC). It is followed by a refinement step that further performs model selection on the item-specific response probabilities based on the initial estimate. This refinement penalises the number of distinct response probability levels per item, collapsing redundant levels to yield a sparse matrix that is significantly easier to interpret than those produced by classical LCA. We provide asymptotic theory showing that the proposed procedure consistently recovers the sparse pattern of the item response probabilities for each item, and further validate its performance through extensive simulations. The practical power of the proposed method is further illustrated via an application to survey data on social role performance, where it provides a parsimonious and clear characterisation of the resulting latent classes. The code for implementing the proposed method is publicly available at https://github.com/florence07/Sparse-LCA-Refinement.
Study · Preprint · Wiki · Moderate
Selecting Informative Conformal Prediction Sets with an Optimized FCR-Controlled Approach
Israela Solomon, Etienne Roquain, Saharon Rosset +1 more · 2026
Conformal methods provide prediction sets for outcomes with confidence guarantees. We study their use in a selective inference setting, where inference is performed only when the prediction set is informative. The analyst may consider as informative, for example, cases with prediction sets that are sufficiently small, exclude null values, or satisfy other appropriate monotone constraints. Because inference is typically restricted to informative cases in practical applications, accounting for the resulting selection bias is crucial to maintaining false coverage rate (FCR) control. A general framework for constructing such informative conformal prediction sets while controlling the FCR on the selected sample was suggested in Gazin et al. (2025). In this work we focus on oracle-guided procedures. We derive the optimal decision policy under a suitable power objective in the oracle setting where the probability of belonging to each prediction set can be computed. In practice, of course, only estimated probabilities are available. We therefore introduce a calibration procedure that adjusts the oracle policy to maintain finite sample FCR control. We show that this approach can achieve substantially higher power than available alternatives. We demonstrate the effectiveness of our new methods for classification outcomes on both real and simulated data.
Study · Preprint · Moderate
Topics in Nonparametric Bayesian Statistics
Nils Lid Hjort · 2026
The intersection set of Bayesian and nonparametric statistics was almost empty until about 1973, but now is growing at a healthy rate. This chapter, for the Highly Structured Stochastic Systems book (Oxford University Press, 2003), gives an overview of various theoretical and applied research themes inside this field, partly complementing and extending recent reviews of Dey, Müller and Sinha (1998) and Walker, Damien, Laud and Smith (1999). The intention is not to be complete or exhaustive, but rather to touch on research areas of interest, partly by example.
Study · Preprint · Wiki · Moderate
New Confidence Regions for Linear Regression Parameters with Stationary-Ergodic Dependent Errors
Mous-Abou Hamadou, Martial Longla, Mathias Nthiani Muia +1 more · 2026
We develop joint confidence regions for linear regression coefficients when the regressors and errors are jointly stationary and ergodic with unspecified serial dependence. The method applies random smoothing, using an independent auxiliary sample and shrinking bandwidth, to a vector of regression and second-moment statistics. Under stationarity, ergodicity, and finite second moments, the estimator is asymptotically normal and yields Wald confidence regions and simultaneous confidence intervals without direct long-run variance estimation or a parametric dependence model. For implementation, we introduce a scaled estimator with data-driven bandwidth selection and a mild truncation that improves finite-sample stability. Simulations under ARMA, ARFIMA, copula-based Markov errors, and fractional Gaussian noise, with Gaussian and heavy-tailed margins, show near-nominal coverage and competitive region volumes relative to Newey-West HAC and MAC. A winter Beijing PM2.5 application illustrates the procedure. Keywords: Random smoothing, Joint inference, Confidence regions, Dependent errors, Long memory, Regression inference
Study · Preprint · Moderate
Conditional Predictive Inference for General Structured Data with Group Symmetries
Yichen Shen, Mengxin Yu · 2026
We study distribution-free predictive inference for data with group symmetries, aiming to establish near-conditional coverage guarantees beyond exchangeability for structured data. While many predictive inference methods achieve a target coverage level, most provide marginal coverage. In practice, conditional predictive inference is often preferred, as it quantifies uncertainty for black-box predictions given observed attributes, thereby accommodating heterogeneity. Although many efforts have pursued efficient conditional coverage, existing methods rely on the i.i.d. or exchangeable assumption, often violated in structured settings such as networks, clusters, and imaging data. Recently, SymmPI introduced a unified approach to predictive inference under group symmetries beyond exchangeability; nevertheless, its guarantees remain marginal and do not account for population heterogeneity. To bridge this gap, we introduce C-SymmPI, a framework that achieves near-conditional coverage under general data structures with group symmetries, extending beyond exchangeability to cover networks, cluster-level data, and related structures. Inspired by relaxed multi-accuracy, our approach reformulates conditional coverage as miscoverage error over a user-specified function class. We establish theoretical guarantees under distributional invariance and distribution shift, and derive convergence rates for linear and RKHS function classes, recovering state-of-the-art results in the exchangeable setting as special cases. For computational efficiency, we develop two variants: a projection-based algorithm for high-dimensional observations, and a sampling-based algorithm for large or infinite groups. We demonstrate effectiveness on hierarchical and network data. Empirical results show that C-SymmPI delivers more informative and stable conditional coverage with improved accuracy compared to existing methods.
Study · Preprint · Wiki · Moderate
A Scalable Parametric Item Calibration Engine (SPICE) for Explanatory IRT with Sparse Data
Steven W. Nydick, Manqian Liao, J. R. Lockwood · 2026
We describe a Bayesian multidimensional explanatory IRT model, an associated Markov chain Monte Carlo (MCMC) estimation procedure, and the corresponding calibration software, designed for psychometric analyses of large numbers of sparsely linked persons and items. Such data structures can arise, for example, from adaptive assessments using large banks of automatically generated items, with individual test takers receiving a very small proportion of the entire bank. We discuss how our choices for model specification, data structures, and algorithm implementation combine to create a scalable method for explanatory IRT that can support a variety of psychometric operations with sparse data.
Study · Preprint · Wiki · Moderate
Everywhere Valid Bounds on False Discovery Proportions in Conformal Inference
Ziang Song, Ying Jin, Emmanuel J. Candès · 2026
Modern applications of conformal inference to multiple testing problems, such as outlier detection and candidate selection, often involve selecting test samples whose conformal p-values fall below a threshold. The quality of such methods is often measured by the false discovery proportion (FDP), defined as the fraction of incorrect selections. Existing approaches typically control the expected value of the FDP, using methods such as the Benjamini-Hochberg procedure. This approach fails to provide high-probability bounds on the realized false discovery proportion and invalidates statistical guarantees if the rejection threshold is selected after inspecting the data. This paper establishes finite-sample, distribution-free upper bounds on the FDP that hold simultaneously over all possible rejection thresholds, enabling arbitrary post hoc selection of the threshold. Simultaneous validity is achieved by constructing a high-probability envelope for the empirical distribution function of null conformal p-values by sampling from their joint distribution. Furthermore, our framework allows practitioners to modulate the envelope's shape, thereby producing tight bounds in rejection regions of primary interest. We use this flexible approach to derive simultaneous FDP upper bounds for both outlier detection and conformal selection. We demonstrate through synthetic and real-data experiments that the resulting bounds are both valid and substantially less conservative than those derived from existing approaches.
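The baseline this abstract contrasts itself with (conformal p-values thresholded by the Benjamini-Hochberg procedure) can be sketched as follows. This is not the paper's simultaneous FDP bound. It assumes higher nonconformity scores mean "more outlying" and that the calibration scores come from null samples; the function names are hypothetical.

```python
import numpy as np


def conformal_p_values(calib_scores, test_scores):
    """p_j = (1 + #{calibration scores >= test score j}) / (n + 1)."""
    calib = np.sort(np.asarray(calib_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    n = len(calib)
    # searchsorted(..., side="left") counts calibration scores strictly below
    # each test score, so n minus that is the count at or above it.
    n_ge = n - np.searchsorted(calib, test, side="left")
    return (1 + n_ge) / (n + 1)


def benjamini_hochberg(p_values, alpha=0.1):
    """Boolean mask of rejections from the BH step-up rule at level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest sorted index passing the test
        rejected[order[: k + 1]] = True
    return rejected


def false_discovery_proportion(rejected, is_null):
    """Fraction of rejections that are true nulls (0 if nothing is rejected)."""
    rejected = np.asarray(rejected, dtype=bool)
    is_null = np.asarray(is_null, dtype=bool)
    return np.sum(rejected & is_null) / max(1, np.sum(rejected))
```

BH controls the expectation of the quantity returned by `false_discovery_proportion`; the realized value on one dataset can exceed the nominal level, which is the gap the abstract addresses.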
Study · Preprint · Wiki · Moderate
Distribution-free root cause analysis
Rohan Hore, Aaditya Ramdas · 2026
We study distribution-free root cause analysis in multi-stream data, where an evolving underlying system is observed through multiple data streams that may each undergo distributional changes at unknown timepoints. In such settings, the stream exhibiting the earliest change provides a natural starting point for investigating the underlying cause, which we refer to as the root-cause index. Leveraging conformal $p$-values, we propose a novel framework, Conformal Root Cause Analysis (CROC), which constructs finite-sample valid confidence sets for the root-cause index under minimal assumptions: the data streams are independent, and within each stream the pre- and post-change observations are sampled exchangeably from arbitrary and unknown distributions. We further establish a universality property, showing that any distribution-free method for root cause localization can be represented within the CROC framework. In addition, under mild regularity conditions and principled score design, our method yields asymptotically sharp confidence sets that efficiently isolate the root cause. We further extend CROC to efficiently handle cross-stream dependence when present. Extensive simulations demonstrate accurate localization of the root stream, supporting our theoretical guarantees.
Study · Preprint · Moderate
Ranking with Confidence: A Probabilistic Framework for Deterministic Ranking Methods
Shunpu Zhang · 2026
Rankings are central to decision-making in fields ranging from education to online platforms, yet classical deterministic methods such as the Borda count method or Copeland-type pairwise methods ignore uncertainty due to sampling noise or incomplete data. We propose a probabilistic framework that treats true ranks as latent random variables, enabling quantification of ranking uncertainty. We introduce new ranking criteria based on pairwise dominance probabilities, derive approximate inference procedures, and provide a novel Worst Best rank method to construct simultaneous and individual confidence intervals for ranks. Our approach is the first to provide formal uncertainty quantification for classical deterministic rankings. It is inherently robust to missing data: unlike Copeland type methods, which penalize entities with fewer observed comparisons by assigning them fewer wins, our pairwise probability model adjusts for incompleteness, eliminating bias toward items with more complete records. The resulting rankings reflect underlying performance rather than data availability, enhancing fairness, transparency, and statistical reliability in high-stakes applications.
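The missing-data point can be illustrated with the two classical deterministic scores named in the abstract, computed from a pairwise win-count matrix. This sketch does not implement the paper's probabilistic framework or its Worst Best rank method. The win matrix is a hypothetical example, and the Borda score is taken in its pairwise form (total wins).

```python
def copeland_scores(wins):
    """Copeland score: (# opponents beaten head-to-head) minus (# lost to).

    wins[i][j] is the number of observed comparisons in which i beat j.
    Unobserved pairs (0 vs 0) contribute nothing.
    """
    n = len(wins)
    scores = []
    for i in range(n):
        score = 0
        for j in range(n):
            if i == j:
                continue
            if wins[i][j] > wins[j][i]:
                score += 1
            elif wins[i][j] < wins[j][i]:
                score -= 1
        scores.append(score)
    return scores


def borda_scores(wins):
    """Pairwise Borda score: total number of wins against all opponents."""
    n = len(wins)
    return [sum(wins[i][j] for j in range(n) if j != i) for i in range(n)]


# Hypothetical data for items A, B, C, D (indices 0..3).
# A beat B twice and C twice; D was only ever compared with A and won both times.
wins = [
    [0, 2, 2, 0],  # A
    [0, 0, 0, 0],  # B
    [0, 0, 0, 0],  # C
    [2, 0, 0, 0],  # D
]

print(copeland_scores(wins))  # [1, -1, -1, 1]
print(borda_scores(wins))     # [4, 0, 0, 2]
```

D is undefeated and beat A in every meeting, yet it only ties A under Copeland and ranks below A under Borda, because it has fewer recorded comparisons.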
Study · Preprint · Wiki · Moderate
Optimal Sampling for Kernel Quadrature on Unbounded Domains
Edoardo Bandoni, Christian Robert, Julien Stoehr · 2026
Kernel quadrature is widely used to approximate integrals of smooth functions, with worst-case error typically decaying at the minimax rate $n^{-α/d}$ for smoothness $α$ in dimension $d$. Existing rate-optimal methods often depend on deterministic point sets tailored to a specific kernel, making them sensitive to misspecification and less robust in practice. In this work, we study randomized quadrature methods with a focus on robustness rather than kernel-specific optimality. We construct an explicit, $n$-dependent sampling distribution that achieves minimax rates for worst-case error over smoothness classes without requiring knowledge of the kernel. This kernel-agnostic design improves robustness while retaining optimal rates. Our analysis includes unbounded sampling measures such as Gaussian and Student-$t$ distributions, extending beyond compact domains. The results provide both theoretical guarantees and a practical recipe for robust, rate-optimal randomized quadrature.
Study · Preprint · Moderate
Bayesian Nonparametrics: Principles and Practice
Nils Lid Hjort, Chris Holmes, Peter Mueller +1 more · 2026
This extended preface [to the book 'Bayesian Nonparametrics', Cambridge University Press, 2010, by NL Hjort, CC Holmes, P Mueller, SG Walker] is meant to explain why you are right to be curious about Bayesian nonparametrics -- why you may actually need it and how you can manage to understand it and use it. The preface also serves as an introductory chapter, giving an overview of the aims and contents of the book. We also explain the background for how the book came into existence, delve briefly into the history of the still relatively young field of Bayesian nonparametrics, and offer some concluding remarks pertaining to various challenges and likely future developments of the area.
Study · Preprint · Wiki · Moderate
CASCADE Conformal Prediction: Uncertainty-Adaptive Prediction Intervals for Two-Stage Clinical Decision Support
Ricardo Diaz-Rincon, Muxuan Liang, Adolfo Ramirez-Zamora +1 more · 2026
Effective medication management in Parkinson's Disease (PD) is challenging due to heterogeneous disease progression, variable patient response, and medication side effects. While AI models can forecast levodopa equivalent daily dose (LEDD) as a measure of medication needs, standard uncertainty quantification often fails to communicate the reliability of these predictions, treating high and low confidence clinical decisions identically. We introduce CASCADE (Calibrated Adaptive Scaling via Conformal And Distributional Estimation), a novel conformal prediction framework that propagates epistemic uncertainty from a screening classifier to adapt downstream predictions. Unlike standard conformal methods that rely on auxiliary residual regression, we leverage epistemic uncertainty from a primary classification task (identifying whether a medication change is needed) to dynamically scale the prediction intervals of a secondary regression task (predicting how much change). By mapping Venn-Abers multi-probabilistic uncertainty directly to non-conformity scores, our framework achieves continuous risk adaptation. We demonstrate that this "cascade effect" produces highly efficient intervals for confident patients (38.9% narrower than standard conformal baselines) while automatically expanding intervals to ensure robust coverage for uncertain cases, bridging the gap between discrete clinical decision-making and continuous dose forecasting in PD.
Study · Preprint · Wiki · Moderate
Controlling False Discovery in Arbitrarily Structured Hypothesis Spaces via Reproducing Kernels
Binyamin Perets, Shie Mannor · 2026
Large-scale hypothesis testing is central to modern science, where controlling the False Discovery Rate (FDR) has become the standard approach to managing false positives across many simultaneous tests. Hypotheses rarely exist in isolation; they often exhibit structure through proximity, connectivity, or hierarchy. This structure represents both a challenge and an opportunity: while classical methods treat these dependencies as obstacles requiring conservative correction, leveraging them can substantially increase discovery power. Here, we reframe structured FDR control as a regularized learning problem. By optimizing within a suitable Reproducing Kernel Hilbert Space (RKHS), we introduce a framework that unifies continuous domains, graphs, and hierarchies under a single algorithm through kernel choice alone. This formulation enables smooth solutions in place of the piecewise-constant fits of prior methods, principled likelihood-based hyperparameter selection rather than heuristic tuning, and inference at unobserved locations which in turn supports sample-efficient experimental design. Building on this estimator, we provide two decision rules which we prove to control the FDR. We validate our method on two sources: spatial locations derived from high-dimensional real-world datasets, and a differential gene expression task utilizing protein-protein interaction graphs.
Study · Preprint · Wiki · Moderate
Random spanning tree Markov random field priors for Bayesian inverse problems in imaging
Jasper Marijn Everink · 2026
Markov random fields are common prior distributions used in Bayesian inverse imaging problems. In particular, difference priors assign probability distributions to differences between neighbouring pixels, such as Gaussian, Laplace, or Cauchy distributions. Depending on the chosen difference distribution, these priors have smoothing or edge-preserving properties. In this work, we propose a hyperprior on the connectivity graph of the pixel grid in the form of a random spanning tree, i.e., a random connected graph with the minimal number of edges, thereby coupling continuous and discrete random variables in the prior. By using random spanning trees, only a sparse random subset of edges is regularized, which helps preserve edges in the image with reduced contrast loss compared to standard difference-based Markov random fields. We discuss how fractal-like interfaces arise in high-resolution prior samples due to the random-tree connectivity. Finally, we propose a Gibbs sampler that alternates between the discrete tree updates and continuous pixel updates to efficiently explore the posterior distribution. We apply the method to various standard test image restoration problems, including denoising, deblurring, and inpainting, to study the impact of the proposed prior in comparison with existing Markov random fields.
Study · Preprint · Wiki · Moderate
Stable Causal Discovery via Directed Acyclic Graph Aggregation
Yunan Wu, Yue Wang, Chunlin Li +1 more · 2026
Directed Acyclic Graphs (DAGs) are central to uncovering causal structure in complex systems, yet learning a single DAG from data is often challenging: model uncertainty, finite samples, and a combinatorially large search space frequently yield unstable estimates. We propose DAGgr, a model averaging framework that aggregates multiple candidate DAGs into a single stable representation. Candidate graphs are weighted by their out-of-sample predictive likelihood across repeated data splits, and a thresholding rule on the resulting edge-importance scores guarantees that the aggregated graph is itself acyclic. We establish a finite-sample risk bound, prove that the procedure preserves acyclicity, and show that edge selection is consistent under mild conditions on the weights. Simulations across random, hub, and chain structures, together with an analysis of the Sachs et al. (2005) protein-signaling network, show that DAGgr matches or exceeds the best individual candidate while consistently outperforming bootstrap-aggregation baselines across structural recovery metrics.
Study · Preprint · Wiki · Moderate
A Unified Framework for Structure-Aware Clustering and Heterogeneous Causal Graph Learning
Honglin Du, Muxuan Liang, Xiang Zhong · 2026 · 1 citation
In complex multivariate systems, interactions among variables are defined by dependency structures, often encoded as directed acyclic graphs (DAGs). However, dependency structures can vary across subjects, and ignoring this structural heterogeneity introduces bias and obscures subpopulation-specific dependencies. To address this, we propose Directed Acyclic Graph-based Dependency Clustering via Alternating Direction Method of Multipliers (DAG-DC-ADMM), a unified framework built upon Structural Equation Modeling (SEM) that jointly learns cluster assignments and cluster-specific dependency structures. We encode acyclicity via a smooth constraint and integrate a groupwise truncated Lasso fusion penalty (gTLP) to cluster subjects based on their structural similarity. This yields a nonconvex optimization problem that incorporates sparsity, acyclicity, and structural consensus constraints. We address the nonconvexity by using the augmented Lagrangian method and solve it with an adapted version of the Alternating Direction Method of Multipliers (ADMM) for difference-of-convex programs. For certain graph structures, such as upper triangular adjacency matrices, our algorithm is guaranteed to converge to a Karush-Kuhn-Tucker (KKT) point. Experiments demonstrate that our method recovers cluster-specific causal dependency structures with a high true positive rate and a low false discovery rate. This capability enables the robust discovery of heterogeneous dependencies across subjects where the subpopulation label is unknown.
Study · Preprint · Moderate
Progression to the mean: A practical Bayesian workflow for the development and deployment of clinical prediction models
Mohsen Sadatsafavi, Richard D. Riley · 2026
Clinical prediction models provide a prediction (e.g., estimated risk) for each individual, typically expressed as a point estimate derived from a deterministic function such as a logistic regression equation. Such 'plug-in' predictions hide inherent uncertainty. In contrast, Bayesian methods offer a coherent mechanism for uncertainty quantification based on an individual-specific posterior distribution of risk. However, Bayesian prediction models are underutilised, due to perceived subjectivity, computational cost, and implementation complexity. To address this, we propose a pragmatic Bayesian pipeline for producing and deploying prediction models. The main components are (i) shrinkage priors leading to posterior distributions of regression coefficients based on a Laplace/normal approximation, which avoids Monte Carlo sampling; and (ii) using an individual's posterior mean for decision-making, justified by an expected utility perspective. For (i), we suggest priors with complementary features (simplicity, user input, automatic shrinkage). For (ii), we suggest exact and approximate methods for computing the posterior mean, including quadrature, MacKay's approximation, and an adaptation of projection-predictive mapping that creates a simple logistic equation approximating the mean. Using examples and simulations, we demonstrate the Bayesian workflow often matches or exceeds predictive performance compared with plug-in predictions, while enabling uncertainty quantification with suitable coverage. In the majority of simulations, using the posterior mean predictions resulted in higher clinical utility, at times substantial, compared with plug-in predictions. In summary, a Bayesian approach to clinical prediction modelling and deployment is both pragmatic and clinically advantageous, so is highly recommended.
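The difference between a plug-in prediction and a posterior mean prediction can be made concrete with a small sketch. It assumes a logistic model whose linear predictor for one individual has an approximately normal posterior N(mu, sigma^2), for example from a Laplace approximation. The function names and the example values are hypothetical; the MacKay formula is the standard probit-based approximation, and the quadrature is ordinary Gauss-Hermite. The projection-predictive step mentioned in the abstract is not sketched.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def expit(x):
    """Logistic function mapping a linear predictor to a risk in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def plug_in_risk(mu):
    """Plug-in prediction: ignores posterior uncertainty in the linear predictor."""
    return expit(mu)


def mackay_mean_risk(mu, sigma):
    """MacKay's approximation to E[expit(eta)] for eta ~ N(mu, sigma^2)."""
    return expit(mu / np.sqrt(1.0 + np.pi * sigma ** 2 / 8.0))


def quadrature_mean_risk(mu, sigma, n_nodes=40):
    """Gauss-Hermite quadrature for E[expit(eta)] with eta ~ N(mu, sigma^2)."""
    nodes, weights = hermgauss(n_nodes)  # rule for the weight function exp(-x^2)
    values = expit(mu + np.sqrt(2.0) * sigma * nodes)
    return np.sum(weights * values) / np.sqrt(np.pi)


# Hypothetical individual: posterior mean 2.0 and sd 1.5 on the logit scale.
mu, sigma = 2.0, 1.5
print("plug-in   :", plug_in_risk(mu))
print("MacKay    :", mackay_mean_risk(mu, sigma))
print("quadrature:", quadrature_mean_risk(mu, sigma))
```

For a predictor above zero on the logit scale, both posterior mean calculations pull the risk toward 0.5 relative to the plug-in value, and the pull grows with sigma.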
Study · Preprint · Wiki · Moderate
CausalGuard: Conformal Inference under Graph Uncertainty
Vikash Singh, Weicong Chen, Debargha Ganguly +12 more · 2026
Estimating treatment effects from observational data requires choosing an adjustment set, but valid adjustment depends on an unknown causal graph. Graph misspecification can cause under-coverage, while graph-agnostic conformal wrappers may regain nominal coverage only through large padding. We introduce CausalGuard, a structure-weighted conformal framework that calibrates after aggregating graph-conditional doubly robust pseudo-outcomes. Candidate DAGs are proposed from an LLM-derived edge prior, pruned by conditional-independence tests, and reweighted by Bayesian Information Criterion. A composite nonconformity score then calibrates the posterior-weighted pseudo-outcome. CausalGuard provides distribution-free finite-sample marginal coverage for this aggregated pseudo-outcome; under causal identification, overlap, conditional-mean nuisance stability, and concentration on target-aligned valid adjustment strategies, its conditional mean converges to the true Conditional Average Treatment Effect. Across five benchmarks, CausalGuard attains mean coverage above the nominal 90% level for the directly evaluable target and reduces width when graph-agnostic conformal baselines require large padding. Stress tests show that CausalGuard suppresses invalid collider adjustment and remains stable under misspecified priors when the retained candidate set is data-supported.
Study · Preprint · Wiki · Moderate
Causal Discovery in Structural VAR Models Under Equal Noise Variance
SeyedSina Seyedi HasanAbadi, Fahimeh Arab, Erfan Nozari +1 more · 2026 · 5 citations
Causal discovery from multivariate time series is challenging when causal effects may occur both across time and within the same sampling interval. This issue is especially important in applications such as neuroscience, where the sampling rate may be coarse relative to the underlying dynamics and contemporaneous effects need not form an acyclic graph. We study causal discovery in linear Gaussian structural VAR models under an equal noise variance assumption, meaning that the structural noise terms have a common variance. Unlike the DAG-based cross-sectional equal noise variance setting, the time-series setting considered here does not generally yield point identification of a unique causal graph. Instead, multiple structural VAR parameterizations can induce the same stationary observed process law. We introduce a notion of observational equivalence tailored to this setting and show that the corresponding equivalence class is characterized by orthogonal transformations of the structural equations together with a global positive scale. This characterization leads to an equivalence-aware model discrepancy, the observational alignment discrepancy, which compares structural models modulo transformations that preserve the observed law. Building on this theory, we propose ENVAR, a sparsity-based procedure that searches over the induced observational equivalence class for a sparse normalized structural representative. We evaluate the proposed methodology on synthetic structural VAR data and on an fMRI dataset.
Study · Preprint · Wiki · Moderate
Laplace Approximations for Mixed-Effects and Gaussian Process Quantile Regression
Andrea Nava, Fabio Sigrist · 2026
Laplace approximations are a standard tool for computationally efficient inference in latent Gaussian models, but they fail for quantile regression with the asymmetric Laplace likelihood because the observed Hessian vanishes almost everywhere. We show that this obstacle can be overcome without smoothing the likelihood: the relevant local curvature is given not by the observed Hessian, but by the Fisher information when the model is correctly specified and by the population curvature of the expected loss under misspecification. On this basis, we develop a Laplace approximation framework for quantile regression with mixed-effects and Gaussian process models. We propose practical curvature estimators, including the triangular kernel curvature (TKC) estimator, that yield approximations for posterior distributions and marginal likelihoods, and we establish their asymptotic validity. Empirically, the proposed methods are scalable and numerically stable, and for latent Gaussian models, they achieve accuracy comparable to or better than MCMC and variational competitors at substantially lower computational costs. More broadly, the framework clarifies how Laplace approximations can be justified for non-smooth generalized posteriors through local quadratic behavior of the expected loss.
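The vanishing-Hessian obstacle can be seen in the simplest case. The following is a sketch for a scalar location parameter $\mu$ with unit scale, assuming $Y$ is continuous with distribution function $F_Y$ and density $f_Y$; the paper's mixed-effects and Gaussian process setting is more general, and its TKC estimator is not reproduced here.

```latex
\begin{align*}
\rho_\tau(u) &= u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr),
  && \text{(check loss, the negative log-kernel of the asymmetric Laplace density)} \\
\frac{\partial}{\partial \mu}\,\rho_\tau(y - \mu) &= \mathbf{1}\{y < \mu\} - \tau,
  && y \neq \mu, \\
\frac{\partial^2}{\partial \mu^2}\,\rho_\tau(y - \mu) &= 0,
  && y \neq \mu, \\
\frac{d}{d\mu}\,\mathbb{E}\bigl[\rho_\tau(Y - \mu)\bigr] &= F_Y(\mu) - \tau, \\
\frac{d^2}{d\mu^2}\,\mathbb{E}\bigl[\rho_\tau(Y - \mu)\bigr] &= f_Y(\mu).
\end{align*}
```

The per-observation second derivative is zero wherever it exists, while the expected loss has curvature $f_Y(\mu)$, which is positive wherever $Y$ has positive density. This is the sense in which curvature can be recovered from the expected loss without smoothing the likelihood.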
Study · Preprint · Moderate
Statistical Limits and Efficient Algorithms for Differentially Private Federated Learning
Arnab Auddy, Xiangni Peng, Subhadeep Paul · 2026
Federated Learning is a leading framework for training ML and AI models collaboratively across numerous user devices or databases. We study the trade-offs among estimation accuracy, privacy constraints, and communication cost for differentially private (DP) federated M-estimation. The two standard methods in the literature are FedAvg, which may suffer from high federation bias, and FedSGD, which can incur high communication cost. To improve accuracy at a reduced communication cost, we propose FedHybrid, which runs FedSGD from an improved initialization given by the FedAvg estimator. We also propose FedNewton, which averages local Newton iterations to reduce the bias of FedAvg, achieving an estimation accuracy comparable to FedSGD with far fewer communication rounds when the number of clients grows sufficiently slowly. We establish finite-sample upper bounds on the mean-squared error (MSE) rates of the DP versions of these estimators as functions of the number of clients, local sample sizes, privacy budget, and number of iterations. We further derive a minimax lower bound on the MSE of any iterative private federated procedure, which provides a benchmark to assess the optimality gap of these methods. We numerically evaluate our methods for training a logistic regression and a neural network on the computer vision datasets MNIST and CIFAR-10.
StudyPreprintWikiModerate
TCARD: Nearly Balanced Two-Level Designs with Treatment Cardinality Constraints with an Application to LLM Prompt Engineering
Kexin Xie, Ryan Lekivetz, Xinwei Deng · 2026
Modern experimental designs often face the so-called treatment cardinality constraint, which is the constraint on the number of included factors in each treatment. Experiments with such constraints are commonly encountered in engineering simulation, AI system tuning, and large-scale system verification. This calls for the development of adequate designs to enable statistical efficiency for modeling and analysis within feasible constraints. In this work, we study two-level designs under this $k$-treatment cardinality constraint (TCARD), where the design matrix $\mathbf{X} \in \{0,1\}^{n \times p}$ has constant row sums equal to $k$. Although TCARDs are closely related to balanced incomplete block designs (BIBDs), exact BIBD structure is unavailable for many practical $(n,p,k)$ combinations. This leads to the notion of nearly balanced TCARDs, which we prove minimize the first two components of the generalized word-length pattern. We also show that good projection behavior in this setting is governed by two count-based regularities: balanced factor replications and uniform pairwise concurrences. Motivated by this characterization, we then propose the Balanced Concurrence Deviation ($Φ_{\mathrm{BCD}}$), a model-free objective that jointly penalizes replication imbalance and concurrence dispersion. We further show that this criterion is closely connected to classical optimality principles, including $(M,S)$-optimality, centered $\mathrm{UE}(s^2)$ criterion, and Bayesian $D$-optimality. To construct designs minimizing $Φ_{\mathrm{BCD}}$, we develop a coordinate-exchange (CE) algorithm with efficient incremental updates, together with a simulation-based procedure for calibrating the criterion weights to the intended downstream task. Numerical experiments confirm that the proposed method compares favorably with existing alternatives across a range of problem sizes and constraint strengths.
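The two count-based regularities named above, factor replications and pairwise concurrences, can be read directly off $\mathbf{X}^\top \mathbf{X}$. The sketch below uses the Fano plane, a classical BIBD with $p = 7$ and $k = 3$, as an example design. The two spread summaries at the end are simple illustrative diagnostics and are not the $Φ_{\mathrm{BCD}}$ criterion, whose exact form is not given in the abstract.

```python
import numpy as np

# Blocks of the Fano plane: 7 runs, 7 factors, 3 factors per run.
blocks = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]
p, k = 7, 3

# Binary design matrix with constant row sums equal to k.
X = np.zeros((len(blocks), p), dtype=int)
for i, block in enumerate(blocks):
    X[i, list(block)] = 1

row_sums = X.sum(axis=1)                           # treatment cardinality per run
gram = X.T @ X
replications = np.diag(gram)                       # how often each factor is included
concurrences = gram[np.triu_indices(p, k=1)]       # how often each pair appears together

# Illustrative imbalance summaries (hypothetical, not the paper's criterion).
replication_spread = int(replications.max() - replications.min())
concurrence_variance = float(concurrences.var())
```

For an exact BIBD both summaries are zero. For $(n, p, k)$ combinations where no BIBD exists, a nearly balanced design would aim to keep them as small as possible.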
StudyPreprintModerate
Inference Functionals and Observation Operators for Distributional Statistical Models
R. Labouriau · 2026
This paper generalises inference functions (Godambe, 1960) to distributional statistical models, in which each probability measure is represented by a distribution--kernel pair $(T_θ, \varphi) \in \mathcal S'(\mathbb R) \times \mathcal S(\mathbb R)$. The generalisation is strategically motivated: the key properties of maximum likelihood estimation -- consistency and asymptotic normality -- derive not from maximising the likelihood but from the MLE being the root of a regular inference function. Extending inference functions to the distributional setting provides an optimality theory for models lacking classical densities or finite moments. The extension requires enlarging the notion of observation. We introduce observation operators $\mathcal O : \mathcal S'(\mathbb R) \to \mathcal Y$ mapping distributional models to an observation space, and define inference functionals as estimating equations composed with these operators. The framework encompasses classical point observations, interval-censored data, convolutional measurements, and transform-based statistics. We establish asymptotic theory (consistency, asymptotic normality, Godambe optimality) under mild conditions and derive a hierarchy of information bounds -- classical Fisher information dominates the information available through the observation operator, which in turn dominates the information captured by any inference functional -- via the Hájek--Le Cam convolution theorem. The two gaps quantify distinct sources of information loss: the observation mechanism and the choice of inference functional. Examples include sinusoidal inference functions for heavy-tailed distributions, interval-censored location inference, elliptically contoured models, and nuisance parameters via the Bhapkar--Godambe projection.
StudyPreprintWikiModerate
Double/Debiased Machine Learning for Continuous Treatment Effects in Panel Data with Endogeneity
Peikai Wu, Kuan Sun, Zhiguo Xiao · 2026
We propose a double/debiased machine learning framework to estimate average derivative effects in nonparametric panel models with two-way fixed effects. It extends instrumental variable methods to panel settings, handles continuous treatments and various forms of endogeneity, and introduces a cross-fitting scheme to restore independence after eliminating time fixed effects. A penalized GMM debiasing term enables automatic debiased machine learning with endogeneity. Our estimators for contemporaneous, dynamic, and aggregated effects are consistent and asymptotically normal with a valid variance estimator. Simulations show reduced regularization bias and accurate confidence intervals. An application to ECLS-K data reveals rich dynamics in the effect of family SES on childhood BMI.
StudyPreprintWikiModerate
Goal-Oriented Lower-Tail Calibration of Gaussian Processes for Bayesian Optimization
Aurélien Pion, Emmanuel Vazquez · 2026
Bayesian optimization (BO) selects evaluation points for expensive black-box objectives using Gaussian process (GP) predictive distributions. Kernel choice and hyperparameter selection can lead to miscalibrated predictive distributions and an inappropriate exploration-exploitation trade-off. For minimization, sampling criteria such as expected improvement (EI) depend on the predictive distribution below the current best value, so lower-tail miscalibration directly affects the sampling decision. This article studies goal-oriented calibration of GP predictive distributions below a low threshold $t$ in the noiseless setting, for standard GP models with hyperparameters selected by maximum likelihood. A framework for predictive reliability below $t$ is introduced, based on two notions of spatial calibration: occurrence calibration over the design space and thresholded $μ$-calibration on sublevel sets of the form $\{x\in\mathbb{X}, f(x)\le t\}$. Building on this framework, we propose tcGP, a post-hoc method that calibrates GP predictive distributions below $t$, and we show that the resulting EI-based global optimization algorithm remains dense in the design space. Experiments on standard benchmarks show improved lower-tail calibration and BO performance relative to standard GP models and globally calibrated GP models.
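To see why EI depends only on the predictive mass below the current best value, it helps to write out the standard closed form for a Gaussian predictive distribution in the minimization setting. The sketch below implements that textbook formula. It does not implement tcGP, and the function and argument names are hypothetical.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, f_best):
    """E[max(f_best - F, 0)] for F ~ N(mu, sigma^2), minimization convention."""
    if sigma <= 0.0:
        # Degenerate (noiseless, already-evaluated) case: no uncertainty left.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    return (f_best - mu) * normal_cdf(z) + sigma * normal_pdf(z)
```

Because the integrand is zero above `f_best`, any miscalibration of the predictive distribution in that lower tail changes the value of `expected_improvement` directly, which is the motivation for calibrating below a threshold rather than globally.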
StudyPreprintModerate
A tool to determine the degrees of freedom in tree-structured varying coefficient models
Nikolai Spuck, Moritz Berger · 2026
The tree-structured varying coefficient (TSVC) model is a flexible approach for generalized regression, where the linear effects of the covariates are allowed to vary with the values of effect modifiers. Relevant effect modifiers and interactions are identified using recursive partitioning. In TSVC models, analogously to other semi- and nonparametric regression approaches, one needs to account for the cost of data-driven model building when deriving the model degrees of freedom (DoF). To address this issue, we develop an easy-to-apply formula to approximate the DoF of a TSVC model. This formula is employed for model selection based on the Bayesian information criterion (BIC) and compared to the naive solution, setting the DoF to the number of free model parameters, in a simulation study. To illustrate the proposed DoF method, TSVC models using BIC-based selection were fitted to data from the Survey of Health, Ageing, and Retirement in Europe. Results indicated that calculation of the DoF using the proposed formula resulted in more accurate selection results with improved predictive ability.
StudyPreprintWikiModerate
Sample Size Determination Under Selection Bias: Robust Tolerance Limits for Prevalent Cohort Data
James H. McVittie, Martin Lysy, Masoud Asgharian · 2026
Tolerance limits have received considerable attention in the statistical literature, with applications reaching far beyond their initial role in quality control. The well-known formula of Scheffé and Tukey (1944) establishes a simple, distribution-free relation between sample size and population coverage by two given order statistics and a given confidence level. A key requirement in applying this formula is the availability of an unbiased, representative sample from the population of interest. However, as often happens in biological and medical applications, various logistical constraints may preclude the possibility of obtaining an unbiased sample. We derive extensions of this formula which accommodate a large class of biased sampling schemes including weight bias and censoring. The modified formulae are validated through a simulation study and compared to their unmodified counterpart. We illustrate the use of the modified formulae using the partially observed failure times for individuals with dementia using data collected from the Canadian Study of Health and Aging.
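For the classical unbiased case, the relation between sample size, coverage, and confidence can be made concrete. The sketch below assumes an i.i.d. sample from a continuous distribution and uses the sample minimum and maximum as the two order statistics, in which case the coverage follows a Beta$(n-1, 2)$ distribution. It covers only this textbook special case, not the biased-sampling extensions described in the abstract, and the function names are hypothetical.

```python
def coverage_confidence(n, beta):
    """P(the interval [X_(1), X_(n)] covers at least a fraction beta).

    Uses the closed-form Beta(n - 1, 2) distribution function, valid for an
    i.i.d. sample from any continuous distribution.
    """
    return 1.0 - n * beta ** (n - 1) + (n - 1) * beta ** n

def min_sample_size(beta, gamma):
    """Smallest n whose min-max interval has coverage beta with confidence gamma."""
    n = 2
    while coverage_confidence(n, beta) < gamma:
        n += 1
    return n
```

A call such as `min_sample_size(0.90, 0.95)` returns the smallest sample size for which the range of the data covers 90% of the population with 95% confidence. Under biased sampling this calculation no longer applies as written, which is the gap the modified formulae address.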
StudyPreprintWikiModerate
Block-Independent Likelihood Ratio Testing for High-Dimensional Mean Vectors with Applications to Matrix-Variate Data
Minsub Shin, Kwangok Seo, Sang Han Lee +1 more · 2026
Testing the equality of two high-dimensional mean vectors is a fundamental problem in multivariate analysis. While the classical Hotelling's $T^2$ test is optimal in low-dimensional settings, it fails when the dimension $p$ is comparable to or exceeds the sample size $n$. Several extensions, including the Diagonal Likelihood Ratio Test (DLRT), have been proposed under the working independence assumption among variables. However, such an assumption can lead to a substantial loss of power when correlations are present. In this paper, we propose a new test, the Block Independent Likelihood Ratio Test (BILT), which generalizes DLRT by relaxing the working independence assumption to a block independence assumption. We establish the asymptotic normality of the null distribution of the BILT statistic for 'increasing $p$ with small $n$' under mild regularity conditions. We further analyze the asymptotic power of BILT under local alternatives. Extensive simulation studies show that BILT maintains Type I error control and achieves substantially higher power than DLRT across a wide range of covariance structures. An application to the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset further demonstrates the application of BILT to testing mean differences between two matrix-variate populations.
StudyPreprintWikiModerate
From Volterra Series to Kunchenko Stochastic Polynomials: Half a Century of Non-Gaussian Estimation Methodology
Serhii Zabolotnii · 2026
This paper reconstructs the half-century evolution of the scientific school founded by Yuriy P. Kunchenko (1939--2006) as the development of a semiparametric methodology for non-Gaussian estimation. Starting with Kunchenko's 1972/1973 dissertation applying Volterra series to estimate parameters of random processes, the trajectory is followed through 2006--2026. Kunchenko stochastic polynomials are presented as a coherent family of moment-cumulant procedures: the polynomial maximization method (PMM) for parameter estimation, polynomial criteria for hypothesis testing, and decomposition in spaces with a generating element. The paper details the school's structure: a verified genealogy of 15 defended dissertations, collaborations in Poland, Slovakia, and Germany, and the R package EstemPMM. A recent 2026 paper on Volterra-based signal processing is analyzed, showing how Kunchenko's nonlinear formulation reappears in applied radio engineering. We build a formal bridge between finite Volterra models and generalized Kunchenko polynomials, while separating the MMSE/L2 criterion from PMM: the former is a covariance projection for kernel adaptation, whereas PMM is a parameter-dependent moment procedure. PMM efficiency claims are stated conditionally: gains require that moments exist, the centered correlant matrix is nondegenerate, and the variance reduction coefficient is below one. The concluding research program operationalizes the historical reconstruction into testable statistical and signal-processing tasks.
StudyPreprintWikiModerate
Chained Markov melding using divide and conquer sequential Monte Carlo
Yixuan Liu, Robert J. B. Goudie · 2026
Specifying a full Bayesian model that integrates multiple data sources can be challenging. One natural approach is to specify each individual model separately and join them afterwards. This is the approach adopted in Markov melding. However, when adjacent submodels share common quantities, as in chained Markov melding, posterior inference can be challenging for existing MCMC-based approaches. In this paper, we propose a new multi-stage sampler for chained Markov models involving an arbitrary number of submodels. The proposed sampler adopts a divide-and-conquer sequential Monte Carlo approach for the tree-structured model that fits naturally with the structure of chained Markov melding. The resulting multi-stage sampler provides a flexible alternative for sampling from complex joint models, as its separate sampling scheme for different submodels avoids the need for directly sampling from the full model. We demonstrate applications of the sampler through two examples. The first is a toy example involving 11 submodels of various types. The second example considers an ecologically integrated population model that combines multiple datasets to estimate immigration and reproduction rates.
StudyPreprintModerate
A robust nonparametric test for spatial isotropy in lattice data
Jana Gierse, Roland Fried · 2026
This paper proposes a robust test for assessing isotropy based on the variogram of spatial data on a two-dimensional regular grid. The test is based on the non-robust subsampling test for isotropy of Guan et al. (2004), which uses the idea of comparing variogram estimates in different directions at the same distance. The robust test employs robust variogram estimators which are based on estimators of univariate or multivariate scatter and perform well in the presence of isolated or block outliers. Additionally, a different resampling method, called block permutation, is proposed. Compared with the subsampling test, the block permutation test maintains the significance level even for strong dependencies in the data and is robust to outliers. The methods are illustrated by an application to Landsat 8 satellite data, where outlier blocks may occur due to, for example, clouds.
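The basic comparison, variogram estimates in different directions at the same lag, can be sketched on a regular grid. The example below computes the classical (Matheron) estimator along each grid axis and, purely as one illustrative robust choice, a variant that swaps in a MAD-based scale of the lagged differences. That swap is an assumption of this sketch and is not claimed to be the estimator used in the paper. No test statistic or resampling step is included.

```python
import numpy as np

def lag_differences(field, lag, axis):
    """Differences between grid cells `lag` steps apart along one axis."""
    if axis == 0:
        return (field[lag:, :] - field[:-lag, :]).ravel()
    return (field[:, lag:] - field[:, :-lag]).ravel()

def classical_variogram(field, lag, axis):
    """Matheron estimator: half the mean squared lagged difference."""
    d = lag_differences(field, lag, axis)
    return 0.5 * np.mean(d ** 2)

def mad_variogram(field, lag, axis):
    """Illustrative robust variant: half the squared MAD scale of differences."""
    d = lag_differences(field, lag, axis)
    mad = np.median(np.abs(d - np.median(d)))
    return 0.5 * (1.4826 * mad) ** 2  # 1.4826 makes MAD consistent for Gaussian data

rng = np.random.default_rng(0)
noise = rng.normal(size=(200, 201))
isotropic = noise[:, :200]                           # independent noise, no direction
anisotropic = 0.5 * (noise[:, :-1] + noise[:, 1:])   # smoothed along axis 1 only

contaminated = isotropic.copy()
contaminated[:10, :10] += rng.normal(scale=30.0, size=(10, 10))  # one outlier block
```

On the isotropic field the two directional estimates agree closely, while on the smoothed field they differ. On the contaminated field the classical estimate is inflated by the single outlier block, whereas the MAD-based one stays near the uncontaminated value.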
StudyPreprintWikiModerate
How does limma-trend work? An empirical partially Bayes perspective
Sagnik Nandy, Wanyi Ling, Nikolaos Ignatiadis · 2026
In high-throughput biology, it is common to fit thousands of linear regressions -- one per gene, protein, or other unit -- with very few samples per unit. Limma-trend, one of the most widely used methods in this setting, improves power by shrinking variance estimates parametrically toward a fitted curve (the trend) relating variance to a unit-level summary (e.g., average intensity, peptide count), before computing p-values and applying the Benjamini-Hochberg procedure to control the false discovery rate (FDR). We study limma-trend through the lens of empirical partially Bayes inference, a paradigm in which a prior is posited and estimated for the nuisance parameters while parameters of interest remain fixed. From this perspective, limma-trend computes approximate partially Bayes p-values that condition on the residual sample variance and the unit-level summary. The same framework explains why MAnorm2, a popular variant for ChIP-seq, can sometimes fail to control FDR. We then derive a nonparametric generalization of limma-trend that estimates the residual variance prior using nonparametric maximum likelihood. Under dense signals, this procedure asymptotically controls the FDR -- even when the trend is misspecified or inconsistently estimated. To allow the full shape of the conditional variance distribution to depend on the unit-level summary, we develop a second procedure that learns it directly.
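Two standard ingredients mentioned above can be sketched briefly: variance shrinkage toward a prior value and the Benjamini-Hochberg step-up procedure. The shrinkage function uses the usual degrees-of-freedom weighted average, with the prior variance `s0_sq` (for limma-trend, the value of the fitted trend for that unit) and prior degrees of freedom `d0` treated as given inputs. Estimating them, and the nonparametric generalization in the abstract, are outside this sketch. All names are hypothetical.

```python
import numpy as np

def shrink_variance(s_sq, d, s0_sq, d0):
    """Degrees-of-freedom weighted average of sample and prior variances."""
    return (d0 * s0_sq + d * s_sq) / (d0 + d)

def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean mask of hypotheses rejected by the BH step-up procedure."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up rule: reject everything up to the largest index that passes.
        last = np.nonzero(below)[0].max()
        rejected[order[: last + 1]] = True
    return rejected
```

Note the step-up behaviour: a p-value that misses its own threshold is still rejected if a larger p-value further down the sorted list passes its threshold.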
StudyPreprintModerate
Online Conformal Prediction for Non-Exchangeable Panel Data
Daohong Tu, Kay Giesecke · 2026
Panel data, in which multiple units are repeatedly observed over time, arise throughout science and engineering. Quantifying predictive uncertainty in such settings is challenging because conformal prediction, while distribution-free and model-agnostic, classically relies on exchangeability assumptions that fail under temporal dependence and unit heterogeneity. We propose a simple online conformal framework for non-exchangeable panel data. The method exploits a key feature of online panel prediction: when a forecast is required for one unit, contemporaneous outcomes from related units may already be observed and can serve as a calibration panel. At each round, prediction sets are formed using currently observed calibration units together with two adaptive quantities: history-based similarity weights that emphasize calibration units resembling the target, and an adaptive miscoverage level that is updated whenever target feedback is revealed. This two-state design yields a stepwise coverage bound and a long-run coverage guarantee. Empirically, across synthetic and real panel data sets, the method improves coverage on the worst-covered target units through adaptive interval-width allocation rather than uniform inflation. The two states are complementary: similarity weights protect coverage when target feedback is sparse, while the adaptive level further improves coverage as feedback accumulates.
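The two-state design can be illustrated with a small simulation. The sketch below makes strong simplifying assumptions: the similarity weights are set to uniform placeholders (the abstract does not specify how they are built), the miscoverage level follows a generic adaptive update of the form $α_{t+1} = α_t + η(α - \mathrm{err}_t)$ with clipping, and feedback arrives every round. It is not the authors' algorithm, and every name is hypothetical.

```python
import numpy as np

def weighted_quantile(scores, weights, level):
    """Smallest score whose cumulative normalized weight reaches `level`."""
    order = np.argsort(scores)
    s = np.asarray(scores, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    cum = np.cumsum(w) / w.sum()
    idx = np.searchsorted(cum, level, side="left")
    return s[min(idx, len(s) - 1)]

def update_alpha(alpha_t, target_alpha, covered, step=0.01):
    """Lower the working level after a miss, raise it slightly after a hit."""
    err = 0.0 if covered else 1.0
    new_alpha = alpha_t + step * (target_alpha - err)
    return float(np.clip(new_alpha, 0.001, 0.999))  # clipping is a choice of this sketch

rng = np.random.default_rng(1)
target_alpha, alpha_t = 0.1, 0.1
hits = []
for t in range(2000):
    # Absolute residuals of calibration units observed in the current round.
    calib_scores = np.abs(rng.normal(size=50))
    weights = np.ones(50)  # placeholder for history-based similarity weights
    radius = weighted_quantile(calib_scores, weights, 1.0 - alpha_t)
    # The target unit is noisier than the calibration units (heterogeneity).
    covered = bool(abs(rng.normal(scale=1.2)) <= radius)
    hits.append(covered)
    alpha_t = update_alpha(alpha_t, target_alpha, covered)

coverage = float(np.mean(hits))
```

In this toy run the calibration panel alone would under-cover the noisier target unit, and the adaptive level compensates by drifting below the nominal 0.1 so that long-run coverage stays near 90%.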
StudyPreprintWikiModerate
Substantive-Model-Compatible Multiple Imputation for Cox Regression with a Diverging Number of Covariates
Zhilin Zhang, Yi Li · 2026
Modern biomedical survival studies with high-dimensional genomic and clinical predictors are challenged by missing covariates. Existing methods conduct inference through penalization and debiasing when the number of covariates diverges with sample size, but they are typically developed with fully observed covariates. Conversely, substantive-model-compatible multiple imputation methods, particularly substantive-model-compatible fully conditional specification (SMC-FCS), provide principled handling of missing covariates while preserving compatibility with the Cox model, yet current methodology and theory remain largely restricted to fixed-dimensional settings. To address these limitations, we propose a semiparametric multiple imputation framework for inference in Cox regression with missing covariates of a diverging dimension. Missing covariates are imputed through a high-dimensional SMC-FCS procedure driven by Cox-model likelihood contributions, with rejection sampling used to enforce substantive-model compatibility and ridge-regularized posterior draws used to stabilize the imputation models. The algorithm stabilizes the Cox estimator through an imputation-regularized optimization iteration and then generates multiply imputed datasets from a stabilized chain. Inference for low-dimensional linear functionals or contrasts, $c^\top β$, is obtained by combining debiased estimators and within-imputation variance estimates through Rubin's rules. We establish consistency and asymptotic normality of the resulting pooled estimator under a diverging-dimensional regime. Simulation studies demonstrate favorable finite-sample performance, and an application to the Boston Lung Cancer Survival Cohort illustrates the practical utility of the proposed method for high-dimensional survival studies with incomplete covariates.
StudyPreprintWikiModerate
Stable direct estimation for GPLSIAMs using P-splines with dynamically updated boundaries
Danilo V. Silva, Gilberto A. Paula · 2026
Generalized partially linear single-index additive models (GPLSIAMs) have been increasingly applied across diverse areas due to their versatility in integrating functional flexibility with parametric dimension reduction while maintaining interpretability. However, the estimation presents severe computational challenges. This paper introduces a novel stable method that uses the model matrix for each single-index effect, defined by its single-index coefficients, and the penalized complete Fisher information matrix to dynamically update the boundaries of the single-index covariates within a unified iterative framework. The derived model matrices enable the fast computation of the estimated effective degrees of freedom and pointwise confidence bands for the single-index effects. The smoothing parameter updates are integrated into the iterative process via the generalized Fellner-Schall method, which recycles the derived matrix decompositions, thereby providing an efficient approximation to the global penalized optimization problem. Simulation studies with moderate sample sizes under non-Gaussian distributions confirm the empirical consistency of the estimation across multiple scenarios. Notably, the proposed approach remains stable where state-of-the-art competitive methods fail to recover true single-index coefficients and nonlinear functions, and is 80.13 times faster than the usual two-step method in the most computationally intensive scenario. The modeling advantage is illustrated through an application to Capital Bike Sharing data, where we deal with a single-index interaction effect for each year, with distinct single-index coefficients, a complex structure that makes competitive methods inapplicable. The proposed method is implemented in R, with functions available for reproducibility and transparency in the comparisons.
StudyPreprintWikiModerate
Evaluation of the number of clusters in a data set using $p$-values from Multiple Tests of Hypotheses
Soumita Modak · 2026 · 4 citations
This paper proposes a novel, nonparametric, interpoint distance-based measure to investigate whether any groups exist in a given data set and, if so, how many groups there are in total. It is a cluster accuracy index useful for data sets of arbitrary dimension, in association with any clustering algorithm that requires the number of groups to be specified a priori. We perform univariate, nonparametric, multiple statistical tests of hypotheses, where as many dependent tests as the sample size are carried out using the interpoint distances. Their $p$-values are combined to reach a decision, which is taken in a step-wise process for a possible number of clusters. It reduces unnecessary computations compared with other accuracy measures from the literature. A data study establishes the proposed index's efficiency and superiority.
StudyPreprintWikiModerate
The Bayesian Gaussian Process Latent Variable Model for Spatio-Temporal Stream Networks
Marno Basson, Tobias M. Louw, Theresa R. Smith · 2026
A variational inference-based framework for training a multi-output Gaussian process latent variable model, specifically tailored to the tails-up spatio-temporal stream network, is developed. Training, given a censored observational data set subject to missing values, proceeds by maximising a secondary variational lower bound on the model log marginal likelihood using gradient-based optimisation. Consequently, the theoretical development of a new family of tails-up spatio-temporal stream network models is introduced; these models rely on the sparse Gaussian process inducing variable framework, the Bayesian Gaussian process latent variable model, and local variational methods. They use stream distance instead of Euclidean distance and capture spatial and temporal dependencies using auto/cross-correlation and process convolution, respectively, which allows for the development of valid separable spatio-temporal stream network-based covariance functions. Results from the simulation-based case studies indicate that the proposed framework performs well when considering benchmark comparisons and several performance metrics.
StudyPreprintModerate
Self-Supervised Conformal Prediction with Equivariant Bootstrapping for Image Uncertainty Quantification
Henry J. Aldridge, Tobías I. Liaudat, Marcelo Pereyra +1 more · 2026
Inverse problems are ubiquitous in modern scientific studies and involve recovering an underlying signal from noisy observations often transformed by a measurement operator. These problems are frequently ill-posed, particularly in imaging, leading to multiple plausible solutions and considerable uncertainty in reconstructed images. In fields like the physical and biological sciences, accurate uncertainty quantification (UQ) is critical for trustworthy scientific analyses and confident diagnoses. Current UQ methods for imaging often fall short; they can be inaccurate, or require unavailable or difficult-to-acquire ground truth data for calibration, which can introduce hidden biases due to distribution shifts between calibration and observed data. We introduce a UQ approach that leverages equivariant bootstrapping to generate heuristic coverages by exploiting data symmetries. We then refine these coverages through a conformal prediction calibration step, while crucially employing a self-supervised approach to avoid the need for ground truth calibration data. We demonstrate this method with weak lensing mass-mapping, where we aim to reconstruct the convergence field from shear measurements of distant galaxies weakly-lensed by gravitational fields. Mass-mapping in particular benefits from the self-supervised approach, as simulating calibration data is expensive and relies on specific cosmological models that could introduce biases in downstream cosmological inference tasks.
StudyPreprintModerate
Changepoint Detection in Categorical Time Series with Application to Daily Total Cloud Cover in Canada
Mo Li, QiQi Lu, XiaoLan Wang · 2026
Changepoints are essential for homogenizing categorical time series and analyzing their trends and variations. The original total cloud cover in Canada was recorded hourly in tenths (or eighths), exhibiting inherent seasonality and serial correlation. Lu and Wang (2012) introduced an extended cumulative logit model to detect shifts in the annual frequencies of cloud cover conditions. While annual aggregation mitigates seasonality and serial correlation, it shortens the time series and may lead to overdispersion. This article introduces a marginalized transition model to detect a single changepoint in periodic and serially correlated categorical time series. The model captures serial dependence using a first-order Markov chain and enables category-specific changepoint specification. To enhance computational efficiency, we develop a new parameter estimation procedure for obtaining maximum likelihood estimates. A maximally selected likelihood ratio test statistic is then proposed to test for sudden changes in categorical time series, and the method is illustrated using daily total cloud cover observations recorded at 9 a.m. and 3 p.m. at Fort St. John Airport, British Columbia, Canada.
Meta-analysisPreprintModerate
Modelling pairs of Poissons and binomials with negative correlation
Nils Lid Hjort · 2026
Suppose $f_1(x)$ and $f_2(y)$ are given marginals for pairs $(x,y)$. I consider the construction $f_1(x)f_2(y)\{ 1+αh_1(x)h_2(y) \}$, where $h_1$ and $h_2$ are seen as bounded adjustment functions, normalised to have means zero under $f_1$ and $f_2$. This defines a bivariate distribution for $(X,Y)$ with the specified marginal densities $f_1$ and $f_2$, with an interval of permissible values of $α$, both positive and negative; in particular, independence corresponds to an inner point in the adjustment parameter region. Applications to bivariate Poisson distributions, allowing both positive and negative correlation, are discussed. As an illustration I provide a more accurate and extended analysis of a Poisson pairs dataset, pertaining to competing seeds and plants, for $n=958$ plots of soil, earlier analysed in the well-cited paper Lakshminarayana, Pandit, Rao, and Srinivasa (1999). The general apparatus is also shown to work for negatively correlated binomials. Those methods are illustrated in a meta-analysis framework for two-by-two tables across different studies, pertaining to the Audit-C screening questionnaire for alcohol use disorders, where again negative correlation is demonstrated, between $X$, the number of correct 'yes', and $Y$, the number of correct 'no'.
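The construction can be checked numerically for a pair of Poissons. The sketch below assumes one particular bounded adjustment, $h(x) = e^{-x} - \mathrm{E}[e^{-X}]$, which has mean zero under a Poisson marginal because $\mathrm{E}[e^{-X}] = \exp\{λ(e^{-1} - 1)\}$. This choice of $h$, the rates, the value of $α$, and the truncation point are all illustrative assumptions rather than choices taken from the paper.

```python
import math
import numpy as np

def poisson_pmf(lam, size):
    """Poisson(lam) probabilities on 0, ..., size - 1 (upper tail truncated)."""
    k = np.arange(size)
    log_fact = np.array([math.lgamma(i + 1) for i in k])
    return np.exp(k * math.log(lam) - lam - log_fact)

def adjustment(lam, size):
    """h(x) = exp(-x) - E[exp(-X)]: bounded, mean zero under Poisson(lam)."""
    k = np.arange(size)
    return np.exp(-k) - math.exp(lam * (math.exp(-1.0) - 1.0))

lam1, lam2, alpha, size = 2.0, 3.0, -1.0, 60
f1, f2 = poisson_pmf(lam1, size), poisson_pmf(lam2, size)
h1, h2 = adjustment(lam1, size), adjustment(lam2, size)

# Joint pmf f1(x) f2(y) {1 + alpha h1(x) h2(y)} on the truncated grid.
joint = np.outer(f1, f2) * (1.0 + alpha * np.outer(h1, h2))

x = np.arange(size)
covariance = x @ joint @ x - (x @ f1) * (x @ f2)
# Closed form: Cov(X, Y) = alpha * E[X h1(X)] * E[Y h2(Y)].
covariance_closed_form = alpha * (x * h1 @ f1) * (x * h2 @ f2)
```

With this $h$, both $\mathrm{E}[X h_1(X)]$ and $\mathrm{E}[Y h_2(Y)]$ are negative, so a negative $α$ produces a negative covariance while the Poisson marginals are left unchanged.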
StudyPreprintWikiModerate
Clustering Craters on the Moon with Dysfunctional Families
Nathan Weed, Emily Castleton, Dave Osthus +2 more · 2026
Summaries of craters on terrestrial bodies, such as the number and size distribution, are essential for understanding the history of the Solar System. Identifying craters, however, has not been automated and thus relies on expert crater-counters marking static images. Robbins et al. (2014) (hereafter R14) showed that, contrary to previously held assumptions, there exists large variability across expert crater-counters' identified crater lists. How best to combine identified crater lists across multiple experts for the purposes of learning about the Solar System is an open and consequential question. R14 combined identified crater lists via clustering through a modification of the popular DBSCAN clustering method. Their approach did not, however, make use of all the constraining information available nor did it provide an estimate of clustering uncertainty. To address the shortcomings of the DBSCAN method, we present a novel clustering approach that can combine multiple lists of identified objects of interest from the same image. The key innovation is incorporating a dysfunctional family constraint into the Bayesian nonparametric clustering approach, the Chinese restaurant process (CRP), which naturally takes into account information about the crater identifier. The dysfunctional family Chinese restaurant process (DFCRP) provides an estimate of clustering uncertainty. In this work, we provide guidance on hyperparameter specification, present a Gibbs sampler, and perform a simulation study to compare the performance of the DFCRP to the CRP. Finally, we apply the DFCRP to the crater identification problem of R14, comparing results, and also demonstrate the types of analyses that can be performed with posterior draws of cluster assignments.
StudyPreprintWikiModerate
Application of Propensity Score Models and Causal Estimators in Observational Studies under Model Misspecification
Apu Chandra Das, Sakib Salam, Md Robiul Islam Talukder +3 more · 2026
Propensity score (PS) methods are widely used in observational studies to reduce confounding and estimate causal treatment effects. However, the validity of PS-based causal estimators depends heavily on correct model specification, and model misspecification may lead to substantial bias and instability. In this study, we systematically evaluate the performance of commonly used causal estimators, including response surface modeling (RSM), inverse probability weighting (IPW), and augmented inverse probability weighting (AIPW), under varying levels of PS and outcome model misspecification. We compare classical logistic regression with several machine learning approaches for PS estimation, including random forests (RF), support vector machines (SVM), and linear discriminant analysis (LDA). Extensive simulation studies were conducted under multiple scenarios defined by combinations of correctly specified and misspecified PS and outcome models, varying sample sizes, and different covariate correlation structures. Estimator performance was assessed using bias, absolute bias, root mean squared error, empirical standard error, and confidence interval width. Results demonstrate that AIPW consistently provides robust and stable estimates across most scenarios due to its doubly robust property, whereas IPW is highly sensitive to PS misspecification and unstable PS estimates produced by flexible machine learning methods. RSM performs well only when the outcome model is correctly specified. Real-world applications using the ACTG175 clinical trial and the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset further illustrate the practical implications of estimator choice and PS modeling strategy. Overall, our findings highlight the importance of integrating flexible machine learning approaches within doubly robust frameworks to improve causal effect estimation in observational studies.
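The doubly robust property can be seen in a small simulation using the standard IPW and AIPW formulas for the average treatment effect. The data-generating process below (one confounder, logistic propensity, constant effect of 2) and the deliberately misspecified nuisance inputs are illustrative assumptions, not the simulation design of the study. Nuisance functions are plugged in as known quantities rather than estimated.

```python
import numpy as np

def ipw_ate(y, a, ps):
    """Inverse probability weighting estimate of the average treatment effect."""
    return np.mean(a * y / ps - (1 - a) * y / (1 - ps))

def aipw_ate(y, a, ps, mu1, mu0):
    """Augmented IPW: outcome-model contrast plus weighted residual corrections."""
    return np.mean(
        mu1 - mu0 + a * (y - mu1) / ps - (1 - a) * (y - mu0) / (1 - ps)
    )

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)                      # single confounder
ps_true = 1.0 / (1.0 + np.exp(-x))          # true propensity score
a = rng.binomial(1, ps_true)
y = 2.0 * a + x + rng.normal(size=n)        # true ATE is 2

ps_wrong = np.full(n, 0.5)                  # ignores the confounder
mu_wrong = np.zeros(n)                      # ignores both treatment and confounder

# Correct PS, wrong outcome model.
ate_aipw_wrong_outcome = aipw_ate(y, a, ps_true, mu_wrong, mu_wrong)
# Wrong PS, correct outcome model.
ate_aipw_wrong_ps = aipw_ate(y, a, ps_wrong, 2.0 + x, x)
# Wrong PS with no outcome model to fall back on.
ate_ipw_wrong_ps = ipw_ate(y, a, ps_wrong)
```

AIPW stays close to the true effect when either nuisance model is correct, while IPW with the misspecified propensity score is visibly biased, which mirrors the qualitative finding reported in the abstract.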
Study · Preprint · Moderate
Uncertainty-Aware Ideal Point Estimation via Variational EM
Kwangok Seo, Youngjo Lee, Jong Hee Park +2 more · 2026
Roll-call data analysis aims to estimate legislators' ideal points and quantify the associated uncertainty. Existing approaches either rely on Bayesian methods implemented via Markov chain Monte Carlo sampling or focus primarily on point estimation, with uncertainty typically assessed through resampling procedures such as the bootstrap. Consequently, the computational burden of these approaches can become substantial when applied to large roll-call datasets. To address this challenge, we propose a computationally efficient likelihood method for estimating ideal points and their standard errors. Leveraging the Pólya-Gamma identity, we develop a variational expectation-maximization algorithm for estimating ideal points and introduce a variational Louis' method to approximate the observed Fisher information for standard error estimation. Numerical studies and applications to U.S. congressional roll-call data demonstrate that the proposed method produces accurate ideal point estimates and reliable standard errors while being substantially more computationally efficient than existing approaches.
Study · Preprint · Wiki · Moderate
A new class of functional conditional autoregressive models
Sooran Kim · 2026 · 1 citation
We introduce a new class of conditional autoregressive models for spatially dependent functional data, formulated through conditional means given neighboring functional observations and characterized by a covariance operator and a spatial dependence parameter. Our estimation strategy consists of three components: (i) estimating the covariance operator using conditionally centered data, (ii) estimating the spatial dependence parameter by maximizing the likelihood of projected observations, and (iii) applying a novel profile-based approach to obtain the final estimators. Under an expanding lattice framework, we establish two key theoretical results. First, we establish the consistency of the proposed covariance estimator, which is not attainable using naive methods based on marginally centered data. Second, we prove that the spatial dependence parameter estimator is superconsistent and asymptotically normal, where the latter property enables statistical inference for spatial dependence in functional data, a contribution that is novel in the existing literature. Numerical studies support the theoretical results and demonstrate the computational efficiency of our method. Finally, we illustrate its practical utility by analyzing weekly PM$_{2.5}$ concentration trajectories in 2019 across counties in the Midwestern United States.
Study · Preprint · Wiki · Moderate
Component over Composite: Mitigating Type I Error Inflation when Imputing "Days Alive and at Home"
Mia S. Tackney, Sarah Dawson, Letao Yuan +2 more · 2026
Background: Days Alive and at Home (DAH) over a pre-defined follow-up period is a novel post-intervention composite outcome that combines data from at least three components: (i) initial length of hospital stay, (ii) length of total readmissions or other post-discharge care and (iii) mortality. Missing values bring unique challenges to the analysis of trials with the DAH outcome, as the three components may have different rates of missingness caused by distinct missing data mechanisms. Current approaches define DAH as missing if any of the components are missing, and proceed with complete cases or Multiple Imputation (MI) of the composite. Methods: Through a simulation study motivated by the NOTACS trial, we compare several methods of handling missing data, including complete case analysis, MI of the composite, and MI of the components, when the primary analysis is a Mann-Whitney-Wilcoxon test. Results: MI on the component level has good properties in terms of type I error control and power. We caution against the use of MI on the composite level with Predictive Mean Matching, which can lead to type I error inflation. Conclusions: Given the complex distributional characteristics of DAH, naive approaches, such as defining missingness on the composite level and directly imputing the composite with Predictive Mean Matching, can lead to type I error inflation. Imputing on the component level is recommended. Suggested future work includes imputation approaches that are compatible with more complex definitions of DAH, as well as recommendations for sensitivity analyses to the Missing at Random assumption.
Study · Preprint · Wiki · Moderate
Testing for Serial Independence via Auto Hilbert-Schmidt Independence Criterion
Muyi Li, Yuqing Xu, Zhou Zhou · 2026
We develop a Hilbert-Schmidt independence criterion (HSIC)-based framework for testing serial independence in strictly stationary time series. The proposed auto Hilbert-Schmidt independence criterion (AutoHSIC) measures dependence between an observation and its lagged counterpart, providing a kernel-based approach to detecting nonlinear serial dependence. The empirical AutoHSIC statistic is a lagged U-statistic constructed from overlapping observations, and hence inherits temporal dependence even under the i.i.d. null. Its asymptotic analysis therefore differs from standard i.i.d. HSIC theory and must account for degeneracy under the null. We establish the limiting behaviour of the resulting single-lag and portmanteau tests under the null and under fixed alternatives. Since the limiting null distribution is non-pivotal, we develop a wild bootstrap procedure for critical value approximation and prove its asymptotic validity. The framework is further extended to residual-based model diagnostics, where parameter estimation affects the null distribution. Simulations and empirical applications illustrate its ability to detect nonlinear serial dependence in multivariate, functional and matrix time series.
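The idea of measuring dependence between an observation and its lagged counterpart can be sketched with the standard biased (V-statistic) HSIC estimator applied to the pairs (x_t, x_{t+lag}). This is an illustration only, not the paper's AutoHSIC U-statistic and not its wild bootstrap: the Gaussian kernel, the bandwidth of 1, the function names, and the simulated threshold autoregression are all assumptions made here.

```python
import numpy as np

def gaussian_gram(v, bandwidth=1.0):
    """Gaussian kernel Gram matrix for a one-dimensional sample."""
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def lagged_hsic(x, lag=1, bandwidth=1.0):
    """Biased V-statistic HSIC between x_t and x_{t+lag}: trace(K H L H) / m^2."""
    x = np.asarray(x, dtype=float)
    if lag < 1 or lag >= x.size:
        raise ValueError("lag must satisfy 1 <= lag < len(x)")
    a, b = x[:-lag], x[lag:]                 # overlapping lagged pairs
    m = a.size
    K = gaussian_gram(a, bandwidth)
    L = gaussian_gram(b, bandwidth)
    H = np.eye(m) - np.ones((m, m)) / m      # centering matrix
    return np.trace(K @ H @ L @ H) / m ** 2

# Hypothetical series (assumption): white noise versus a nonlinear autoregression.
rng = np.random.default_rng(1)
n = 400
iid_series = rng.normal(size=n)
noise = rng.normal(size=n)
tar_series = np.zeros(n)
for i in range(1, n):
    tar_series[i] = 0.8 * abs(tar_series[i - 1]) + noise[i]

hsic_iid = lagged_hsic(iid_series, lag=1)
hsic_tar = lagged_hsic(tar_series, lag=1)
```

The statistic is close to zero for the white-noise series and clearly larger for the dependent one. Turning it into a test would still require a critical value, which the abstract obtains by a wild bootstrap not reproduced here.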
Study · Preprint · Wiki · Moderate
Compensator-Based Inference for Signal Detection Under Unknown Background
Aritra Banerjee, Sara Algeri · 2026
The problem of detecting new signals in the presence of an unknown background is ubiquitous in scientific discoveries and is especially prominent in the physical sciences. Most solutions proposed thus far to address the problem focus on estimating the background distribution and using that estimate to infer the signal. By studying the geometry of the problem, this article demonstrates that estimating the background distribution is somewhat unnecessary for inferring the signal intensity. Instead, it suffices to estimate a single parameter, referred to as the compensator, to account for the incomplete knowledge of the background, substantially simplifying the problem's complexity and enabling proper uncertainty propagation. Such a compensator is shown to govern the conservativeness of the inference, both in the proposed setup and in likelihood-based approaches.
Observational · Preprint · Wiki · Moderate
Targeted maximum likelihood estimation of vaccine effectiveness and immune correlates in test-negative design studies with missing data
Leah I. B. Andrews, Lars van der Laan, Peter B. Gilbert · 2026
The test-negative design (TND) is a resource-efficient observational study design that can assess vaccine effectiveness and exposure-proximal immune correlates of disease. The TND enrolls symptomatic individuals seeking diagnostic testing and compares case status by an exposure variable, such as vaccination status or immune marker level, that is measured at testing. While the TND reduces confounding by healthcare-seeking behavior, other sources of confounding may remain. TND studies may also have missing data in the exposure variable due to incomplete records or two-phase sampling designs. We present a targeted maximum likelihood estimation approach involving a semiparametric logistic regression model that targets a causal conditional risk ratio of symptomatic disease in the healthcare-seeking population. Under causal and missing at random assumptions, our method produces an efficient, asymptotically linear estimator that provides flexible, data-driven confounding control and valid causal inference when analyzing TND studies with missing exposure variable data. We evaluate our method's finite sample properties using plasmode simulations of a two-phase TND immune correlates study. We also apply our method to assess COVID-19 vaccine effectiveness and antibody marker correlates of COVID-19 from TND study cohorts derived from the Moderna Coronavirus Efficacy phase 3 trial.
Study · Preprint · Wiki · Moderate
Quantile-Based Effectiveness Persistence Function: A Tail-Focused Metric with Theory, Estimation, and Application to Biosimilar Evaluation
Sankaran P. G., Prasanth V. P., Midhu N. N · 2026
In clinical studies, persistence, which measures the duration of time a patient continues to take a prescribed medication without discontinuation, is increasingly recognized as a critical indicator of adherence to medication. Adherence encompasses not only whether a patient takes their medication as prescribed but also the consistency and duration with which they do so. Among the various metrics used to evaluate adherence, persistence stands out as a particularly robust measure because it provides a temporal dimension, reflecting the sustained commitment of patients to their therapeutic regimens. This focus on persistence offers unique insights into adherence-related quality and performance, shedding light on the challenges and opportunities to optimize long-term medication use. The comparison of upper-tail clinical performance, which measures the extent to which very large responses persist among top responders, is often more decisive in therapy evaluation than conventional summaries. In this paper, we introduce the quantile-based effectiveness persistence function defined as the ratio between the tail mean and the quantile function. The notion parallels expected shortfall in risk theory and is tailored to detect clinically meaningful deviations in the upper tail. We establish key properties and show that the function is equivalent to the first L-moment of the scaled tail, yielding robust inference tools. We derive a simple nonparametric estimator of the function and develop a bootstrap-calibrated two-sample (upper-tail) equivalence test. Simulation studies and real-data analysis illustrate that the proposed measure captures clinically relevant tail persistence that complements median and mean-based summaries.
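A plug-in version of a "tail mean over quantile" ratio can be sketched as follows. The abstract does not give the exact formula, so this is an assumption rather than the authors' estimator: the tail mean is taken as the average of observations at or above the empirical u-quantile, responses are assumed positive, and the function name and example data are hypothetical.

```python
import numpy as np

def tail_persistence(x, u):
    """Ratio of the upper-tail mean (values >= u-quantile) to the u-quantile.

    Assumed reading of 'tail mean / quantile function'; requires a positive quantile.
    """
    x = np.asarray(x, dtype=float)
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    q = np.quantile(x, u)          # empirical quantile (NumPy's default interpolation)
    if q <= 0:
        raise ValueError("the u-quantile must be positive for the ratio to be meaningful")
    tail = x[x >= q]               # observations in the upper tail
    return tail.mean() / q

# Hypothetical response scores 1, 2, ..., 100.
scores = np.arange(1, 101)
ratio = tail_persistence(scores, 0.9)
```

For positive data the ratio is at least 1, and larger values indicate that responses beyond the quantile extend further above it. Here the 0.9-quantile is 90.1 and the tail mean is 95.5, so the ratio is about 1.06.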