On the Aggregation of Published Prognostic Scores for Causal Inference in Observational Studies
Nguyen TL, Collins GS, Pellegrini F, Moons KGM, Debray TPA
As real-world evidence on drug efficacy involves non-randomised studies, statistical methods that adjust for confounding are needed. In this context, prognostic score (PGS) analysis has recently been proposed as a method for causal inference. It aims to restore balance across the different treatment groups by identifying subjects with a similar prognosis for a given reference exposure ('control'). This requires the development of a multivariable prognostic model in the control arm of the study sample, which is then extrapolated to the different treatment arms. Unfortunately, large cohorts for developing prognostic models are not always available. Prognostic models are therefore subject to a dilemma between overfitting and parsimony, the latter being prone to violating the assumption of no unmeasured confounders when important covariates are ignored. Although it is possible to limit overfitting by using penalization strategies, an alternative approach is to adopt evidence synthesis. Aggregating previously published prognostic models may improve the generalizability of PGS, while taking account of a large set of covariates - even when limited individual participant data are available. In this article, we extend a method for prediction model aggregation to PGS analysis in non-randomised studies. We conduct extensive simulations to assess the validity of model aggregation, compared with other methods of PGS analysis for estimating marginal treatment effects. We show that aggregating existing PGS into a 'meta-score' is robust to misspecification, even when elementary scores wrongfully omit confounders or focus on different outcomes. We illustrate our methods in a setting of treatments for asthma.
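The idea summarised above, combining published scores into a 'meta-score' using only control-arm data and then adjusting the treatment comparison for that score, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the simulated data, the two hypothetical "published" scores (each omitting at least one confounder), the use of stacked linear regression as a simple stand-in for the aggregation step, and covariate adjustment on the score as the way of using it. It is not the estimator evaluated in the article, whose details are not given in this abstract.

```python
import numpy as np


def simulate(n=5000, tau=1.0, seed=0):
    """Simulated observational data (hypothetical): X0 and X1 confound treatment."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    # Treatment assignment depends on X0 and X1, so a naive comparison is biased.
    p_treat = 1.0 / (1.0 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))
    treated = rng.random(n) < p_treat
    # Outcome under control is linear in X; treatment adds a constant effect tau.
    y = X @ np.array([1.0, 1.0, 0.5]) + tau * treated + rng.normal(size=n)
    return X, y, treated


def published_scores(X):
    """Two hypothetical 'published' scores, each ignoring part of the covariates."""
    score_a = 0.8 * X[:, 0] + 0.4 * X[:, 2]  # omits the confounder X1
    score_b = 1.2 * X[:, 1]                  # omits X0 and X2
    return score_a, score_b


def aggregate_scores(scores, y, treated):
    """Stacking stand-in: regress control-arm outcomes on the published scores,
    then apply the fitted weights to every subject to obtain a meta-score."""
    S = np.column_stack([np.ones(len(y))] + list(scores))
    weights, *_ = np.linalg.lstsq(S[~treated], y[~treated], rcond=None)
    return S @ weights


def adjusted_effect(y, treated, score):
    """Treatment coefficient from a linear regression of y on treatment and score.
    With a linear outcome and a constant effect, this coincides with the
    marginal effect; that equivalence is an assumption of this toy setting."""
    design = np.column_stack([np.ones(len(y)), treated.astype(float), score])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


if __name__ == "__main__":
    X, y, treated = simulate()
    score_a, score_b = published_scores(X)
    meta = aggregate_scores([score_a, score_b], y, treated)
    print("naive difference in means:", y[treated].mean() - y[~treated].mean())
    print("adjusted for score A only:", adjusted_effect(y, treated, score_a))
    print("adjusted for meta-score  :", adjusted_effect(y, treated, meta))
```

In this toy setting, neither hypothetical score captures all confounders on its own, but their weighted combination does, which is why adjusting for the meta-score is expected to land closer to the simulated effect than adjusting for score A alone.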
This article is distributed under the terms of the Creative Commons Attribution 4.0 Non Commercial International License, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial.
Cite as: Nguyen TL, Collins GS, Pellegrini F, Moons KGM, Debray TPA. On the Aggregation of Published Prognostic Scores for Causal Inference in Observational Studies. Stat Med 2020, volume 0. DOI: 10.1002/sim.8489.