Debray TP, Koffijberg H, Nieboer D, Vergouwe Y, Steyerberg EW, Moons KG
Published clinical prediction models are often ignored during the development of novel prediction models despite similarities in populations and intended usage. The plethora of prediction models that arise from this practice may still perform poorly when applied in other populations. Incorporating prior evidence might improve the accuracy of prediction models and make them potentially better generalizable. Unfortunately, aggregation of prediction models is not straightforward, and methods to combine differently specified models are currently lacking. We propose two approaches for aggregating previously published prediction models when a validation dataset is available: model averaging and stacked regressions. These approaches yield user-friendly stand-alone models that are adjusted for the new validation data. Both approaches rely on weighting to account for model performance and between-study heterogeneity but adopt a different rationale (averaging versus combination) to combine the models. We illustrate their implementation in a clinical example and compare them with established methods for prediction modeling in a series of simulation studies. Results from the clinical datasets and simulation studies demonstrate that aggregation yields prediction models with better discrimination and calibration in a vast majority of scenarios, and results in equivalent performance (compared to developing a novel model from scratch) when validation datasets are relatively large. In conclusion, model aggregation is a promising strategy when several prediction models are available from the literature and a validation dataset is at hand. The aggregation methods do not require existing models to have similar predictors and can be applied when relatively few data are at hand.