
A Large-Scale Neutral Comparison Study of Survival Models on Low-Dimensional Data

MCML Authors

Lukas Burk

* Former Member

→ Group Bernd Bischl
Statistical Learning and Data Science

Bernd Bischl

Prof. Dr.

Director

Statistical Learning and Data Science

Andreas Bender

Dr.

→ Group Bernd Bischl
Statistical Learning and Data Science

Abstract

Motivation: This work presents the first large-scale neutral benchmark experiment focused on single-event, right-censored, low-dimensional survival data. Benchmark experiments are essential in methodological research to scientifically compare new and existing model classes through proper empirical evaluation. Existing benchmarks in the survival literature are smaller in scale regarding the number of used datasets and extent of empirical evaluation. They often lack appropriate tuning or evaluation procedures, while other comparison studies focus on qualitative reviews rather than quantitative comparisons. This comprehensive study aims to fill the gap by neutrally evaluating a broad range of methods and providing generalizable guidelines for practitioners.

Results: We benchmark 21 models, ranging from classical statistical approaches to many common machine learning methods, on 34 publicly available datasets. The benchmark tunes models using both a discrimination measure (Harrell’s C-index) and a scoring rule (Integrated Survival Brier Score), and evaluates them across six metrics covering discrimination, calibration, and overall predictive performance. Despite superior average ranks in overall predictive performance from individual learners like oblique random survival forests and likelihood-based boosting, and better discrimination rankings from multiple boosting- and tree-based methods as well as parametric survival models, no method statistically significantly outperforms the commonly used Cox proportional hazards model for either tuning measure. We conclude that while the Cox proportional hazards model remains a robust default for low-dimensional, right-censored survival data, more flexible methods may be preferable for specific dataset characteristics.

Availability and Implementation: All code, data, and results are publicly available on GitHub (https://github.com/slds-lmu/paper_2023_survival_benchmark) and archived on Zenodo (https://doi.org/10.5281/zenodo.19075310).
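For readers unfamiliar with the discrimination measure named in the abstract, the following is a minimal, illustrative sketch of Harrell's C-index for right-censored data. It is not the benchmark's implementation (which lives in the linked repository); the function name `harrell_c_index` and the toy data are assumptions made here for illustration, and tied event times are simply treated as non-comparable.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Illustrative Harrell's C: share of comparable pairs ranked concordantly.

    A pair (i, j) is comparable when subject i had an observed event and
    its time is strictly earlier than subject j's time. The pair is
    concordant when the earlier-failing subject i has the higher risk
    score; tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored subject cannot be the earlier-failing member of a pair.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one is censored.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [True, True, False, True]
toy_risks = [0.9, 0.7, 0.5, 0.1]  # higher score = higher predicted risk
c_index = harrell_c_index(toy_times, toy_events, toy_risks)
```

A value of 1 means every comparable pair is ordered correctly, 0.5 corresponds to random ranking, and 0 to a perfectly reversed ranking. The quadratic loop is chosen for readability, not speed.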

article BZB+26

Bioinformatics

btag186. Apr. 2026.

Authors

L. Burk • J. Zobolas • B. Bischl • A. Bender • M. N. Wright • R. Sonabend

Links

DOI GitHub

In Collaboration

OSPO Now

Research Area

A1 | Statistical Foundations & Explainability

BibTeXKey: BZB+26
