Multiple instance learning with spatial transcriptomics for interpretable patient-level predictions: application in glioblastoma

Abstract

Accurate prediction of patient outcomes remains a major challenge in oncology. Recent machine learning (ML) approaches often rely either on bulk omics, which lack spatial resolution, or on histology-based multiple instance learning (MIL). Spatial transcriptomics (SpT), by contrast, provides a unique opportunity to capture both molecular content and tissue architecture. However, no generalizable ML framework has yet been established to exploit SpT for patient-level outcome prediction.

We present SpaMIL, a flexible and interpretable MIL framework designed for SpT, with a distillation strategy that enables deployment on hematoxylin and eosin (H&E) slides alone. We evaluated the framework on survival prediction for glioblastoma (GBM) patients, a clinically compelling setting given the disease's aggressiveness (median survival of only 15 months) and the lack of prognostic clinical variables. We analyzed 76 GBM cases from the MOSAIC dataset: 43 with matched SpT, H&E, single-cell RNA-seq (scRNA-seq), bulk RNA-seq, and clinical variables, and 33 with H&E only for external validation. We developed two main architectures: abMIL, tailored to SpT’s spatial molecular structure, and MabMIL, which distills SpT-derived representations into H&E. Model interpretability was achieved through a Shapley-based framework linking prognostic predictions to cell-type compositions via SpT deconvolution.
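To make the MIL setting concrete, the following is a minimal sketch of attention-based pooling, in which a patient is treated as a bag of SpT spots and a learned attention score decides how much each spot contributes to the patient-level representation. It assumes that abMIL builds on standard attention-based MIL pooling, which the abstract does not state; the function name `attention_mil_pool`, the parameters `V` and `w`, and the random toy data are all hypothetical and are not taken from the authors' implementation.

```python
import math
import random


def attention_mil_pool(instances, V, w):
    """Pool a bag of instance vectors into one vector using attention weights.

    instances: list of n vectors of length d (e.g. per-spot embeddings)
    V: d x h projection matrix given as a list of d rows (hypothetical parameter)
    w: scoring vector of length h (hypothetical parameter)
    """
    scores = []
    for x in instances:
        # Hidden representation: tanh(x @ V)
        hidden = [
            math.tanh(sum(x[i] * V[i][j] for i in range(len(x))))
            for j in range(len(w))
        ]
        # Scalar attention score for this instance: hidden @ w
        scores.append(sum(hj * wj for hj, wj in zip(hidden, w)))

    # Softmax over instances, shifted by the max for numerical stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Bag embedding: attention-weighted average of the instance vectors
    d = len(instances[0])
    bag = [sum(a * x[i] for a, x in zip(weights, instances)) for i in range(d)]
    return bag, weights


# Toy bag: 50 spots with 8 features each (random placeholders, not real data)
random.seed(0)
d, h, n_spots = 8, 4, 50
spots = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_spots)]
V = [[random.gauss(0, 0.1) for _ in range(h)] for _ in range(d)]
w = [random.gauss(0, 0.1) for _ in range(h)]

bag, weights = attention_mil_pool(spots, V, w)
```

In a trained model, `V` and `w` would be learned jointly with a survival head applied to `bag`, and the per-spot `weights` give a first, coarse view of which tissue regions drive a prediction.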

In benchmarking across the five GBM MOSAIC modalities, SpT-based abMIL achieved unprecedented prognostic accuracy (median concordance index (C-index): 0.72, standard deviation: 0.04), outperforming all other modalities, including established clinical predictors. PCA- and deconvolution-based SpT representations surpassed recent foundation models, suggesting the need for further research on SpT foundation models. Our interpretability analysis highlighted malignant and non-malignant cell subpopulations associated with favorable or poor prognosis, consistent with recent reports.

Finally, MabMIL maintained strong performance while enabling H&E-only deployment, with an improved concordance index over H&E-only baselines in both the internal (0.59 vs. 0.57) and external (0.62 vs. 0.55) cohorts.
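For readers unfamiliar with the metric reported above, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is a generic illustration, not the authors' evaluation code: the function name `concordance_index` and the four-patient toy data are hypothetical, and ties in survival time are simply treated as non-comparable.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times:  observed follow-up time per patient
    events: 1 if death was observed, 0 if the patient was censored
    risks:  predicted risk score per patient (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier patient of a pair must have an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier death received the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example (illustrative values only): the third patient is censored
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(times, events, risks)
print(c_index)  # 0.8: 4 of the 5 comparable pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.72, 0.59, and 0.62 should be read.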
