Multimodal machine learning models enhance outcome prediction in intrahepatic cholangiocarcinoma
Abstract
Background
Intrahepatic cholangiocarcinoma (iCCA) is a highly heterogeneous malignancy with limited treatment options, particularly for unresectable cases. Although multimodal data are available, integrating these diverse data types effectively to inform treatment decisions remains challenging.
Methods
Nested cross-validation was applied to two cohorts of iCCA patients: resected (N = 75) and unresected (N = 98). Multimodal data, including clinical, histological, radiological, and targeted sequencing data, were used to develop and evaluate machine learning models for predicting overall survival (OS) and progression/recurrence-free survival (PFS/RFS). The most predictive features were identified through multivariate analysis and Shapley values.
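As an illustration of the nested cross-validation scheme named above, the following minimal sketch generates the index splits only. The fold counts (5 outer, 3 inner), the seeds, and the helper names `k_fold_indices` and `nested_cv_splits` are assumptions for illustration and are not taken from the study. The inner folds would be used for model selection and the outer held-out fold for performance estimation.

```python
import random


def k_fold_indices(indices, k, seed=0):
    """Yield (train, held_out) index lists for a shuffled k-fold split."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


def nested_cv_splits(n_patients, outer_k=5, inner_k=3, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    inner_splits partitions outer_train only, so the outer test patients
    are never seen during model selection.
    """
    for outer_train, outer_test in k_fold_indices(range(n_patients), outer_k, seed):
        inner_splits = list(k_fold_indices(outer_train, inner_k, seed + 1))
        yield outer_train, outer_test, inner_splits


if __name__ == "__main__":
    # 75 matches the resected cohort size; the fold counts are assumed.
    for outer_train, outer_test, inner_splits in nested_cv_splits(75):
        print(len(outer_train), len(outer_test), len(inner_splits))
```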
Results
The machine learning models predicted OS and PFS/RFS on held-out patients within each cohort, achieving average concordance indices of up to 0.70 ± 0.13. In inter-cohort validation, the models generalized across cohorts, with concordance indices reaching 0.61 (95% CI: 0.53–0.68). In resected patients, CA-19-9 (Shapley = 0.27), ARID1A alteration (p = 0.057), tumor sphericity (Shapley = 0.09) and histological tumor tiles (p < 0.001) were the top predictors of worse OS. In unresected patients, male gender (Shapley = 0.23) and KRAS (p = 0.012) negatively correlated with OS, whereas gray non-uniformity (Shapley = 0.11) was associated with improved outcomes in both cohorts.
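For readers unfamiliar with the concordance index reported above, the sketch below computes Harrell's C-index in plain Python. It is an illustrative implementation, not the study's code: the function name `concordance_index`, the half-credit handling of tied risk scores, and the toy data are assumptions. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times:       follow-up time per patient
    events:      1 if the event was observed, 0 if censored
    risk_scores: model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data (hypothetical): the third patient is censored.
    times = [5, 10, 15, 20]
    events = [1, 1, 0, 1]
    risk_scores = [0.9, 0.3, 0.5, 0.1]
    print(concordance_index(times, events, risk_scores))
```

In the toy data, five pairs are comparable and four of them are ranked correctly, so the index is 0.8.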
Conclusions
Machine learning models utilizing multimodal data can predict survival and recurrence in iCCA patients. These predictive models, along with their interpretable features, hold potential for enhancing our understanding of the disease and guiding treatment selection.