The shape of cancer relapse: Topological data analysis predicts recurrence in paediatric acute lymphoblastic leukaemia
S. Chulián, B. J. Stolz, A. Martínez-Rubio, C. Blázquez-Goñi, J. F. Rodríguez, T. Caballero, A. Molinos, M. Ramírez-Orellana, A. Castillo, J. L. Fuster, A. Minguela, M. V. Martínez, M. Rosa, V. M. Pérez-García, H. Byrne
PLOS Computational Biology 19(8) e1011329 (2023)
Abstract
Acute Lymphoblastic Leukaemia (ALL) is the most frequent paediatric cancer. Modern therapies have improved survival rates, but approximately 15-20% of patients relapse. At present, a patient’s risk of relapse is assessed by projecting high-dimensional flow cytometry data onto a subset of biomarkers and manually estimating the shape of this reduced data. Here, we apply methods from topological data analysis (TDA), which quantify shape in data via features such as connected components and loops, to pre-treatment ALL datasets with known outcomes. We combine these fully unsupervised analyses with machine learning to identify features in the pre-treatment data that are prognostic for risk of relapse. We find significant topological differences between relapsing and non-relapsing patients and confirm the predictive power of CD10, CD20, CD38, and CD45. Further, we are able to use the TDA descriptors to predict which patients relapsed. We propose three prognostic pipelines that readily extend to other haematological malignancies.
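To make the notion of "connected components" as a shape descriptor concrete, the following is a minimal sketch of zero-dimensional persistence on a point cloud. It is not the pipeline used in this work: the function name `h0_persistence` and the toy points are hypothetical, and the points merely stand in for cells measured on two markers. Every point starts as its own component at scale 0, and a component "dies" at the distance at which it merges with another. Loops (dimension one) require a dedicated persistent homology library and are not shown.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return the scales at which connected components merge.

    Every component is born at scale 0; the returned list holds the
    n - 1 finite death scales in increasing order. One component
    never dies and is omitted.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise Euclidean distances, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for distance, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge at this scale, so one of them dies here.
            parent[root_i] = root_j
            deaths.append(distance)
    return deaths


# Hypothetical toy data: two well-separated clusters in two dimensions.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
deaths = h0_persistence(toy_points)
print(deaths)
```

In this toy example the three small death scales reflect merges within clusters, while the single large one reflects the gap between the two clusters. A long-lived component of this kind is the sort of feature that a topological summary can record without any manual inspection of the data.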