Gene clustering selection for survival prediction in breast cancer patients


  • Khennedy Bacule dos Santos, Instituto de Ciências Matemáticas e de Computação (ICMC), USP - São Carlos
  • Israel Tojal da Silva, A.C.Camargo Cancer Center
  • Mariana Cúri, Instituto de Ciências Matemáticas e de Computação (ICMC), USP - São Carlos



Machine learning, Breast cancer, Gene expression


Risk stratification based on molecular data to predict cancer progression or outcome is important for supporting clinical decision making in oncology. In this work, we use the Cox model and K-means to define a prognostic gene expression-based signature. Our approach reaches a C-index of 0.8341, outperforming a Cox model fitted on clinical data alone (0.6348). Overall, this suggests that the genetic signature found is related to the evolution of the patient's clinical condition and captures molecular features related to prognosis in breast cancer.
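
For readers unfamiliar with the C-index reported above, the following is a minimal sketch of how a concordance index can be computed from survival times, event indicators, and predicted risk scores. It assumes Harrell's estimator, which the abstract does not specify, and the function name `concordance_index` and the toy data are hypothetical illustrations, not the authors' code or data.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index (assumed estimator): the fraction of comparable
    pairs in which the subject with the shorter survival time was also
    assigned the higher predicted risk. Ties in risk count as 0.5."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is only comparable when the earlier time is an observed event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up times, event flags (1 = event, 0 = censored),
# and risk scores such as the linear predictor of a fitted Cox model.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]

toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
print(f"C-index on toy data: {toy_c_index:.4f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.8341 and 0.6348 are compared.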

