Machine Learning Algorithms for Prediction of Breast Cancer Survival

Authors

  • Pablo Deoclecia dos Santos Universidade Federal do ABC
  • Erika Yahata Universidade Federal do ABC
  • Talita Santos Piheiro Universidade Federal do ABC
  • Fellipe Soares de Oliveira Universidade Federal do ABC
  • Priscyla Waleska Simões Universidade Federal do ABC

DOI:

https://doi.org/10.59681/2175-4411.v15.iEspecial.2023.1091

Keywords:

Survival Analysis, Machine Learning, Breast cancer

Abstract

Objective: This paper presents a comparative analysis of Machine Learning algorithms applied to breast cancer survival prediction. Methods: This descriptive study used data from 1,570 patients with stage I-III breast cancer. Because the dataset was imbalanced, the Synthetic Minority Oversampling Technique (SMOTE) was applied. Four algorithms were evaluated (Naive Bayes, Random Forest, Multilayer Perceptron, and AdaBoost), with cross-validation used to train and evaluate the models. Results: The Random Forest model achieved the highest accuracy (96.2%; 95% CI: 95.5%-96.9%) and specificity (97.4%; 95% CI: 96.6%-98.2%), while the AdaBoost model achieved the highest sensitivity (95.3%; 95% CI: 94.3%-96.4%). Conclusion: Among the models evaluated in this study, the Random Forest model showed the best overall evaluation measures for predicting breast cancer survival.
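The abstract reports accuracy, sensitivity, and specificity, each with a 95% confidence interval. As a rough illustration of how such figures can be derived from a confusion matrix, the Python sketch below uses the normal-approximation (Wald) interval. This is an assumption: the abstract does not state which interval method the authors used, nor which outcome was treated as the positive class. The function names and the counts are hypothetical and are not the study's data.

```python
import math


def proportion_with_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) for a proportion.

    Uses the normal-approximation (Wald) interval, clipped to [0, 1].
    z=1.96 corresponds to a 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def evaluate(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": proportion_with_ci(tp + tn, tp + fn + tn + fp),
        "sensitivity": proportion_with_ci(tp, tp + fn),  # true positive rate
        "specificity": proportion_with_ci(tn, tn + fp),  # true negative rate
    }


# Hypothetical counts chosen only for illustration (not the study's data).
results = evaluate(tp=90, fn=10, tn=95, fp=5)
for name, (est, low, high) in results.items():
    print(f"{name}: {est:.1%} (95% CI: {low:.1%}-{high:.1%})")
```

The Wald interval is the simplest choice and behaves poorly for proportions close to 0 or 1 or for small samples; a Wilson or exact interval may be preferable in those cases.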

Author Biographies

Pablo Deoclecia dos Santos, Universidade Federal do ABC

Programa de Pós-Graduação em Engenharia Biomédica, Centro de Engenharia, Modelagem e Ciências Sociais Aplicadas-CECS, Universidade Federal do ABC-UFABC, São Bernardo do Campo (SP), Brasil.

Erika Yahata, Universidade Federal do ABC

Programa de Pós-Graduação em Engenharia da Informação, Centro de Engenharia, Modelagem e Ciências Sociais Aplicadas-CECS, Universidade Federal do ABC-UFABC, Santo André (SP), Brasil. Curso de Engenharia Biomédica, Centro de Engenharia, Modelagem e Ciências Sociais Aplicadas-CECS, Universidade Federal do ABC-UFABC, São Bernardo do Campo (SP), Brasil.

Talita Santos Piheiro, Universidade Federal do ABC

Programa de Pós-Graduação em Engenharia Biomédica, Centro de Engenharia, Modelagem e Ciências Sociais Aplicadas-CECS, Universidade Federal do ABC-UFABC, São Bernardo do Campo (SP), Brasil.

Fellipe Soares de Oliveira, Universidade Federal do ABC

Programa de Pós-Graduação em Engenharia Biomédica, Centro de Engenharia, Modelagem e Ciências Sociais Aplicadas-CECS, Universidade Federal do ABC-UFABC, São Bernardo do Campo (SP), Brasil.

Priscyla Waleska Simões, Universidade Federal do ABC

Programa de Pós-Graduação em Engenharia Biomédica, Centro de Engenharia, Modelagem e Ciências Sociais Aplicadas-CECS, Universidade Federal do ABC-UFABC, São Bernardo do Campo (SP), Brasil. Programa de Pós-Graduação em Engenharia da Informação, Centro de Engenharia, Modelagem e Ciências Sociais Aplicadas-CECS, Universidade Federal do ABC-UFABC, Santo André (SP), Brasil. Curso de Engenharia Biomédica, Centro de Engenharia, Modelagem e Ciências Sociais Aplicadas-CECS, Universidade Federal do ABC-UFABC, São Bernardo do Campo (SP), Brasil.

Published

2023-07-20

How to Cite

Santos, P. D. dos, Yahata, E., Piheiro, T. S., Oliveira, F. S. de, & Simões, P. W. (2023). Machine Learning Algorithms for Prediction of Breast Cancer Survival. Journal of Health Informatics, 15(Especial). https://doi.org/10.59681/2175-4411.v15.iEspecial.2023.1091
