Explainability in Machine Learning Predictive Models in Breast Cancer
DOI: https://doi.org/10.59681/2175-4411.v15.iEspecial.2023.1090
Keywords: Breast Cancer, Machine Learning, Artificial Intelligence
Abstract
Objective: Artificial Intelligence shows promise as decision support in breast cancer; however, many algorithms behave as black boxes, and making them explainable can contribute to their adoption in clinical practice. This study presents the explainability of Machine Learning predictive models in breast cancer. Methods: Two Machine Learning approaches were evaluated, the Multilayer Perceptron (MLP) and Extreme Gradient Boosting (XGBoost), on a sample of 164 women who underwent core biopsy between 2014 and 2015. Shapley Additive Explanations (SHAP) were used to explain the models. Results: Both predictive models achieved an accuracy of 98.0% (95% CI: 94.2%-100.0%), and BI-RADS® 5 on ultrasound was the most important attribute. Conclusion: The models showed high predictive capacity for breast cancer. In the MLP, ultrasound BI-RADS® categories 3 and 5 were the most important attributes; in the XGBoost model, age and palpable nodule were also among the most important, in addition to ultrasound.
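To illustrate what a Shapley additive explanation attributes to each feature, the sketch below computes exact Shapley values for a small, made-up risk score. It is not the study's MLP or XGBoost model: the function `toy_model`, its coefficients, the three binary feature names, and the all-zeros baseline are assumptions chosen only for illustration. Practical SHAP implementations typically approximate these values against a background dataset rather than enumerating every coalition.

```python
from itertools import combinations
from math import factorial

# Hypothetical binary features, loosely named after attributes in the abstract.
FEATURES = ["us_birads_5", "age_over_50", "palpable_nodule"]


def toy_model(x):
    """Made-up risk score; NOT the study's MLP or XGBoost model."""
    score = 0.05
    score += 0.60 * x["us_birads_5"]
    score += 0.10 * x["age_over_50"]
    score += 0.10 * x["palpable_nodule"]
    # Interaction term, so the model is not purely additive.
    score += 0.10 * x["us_birads_5"] * x["palpable_nodule"]
    return score


def value(coalition, instance, baseline):
    """Model output when only the features in `coalition` take the instance's values."""
    x = {f: (instance[f] if f in coalition else baseline[f]) for f in FEATURES}
    return toy_model(x)


def shapley_values(instance, baseline):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = value(set(subset) | {f}, instance, baseline)
                without_f = value(set(subset), instance, baseline)
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


instance = {"us_birads_5": 1, "age_over_50": 1, "palpable_nodule": 1}
baseline = {"us_birads_5": 0, "age_over_50": 0, "palpable_nodule": 0}
phi = shapley_values(instance, baseline)

for name, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the model output for the instance and for the baseline, which is the additive property that allows features to be ranked by importance in the way the abstract reports.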
License
Copyright (c) 2023 Erika Yahata, Erik Paul Winnikow, Ricardo Suyama, Priscyla Waleska Simões
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Submission of a paper to the Journal of Health Informatics is understood to imply that it is not being considered for publication elsewhere, and that the author(s)' permission to publish the article(s) in this Journal implies the exclusive authorization of the publishers to deal with all issues concerning the copyright therein. Upon the submission of an article, authors will be asked to sign a Copyright Notice. Acceptance of the agreement will ensure the widest possible dissemination of information. An e-mail will be sent to the corresponding author confirming receipt of the manuscript and acceptance of the agreement.