Unlocking the complete blood count as a risk stratification tool for breast cancer using machine learning

Authors

  • Daniella Castro Araújo, Huna Ltd.
  • Bruno Aragão Rocha, Grupo Fleury
  • Karina Braga Gomes, Universidade Federal de Minas Gerais
  • Daniel Noce da Silva, Huna Ltd.
  • Vinicius Moura Ribeiro, Huna Ltd.
  • Marco Aurelio Kohara, Huna Ltd.
  • Adriano Alonso Veloso, Universidade Federal de Minas Gerais
  • Flavia Helena da Silva, Grupo Fleury
  • Pedro Henrique Araújo de Souza, Instituto Nacional de Câncer
  • Ismael Dale Cotrim Guerreiro da Silva, Federal University of São Paulo

DOI:

https://doi.org/10.59681/2175-4411.v16.iEspecial.2024.1355

Keywords:

Blood Cell Count, Machine Learning, Breast Cancer

Abstract

Objective: To evaluate the efficacy of machine learning (ML) in using the complete blood count (CBC) for breast cancer risk assessment. Method: This retrospective study analyzed CBCs from 396,848 women aged 40 to 70. A total of 2,861 cases were identified (1,882 confirmed by biopsy and 979 by imaging), while 393,987 were controls (BI-RADS 1 or 2). Data were divided into modeling (training and validation) and testing sets based on diagnostic certainty. Results: The ridge regression model, incorporating the neutrophil-to-lymphocyte ratio, red blood cells, and age, achieved an AUC of 0.64. The study population was stratified into four risk groups: high, moderate, medium, and low, with relative ratios of 1.99, 1.32, 1.02, and 0.42, respectively. Conclusion: This ML model provides a cost-effective tool for personalized breast cancer screening, potentially improving early detection in resource-limited settings.
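The two quantities reported in the Results (an AUC and a relative ratio per risk group) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the scores, labels, and cutoffs below are toy values, the function names are hypothetical, and the relative ratio is assumed here to mean a group's case rate divided by the overall case rate, which the abstract does not state explicitly. With the abstract's counts, the overall case rate would be 2,861 / 396,848, roughly 0.72%.

```python
from collections import defaultdict


def nlr(neutrophils, lymphocytes):
    """Neutrophil-to-lymphocyte ratio from two counts in the same units."""
    return neutrophils / lymphocytes


def auc(scores, labels):
    """AUC as the chance that a random case outscores a random control.

    Ties count as half a win. Quadratic in sample size, so toy data only.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def assign_group(score, cutoffs):
    """Return the first group whose threshold the score reaches.

    `cutoffs` is a list of (threshold, name) pairs in descending order;
    scores below every threshold fall into "low". Thresholds are hypothetical.
    """
    for threshold, name in cutoffs:
        if score >= threshold:
            return name
    return "low"


def relative_ratios(scores, labels, cutoffs):
    """Case rate within each group divided by the overall case rate (assumed definition)."""
    overall = sum(labels) / len(labels)
    cases, totals = defaultdict(int), defaultdict(int)
    for s, y in zip(scores, labels):
        group = assign_group(s, cutoffs)
        totals[group] += 1
        cases[group] += y
    return {g: (cases[g] / totals[g]) / overall for g in totals}


# Toy model scores (higher means higher predicted risk) and outcomes (1 = case).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_cutoffs = [(0.75, "high"), (0.55, "moderate"), (0.35, "medium")]

print(auc(toy_scores, toy_labels))
print(relative_ratios(toy_scores, toy_labels, toy_cutoffs))
```

A ratio above 1 marks a group with more cases than the population average and a ratio below 1 marks a group with fewer, which is how the reported values of 1.99, 1.32, 1.02, and 0.42 would be read under this assumed definition.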

Author Biographies

Daniella Castro Araújo, Huna Ltd.

PhD, Founder & CTO, Huna Ltd., São Paulo, Brazil.

Bruno Aragão Rocha, Grupo Fleury

MD, Medical Innovation Coordinator, Grupo Fleury, São Paulo, Brazil.

Karina Braga Gomes, Universidade Federal de Minas Gerais

Prof. PhD, Department of Clinical and Toxicological Analysis, Faculty of Pharmacy, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte campus, Minas Gerais, Brazil.

Daniel Noce da Silva, Huna Ltd.

MSc, Huna Ltd., São Paulo, Brazil.

Vinicius Moura Ribeiro, Huna Ltd.

Founder & CEO, Huna Ltd., São Paulo, Brazil.

Marco Aurelio Kohara, Huna Ltd.

Founder & COO, Huna Ltd., São Paulo, Brazil.

Adriano Alonso Veloso, Universidade Federal de Minas Gerais

Prof. PhD, Department of Computer Science, Institute of Exact Sciences, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte campus, Minas Gerais, Brazil.

Flavia Helena da Silva, Grupo Fleury

PhD, Senior Manager, Analytics Intelligence, Grupo Fleury, São Paulo, Brazil.

Pedro Henrique Araújo de Souza, Instituto Nacional de Câncer

MSc, MD, Oncologist, Department of Oncology Clinical Research, Instituto Nacional de Câncer (INCA), Rio de Janeiro, Brazil.

Ismael Dale Cotrim Guerreiro da Silva, Federal University of São Paulo

Prof. PhD, MD, Department of Gynecology, Escola Paulista de Medicina, Federal University of São Paulo, São Paulo, Brazil.

Published

2024-11-19

How to Cite

Araújo, D. C., Rocha, B. A., Gomes, K. B., da Silva, D. N., Ribeiro, V. M., Kohara, M. A., … da Silva, I. D. C. G. (2024). Unlocking the complete blood count as a risk stratification tool for breast cancer using machine learning. Journal of Health Informatics, 16(Especial). https://doi.org/10.59681/2175-4411.v16.iEspecial.2024.1355
