Enhancing automated electrocardiogram (ECG) diagnosis through multimodal pre-training with text reports
DOI:
https://doi.org/10.59681/2175-4411.v16.iEspecial.2024.1368
Keywords:
Machine Learning, Electrocardiography, Cardiology
Abstract
Objective: Heart disease is the leading cause of death worldwide, and the electrocardiogram (ECG) is the main tool for assessing cardiac activity. Automated, remote ECG diagnosis can support the health system with early and accurate cardiac assessments, especially in peripheral regions and rural areas. Automatic ECG classification has been widely researched, but building accurate models for such a broad spectrum remains a challenge. Method: This study improves the performance of deep learning ECG classification models by using a multimodal pre-training stage with the medical report. Results: Our approach improves on the state-of-the-art model and achieves an average F1 score of 0.755 across six categories using the full dataset, which is a relevant improvement for a relatively large unlabeled corpus. Conclusion: The results demonstrate the potential of text-based pre-training to improve automated cardiac assessment.
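The average F1 score over six categories reported in the abstract can be illustrated with a small sketch. It assumes the average is an unweighted (macro) mean of per-category F1 scores, which the abstract does not state explicitly. The function names `f1_score` and `macro_f1` and the toy labels are hypothetical, chosen only for illustration; they are not the authors' code or data.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for one category: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention (an assumption): F1 is 0.0 when a category has no
    # true or predicted positives at all.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of per-category F1 scores.

    Each row is one exam; each column is one 0/1 category label.
    """
    n_categories = len(y_true[0])
    scores = [
        f1_score([row[k] for row in y_true], [row[k] for row in y_pred])
        for k in range(n_categories)
    ]
    return sum(scores) / n_categories


# Toy example: 4 exams, 6 categories (made-up labels, not study data).
toy_true = [
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0, 1],
]
toy_pred = [
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0, 1],
]

print(round(macro_f1(toy_true, toy_pred), 3))
```

A macro average weights every category equally, so a rare category with poor F1 lowers the score as much as a common one. A support-weighted average would behave differently.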
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Submission of a paper to Journal of Health Informatics is understood to imply that it is not being considered for publication elsewhere, and that the authors' permission to publish their article in this Journal implies the exclusive authorization of the publishers to deal with all issues concerning the copyright therein. Upon submission of an article, authors will be asked to sign a Copyright Notice. Acceptance of the agreement will ensure the widest possible dissemination of information. An e-mail will be sent to the corresponding author confirming receipt of the manuscript and acceptance of the agreement.