A Multilabel Approach to Portuguese Clinical Named Entity Recognition

Authors

  • João Vitor Andrioli de Souza SBIS
  • Elisa Terumi Rubel Schneider
  • Josilaine Oliveira Cezar
  • Lucas Emanuel Silva e Oliveira
  • Yohan Bonescki Gumiel
  • Emerson Cabrera Paraiso
  • Douglas Teodoro
  • Claudia Maria Cabral Moro Barra

Keywords:

Clinical Named Entity Recognition, Label Powerset, BERT

Abstract

Objectives: Clinical Named Entity Recognition is a critical Natural Language Processing task, as it can support biomedical research and healthcare systems. While most extracted clinical entities are based on single-label concepts, entities that belong to more than one semantic category simultaneously are very common in the clinical domain. This work proposes BERT-based models to support multilabel clinical named entity recognition in the Portuguese language. Methods: For the experiment, we applied the Label Powerset method to the multilabel corpus SemClinBr. Results: We compared our results with a Conditional Random Fields baseline, reaching +2.1 in precision, +11.2 in recall, and +7.4 in F1 with a clinical-biomedical BERT model (BioBERTpt). Conclusion: We achieved higher results than the baseline for both exact and partial metrics, contributing to the multilabel semantic processing of clinical narratives in Portuguese.
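
The Label Powerset method named in the abstract turns a multilabel problem into a single-label one by treating every distinct combination of labels as its own class. The sketch below is a minimal illustration of that general transformation in Python, not the authors' implementation; the function names and the tag names ("Disorder", "Finding") are hypothetical and are not taken from SemClinBr.

```python
from typing import Dict, FrozenSet, List


def fit_label_powerset(token_labels: List[List[str]]) -> Dict[FrozenSet[str], int]:
    """Assign one class id to each distinct label set seen in the data."""
    class_of: Dict[FrozenSet[str], int] = {}
    for labels in token_labels:
        key = frozenset(labels)  # label order does not matter
        if key not in class_of:
            class_of[key] = len(class_of)
    return class_of


def encode(token_labels: List[List[str]],
           class_of: Dict[FrozenSet[str], int]) -> List[int]:
    """Replace each token's label set with its single class id."""
    return [class_of[frozenset(labels)] for labels in token_labels]


def decode(class_ids: List[int],
           class_of: Dict[FrozenSet[str], int]) -> List[List[str]]:
    """Map predicted class ids back to (sorted) label sets."""
    inverse = {class_id: key for key, class_id in class_of.items()}
    return [sorted(inverse[class_id]) for class_id in class_ids]


# Hypothetical per-token annotations: an empty list means "no entity".
token_labels = [
    ["Disorder"],
    [],
    ["Disorder", "Finding"],
    ["Finding", "Disorder"],  # same set as above, different order
    [],
]

class_of = fit_label_powerset(token_labels)
class_ids = encode(token_labels, class_of)
restored = decode(class_ids, class_of)
```

Once encoded this way, any single-label token classifier can be trained on the class ids, and its predictions can be decoded back to label sets. The trade-off is that the number of classes grows with the number of distinct label combinations, and a combination never seen in training cannot be predicted.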

Published

2021-03-15

How to Cite

Souza, J. V. A. de, Schneider, E. T. R., Cezar, J. O., Oliveira, L. E. S. e, Gumiel, Y. B., Paraiso, E. C., … Barra, C. M. C. M. (2021). A Multilabel Approach to Portuguese Clinical Named Entity Recognition. Journal of Health Informatics, 12. Retrieved from https://jhi.sbis.org.br/index.php/jhi-sbis/article/view/840
