A Multilabel Approach to Portuguese Clinical Named Entity Recognition

Authors

  • João Vitor Andrioli de Souza SBIS
  • Elisa Terumi Rubel Schneider
  • Josilaine Oliveira Cezar
  • Lucas Emanuel Silva e Oliveira
  • Yohan Bonescki Gumiel
  • Emerson Cabrera Paraiso
  • Douglas Teodoro
  • Claudia Maria Cabral Moro Barra

Keywords:

Clinical Named Entity Recognition, Label Powerset, BERT

Abstract

Objectives: Clinical Named Entity Recognition is a critical Natural Language Processing task because it can support biomedical research and healthcare systems. Most extracted clinical entities are based on single-label concepts, yet entities that belong to more than one semantic category at once are very common in the clinical domain. This work proposes BERT-based models to support multilabel clinical named entity recognition in Portuguese. Methods: We applied the Label Powerset method to the multilabel corpus SemClinBr. Results: Compared with a Conditional Random Fields baseline, a clinical-biomedical BERT model (BioBERTpt) gained +2.1 in precision, +11.2 in recall, and +7.4 in F1. Conclusion: The models scored higher than the baseline on both exact and partial metrics, contributing to the multilabel semantic processing of clinical narratives in Portuguese.
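
As an illustration of the Label Powerset idea named in the Methods, the sketch below turns each distinct combination of labels on a token into a single class, so that a standard single-label classifier can be trained on it. This is a minimal sketch and not the authors' implementation: the function names, the example tokens, and the tags "Disorder" and "Procedure" are hypothetical and are not taken from SemClinBr.

```python
from typing import Dict, List, Set, Tuple


def label_powerset_encode(
    token_label_sets: List[Set[str]],
) -> Tuple[List[int], Dict[Tuple[str, ...], int]]:
    """Map every distinct label combination to one integer class id."""
    combo_to_id: Dict[Tuple[str, ...], int] = {}
    encoded: List[int] = []
    for labels in token_label_sets:
        # Sort so that {"A", "B"} and {"B", "A"} yield the same key.
        combo = tuple(sorted(labels))
        if combo not in combo_to_id:
            combo_to_id[combo] = len(combo_to_id)
        encoded.append(combo_to_id[combo])
    return encoded, combo_to_id


def label_powerset_decode(
    class_ids: List[int],
    combo_to_id: Dict[Tuple[str, ...], int],
) -> List[Set[str]]:
    """Recover the original label sets from the single-class ids."""
    id_to_combo = {i: combo for combo, i in combo_to_id.items()}
    return [set(id_to_combo[i]) for i in class_ids]


# Hypothetical toy input: one label set per token (empty set = no entity).
token_label_sets = [
    {"Disorder"},
    set(),
    {"Disorder", "Procedure"},
    {"Procedure", "Disorder"},
]

encoded, combo_to_id = label_powerset_encode(token_label_sets)
decoded = label_powerset_decode(encoded, combo_to_id)
```

The trade-off of this transformation is that the number of classes grows with the number of distinct label combinations seen in the training data, and combinations that never occur in training cannot be predicted.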

Published

15-03-2021

How to Cite

Souza, J. V. A. de, Schneider, E. T. R., Cezar, J. O., Oliveira, L. E. S. e, Gumiel, Y. B., Paraiso, E. C., … Barra, C. M. C. M. (2021). A Multilabel Approach to Portuguese Clinical Named Entity Recognition. Journal of Health Informatics, 12. Retrieved from https://jhi.sbis.org.br/index.php/jhi-sbis/article/view/840
