A Multilabel Approach to Portuguese Clinical Named Entity Recognition
Keywords:
Clinical Named Entity Recognition, Label Powerset, BERT
Abstract
Objectives: Clinical Named Entity Recognition is a critical Natural Language Processing task, as it can support biomedical research and healthcare systems. While most clinical entity extraction assumes single-label concepts, entities that belong to more than one semantic category simultaneously are very common in the clinical domain. This work proposes BERT-based models to support multilabel clinical named entity recognition in the Portuguese language. Methods: For the experiment, we applied the Label Powerset method to the multilabel corpus SemClinBr. Results: We compared our results with a Conditional Random Fields baseline, reaching +2.1 in precision, +11.2 in recall, and +7.4 in F1 with a clinical-biomedical BERT model (BioBERTpt). Conclusion: We achieved higher results for both exact and partial metrics, contributing to the multilabel semantic processing of clinical narratives in Portuguese.
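The Label Powerset method named in the Methods turns a multilabel problem into a single-label one by treating each distinct combination of labels as its own class. The following is a minimal sketch of that transformation at the token level, not the authors' implementation. The function names, the example tokens, and the tags `Finding` and `Disorder` are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Set


def fit_label_powerset(label_sets: List[Set[str]]) -> Dict[FrozenSet[str], int]:
    """Assign one class id to each distinct label combination, in order of first appearance."""
    combo_to_id: Dict[FrozenSet[str], int] = {}
    for labels in label_sets:
        combo = frozenset(labels)
        if combo not in combo_to_id:
            combo_to_id[combo] = len(combo_to_id)
    return combo_to_id


def encode(label_sets: List[Set[str]], combo_to_id: Dict[FrozenSet[str], int]) -> List[int]:
    """Map each token's label set to its single powerset class id."""
    return [combo_to_id[frozenset(labels)] for labels in label_sets]


def decode(class_ids: List[int], combo_to_id: Dict[FrozenSet[str], int]) -> List[Set[str]]:
    """Map powerset class ids back to the original label sets."""
    id_to_combo = {idx: combo for combo, idx in combo_to_id.items()}
    return [set(id_to_combo[idx]) for idx in class_ids]


# Hypothetical example: tokens and tags are illustrative, not taken from SemClinBr.
tokens = ["paciente", "com", "dor", "torácica"]
token_labels = [set(), set(), {"Finding"}, {"Finding", "Disorder"}]

mapping = fit_label_powerset(token_labels)
class_ids = encode(token_labels, mapping)
restored = decode(class_ids, mapping)
```

After encoding, each token carries exactly one class id, so any single-label sequence tagger (for example, a BERT token classifier) can be trained on `class_ids`, and its predictions can be passed through `decode` to recover the multilabel output. One known trade-off of this design is that label combinations never seen in training cannot be predicted.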
Published
2021-03-15
How to Cite
Souza, J. V. A. de, Schneider, E. T. R., Cezar, J. O., Oliveira, L. E. S. e, Gumiel, Y. B., Paraiso, E. C., … Barra, C. M. C. M. (2021). A Multilabel Approach to Portuguese Clinical Named Entity Recognition. Journal of Health Informatics, 12. Retrieved from https://jhi.sbis.org.br/index.php/jhi-sbis/article/view/840
Section
Original Articles