Similarity-based scoring method for classification of Health Informatics content

Authors

  • Fabio Teixeira, Universidade Federal de São Paulo
  • Alex Jaccoud Falcão, Universidade Federal de São Paulo
  • Fernando Siqueira Sousa, Universidade Federal de São Paulo
  • Anderson Diniz Hummel, Universidade Federal de São Paulo
  • Thiago Martini Costa, Universidade Federal de São Paulo
  • Felipe Mancini, Universidade Federal de São Paulo
  • Luciano Vieira Araujo, Universidade de São Paulo
  • Ivan Torres Pisa, Universidade Federal de São Paulo

Keywords:

Vocabulary, Controlled; Classification; Artificial Intelligence; Data Analysis

Abstract

Objective: Digital repositories in Health Informatics (HI) have grown considerably in both architecture and complexity. Effective information retrieval from them requires different forms of information treatment and representation, such as automatic content classification. This study presents the results of a procedure for automatically classifying scientific articles in HI using a specialized thesaurus. Design: Statistical, vector-based, and artificial intelligence methods were applied to classify HI-related content. Articles extracted from HI and Health journals, together with a specialized HI thesaurus, were used to apply the method and evaluate its results. Measurements: Statistical procedures and measures of accuracy, precision, recall, area under the ROC curve, and the F1 measure (a combination of precision and recall) were used to assess the degree of similarity between the terms of the specialized HI thesaurus and the selected articles. Results: Accuracy was 0.87, the F1 measure was 0.87, and the area under the ROC curve was 0.94. Conclusion: The results were positive, showing that a specialized HI thesaurus, used in conjunction with these methods, allows articles to be classified into the areas of Health Informatics and Health.
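
To make the idea of scoring by similarity to a thesaurus more concrete, here is a minimal Python sketch. It is not the authors' procedure: it assumes a plain bag-of-words cosine similarity, and the thesaurus terms, threshold, and sample articles are hypothetical. It also shows how accuracy, precision, recall, and F1 follow from the resulting predictions (the area under the ROC curve is left out for brevity).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_article(text, thesaurus_terms):
    """Score an article by its similarity to the pooled thesaurus terms."""
    article_vec = Counter(tokenize(text))
    thesaurus_vec = Counter(tokenize(" ".join(thesaurus_terms)))
    return cosine_similarity(article_vec, thesaurus_vec)


def evaluate(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = HI)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical thesaurus terms, threshold, and labeled articles.
    THESAURUS = ["electronic health record", "telemedicine", "decision support"]
    THRESHOLD = 0.2
    articles = [
        ("Telemedicine and electronic health record adoption", 1),
        ("Surgical outcomes after knee replacement", 0),
    ]
    y_true = [label for _, label in articles]
    y_pred = [int(score_article(text, THESAURUS) >= THRESHOLD)
              for text, _ in articles]
    print(evaluate(y_true, y_pred))
```

In this sketch an article is labeled as HI when its score reaches the threshold; in practice the threshold would have to be tuned on labeled data, and varying it is what produces a ROC curve.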

Author Biography

Fabio Teixeira, Universidade Federal de São Paulo

Departamento de Informática em Saúde - Universidade Federal de São Paulo

Published

2011-06-29

How to Cite

Teixeira, F., Falcão, A. J., Sousa, F. S., Hummel, A. D., Costa, T. M., Mancini, F., … Pisa, I. T. (2011). Similarity-based scoring method for classification of Health Informatics content. Journal of Health Informatics, 3(2). Retrieved from https://jhi.sbis.org.br/index.php/jhi-sbis/article/view/137

Section

Original Article
