UPM Institutional Repository

Improving named entity recognition accuracy for gene and protein in biomedical text literature


Citation

Tohidi, Hossein and Ibrahim, Hamidah and Azmi Murad, Masrah Azrifah (2014) Improving named entity recognition accuracy for gene and protein in biomedical text literature. International Journal of Data Mining and Bioinformatics, 10 (3). pp. 239-268. ISSN 1748-5673; ESSN: 1748-5681

Abstract

Recognising biomedical named entities in natural language documents, a task known as biomedical Named Entity Recognition (NER), attracts many researchers because of the complex nature of such texts. This complexity includes character-level, word-level and word-order variations. In this study, we propose an approach for recognising gene and protein names that handles these issues. As in previous related work, our approach assumes that a named entity occurs within a noun group. The strength of the proposed approach lies in a Statistical Character-based Syntax Similarity (SCSS) algorithm, which measures the similarity between the extracted candidates and the well-known biomedical named entities from the GENIA V3.0 corpus. The approach was evaluated, and the results are satisfactory. For recognition of both gene and protein names, we achieved 97.2% precision (P), 95.2% recall (R) and an F-measure of 96.1. For protein name recognition, we obtained 98.1% P, 97.5% R and an F-measure of 97.7.
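The abstract does not say how the F-measure is defined. As a minimal sketch, assuming it is the usual balanced F-measure (the harmonic mean of precision and recall), the reported figures can be checked as follows. The function name `f_measure` and the `beta` parameter are illustrative and do not come from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    Assumption: the paper uses the balanced form (beta = 1); the abstract
    does not state this explicitly. Inputs may be percentages or fractions,
    and the result is on the same scale as the inputs.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing is recognised
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Precision and recall values as reported in the abstract (percentages).
    gene_and_protein = f_measure(97.2, 95.2)
    protein_only = f_measure(98.1, 97.5)
    print(f"gene and protein: {gene_and_protein:.2f}")
    print(f"protein only:     {protein_only:.2f}")
```

Under this assumption, both computed values fall within 0.1 of the F-measures reported in the abstract (96.1 and 97.7).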


Additional Metadata

Item Type: Article
Divisions: Faculty of Computer Science and Information Technology
DOI Number: https://doi.org/10.1504/IJDMB.2014.064523
Publisher: Inderscience Publishers
Keywords: Biomedical; Information extraction; Named entity recognition; Natural language processing; NER
Depositing User: Nabilah Mustapa
Date Deposited: 29 Dec 2015 09:05
Last Modified: 29 Dec 2015 09:05
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1504/IJDMB.2014.064523
URI: http://psasir.upm.edu.my/id/eprint/37986