Recognizing Biomedical Named Entities using Skip-chain Conditional Random Fields

dc.contributor.author: Liu, Jingchen
dc.contributor.author: Huang, Minlie
dc.contributor.author: Zhu, Xiaoyan
dc.date.accessioned: 2012-05-22T20:30:45Z
dc.date.available: 2012-05-22T20:30:45Z
dc.date.copyright: 2010
dc.date.issued: 2010-07
dc.description: Proceedings of the 2010 Workshop on Biomedical Natural Language Processing [en]
dc.description.abstract: Linear-chain Conditional Random Fields (CRFs) have been applied to the Named Entity Recognition (NER) task in many biomedical text mining and information extraction systems. However, a linear-chain CRF cannot capture long-distance dependencies, which are very common in the biomedical literature. In this paper, we propose a novel study of capturing such long-distance dependencies by defining two principles for constructing skip-edges in a skip-chain CRF: linking similar words and linking words that have typed dependencies. The approach is applied to recognize gene/protein mentions in the literature. When tested on the BioCreAtIvE II Gene Mention dataset and the GENIA corpus, the approach yields significant improvements over the linear-chain CRF. We also present an in-depth error analysis of inconsistent labeling and study the influence of skip-edge quality on labeling performance. [en]
dc.format: Text [en]
dc.format.extent: 1 digital file (p. 10-18 : ill.) [en]
dc.identifier.citation: Liu, J., Huang, M., & Zhu, X. (2010). Recognizing Biomedical Named Entities using Skip-chain Conditional Random Fields. Proceedings of the 2010 Workshop on Biomedical Natural Language Processing, Uppsala, SWE. (p. 10-18). Association for Computational Linguistics. [en]
dc.identifier.uri: http://hdl.handle.net/10625/49054
dc.language.iso: en
dc.publisher: Association for Computational Linguistics, Stroudsburg, PA [en]
dc.subject: RANDOM FIELD [en]
dc.subject: INFORMATION RETRIEVAL [en]
dc.subject: BIOMEDICAL DATA [en]
dc.subject: ERROR ANALYSIS [en]
dc.subject: LABELLING [en]
dc.title: Recognizing Biomedical Named Entities using Skip-chain Conditional Random Fields [en]
dc.type: Conference Paper [en]
idrc.copyright.holder: Association for Computational Linguistics
idrc.dspace.access: IDRC Only [en]
idrc.noaccess: Due to copyright restrictions the full text of this research output is not available in the IDRC Digital Library or by request from the IDRC Library. [en]
idrc.project.componentnumber: 104519006
idrc.project.number: 104519
idrc.project.title: International Research Chairs Initiative (IRCI) [en]
idrc.rims.adhocgroup: IDRC SUPPORTED [en]
