FEBS Letters
Volume 582, Issue 8, Pages 1178-1181, 9 April 2008

A text-mining perspective on the requirements for electronically annotated abstracts

Structural Computational Biology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain

Published online 06 March 2008.

Abstract 

We propose that the combination of human expertise and automatic text-mining systems can be used to create a first generation of electronically annotated information (EAI) that can be added to journal abstracts and that is directly related to the information in the corresponding text. The first experiments have concentrated on the annotation of gene/protein names and organism names, as these are the best-resolved problems. A second generation of systems could then attempt to annotate protein interactions and protein/gene functions, a more difficult task for text-mining systems. EAI will permit easier categorization of this information, help in the evaluation of papers for curation in databases, and be invaluable for maintaining the links between the information in databases and the facts described in text. Additionally, it will contribute to efforts to complete database information and to create collections of annotated text that can be used to train new generations of text-mining systems. The recent introduction of the first meta-server for the annotation of biological text, which can collect annotations from available text-mining systems, adds credibility to the technical feasibility of this proposal.

Abbreviations: BCMS, BioCreative MetaServer; EAI, electronically annotated information; IE, information extraction; NER, named entity recognition; NLP, natural language processing; NLU, natural language understanding

Keywords: Information extraction, Article annotation, Text mining, Journal annotation pipeline, Review, Perspectives, Electronically annotated information

PII: S0014-5793(08)00195-6

doi:10.1016/j.febslet.2008.02.072
