Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking.

PLoS One

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, United Kingdom; National Institute of Informatics, Tokyo, Japan.

Published: June 2014

The identification of phenotype descriptions in the scientific literature, case reports and patient records is a rewarding task for bio-medical text mining. Any progress will support knowledge discovery and linkage to other resources. However, because phenotype descriptions vary widely, a number of challenges remain in their identification and semantic normalisation before they can be fully exploited for research purposes. This paper presents novel techniques for identifying potential complex phenotype mentions by exploiting a hybrid model based on machine learning, rules and dictionary matching. A systematic study is made of how to combine sequence labels from these modules, as well as of the merits of various ontological resources. We evaluated our approach on a subset of Medline abstracts related to auto-immune diseases and cited by the Online Mendelian Inheritance in Man database. Using partial matching, the best micro-averaged F-score for phenotypes and five other entity classes was 79.9%. A best performance of 75.3% was achieved for phenotype candidates using all semantic resources. We observed the advantage of using SVM-based learn-to-rank for sequence label combination over maximum entropy and a priority list approach. The results indicate that the identification of simple entity types such as chemicals and genes is robustly supported by single semantic resources, whereas phenotypes require combinations. Altogether we conclude that our approach coped well with the compositional structure of phenotypes in the auto-immune domain.
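
As an illustration of the evaluation measure named in the abstract, the following is a minimal sketch of a micro-averaged F-score under partial matching. The abstract does not define the matching criterion, so the rule used here (a predicted span counts as correct if it shares a label with a gold span and overlaps it by at least one character) is an assumption, and the names `Span`, `overlaps` and `micro_f1_partial` are hypothetical rather than taken from the paper.

```python
from collections import namedtuple

# Hypothetical annotation record: character offsets (end exclusive) plus entity class.
Span = namedtuple("Span", ["start", "end", "label"])


def overlaps(a, b):
    """Assumed partial-match rule: same label and at least one shared character."""
    return a.label == b.label and a.start < b.end and b.start < a.end


def micro_f1_partial(gold, predicted):
    """Micro-averaged F-score: counts are pooled over all entity classes."""
    # Predicted spans that partially match some gold span (drives precision).
    matched_pred = sum(1 for p in predicted if any(overlaps(p, g) for g in gold))
    # Gold spans that are partially matched by some prediction (drives recall).
    matched_gold = sum(1 for g in gold if any(overlaps(g, p) for p in predicted))

    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [
        Span(0, 5, "PHENOTYPE"),
        Span(10, 14, "GENE"),
        Span(20, 30, "PHENOTYPE"),
    ]
    predicted = [
        Span(2, 5, "PHENOTYPE"),    # partial overlap, same label: match
        Span(10, 14, "CHEMICAL"),   # same offsets, wrong label: no match
        Span(22, 28, "PHENOTYPE"),  # nested inside a gold span: match
        Span(40, 45, "GENE"),       # no gold counterpart: no match
    ]
    print(round(micro_f1_partial(gold, predicted), 4))
```

Because the counts are pooled before precision and recall are computed, frequent entity classes weigh more heavily in the score than rare ones, which is what distinguishes a micro-average from a macro-average.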

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3796529
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0072965

Publication Analysis

Top Keywords

phenotype candidates: 8
learning recognize: 4
phenotype: 4
recognize phenotype: 4
candidates auto-immune: 4
auto-immune literature: 4
literature svm: 4
svm re-ranking: 4
re-ranking identification: 4
identification phenotype: 4

Similar Publications

Extracellular vesicles from pancreatic cancer and its tumour microenvironment promote increased Schwann cell migration.

Br J Cancer

January 2025

Department of Visceral, Thoracic and Vascular Surgery, University Hospital and Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany.

Background: Pancreatic ductal adenocarcinoma (PDAC) exhibits a high frequency of neural invasion (NI). Schwann cells (SCs) have been shown to be reprogrammed to facilitate cancer cell migration and invasion into nerves. Since extracellular vesicles (EVs) affect the tumour microenvironment and promote metastasis, the present study analysed the involvement of EVs from pancreatic cancer cells and their microenvironment in altering SC phenotype as part of the early events in the process of NI.

Natural phytochemicals reverting M2 to M1 macrophages: A novel alternative Leishmaniasis therapy.

Microb Pathog

January 2025

Immunology lab, Biotechnology & Bioengineering, Indian Institute of Advanced Research, Gandhinagar, Gujarat, 382426, India.

Introduction: Leishmaniasis is a tropical parasitic disease caused by the protozoan Leishmania which remains a significant global health concern with diverse clinical manifestations. Transmitted through the bite of an infected sandfly, its progression depends on the interplay between the host immune response and the parasite. The disease outcome is linked to macrophage polarisation into M1 and M2 phenotypes.

Faba bean ( L.) is a valuable ingredient in plant-based foods such as meat and dairy analogues. However, its typical taste and aroma are considered off-flavours in these food applications, representing a bottleneck during processing.

is a member of the cruciferous family with rich glucosinolate (GSL) content, particularly glucobrassicin (3-indolylmethyl glucosinolate, I3M), that can be metabolized into indole-3-carbinol (I3C), a compound with promising anticancer properties. To unravel the genetic mechanism influencing I3C content in rapeseed seedlings, a comprehensive study was undertaken with a doubled haploid (DH) population. By quantitative trait loci (QTL) mapping, seven QTL that were located on A01, A07, and C04 were identified, with the most significant contribution to phenotypic variation observed on chromosome A07 (11.

Leaf shape is an important determinant of photosynthesis, yield and quality in plants. In this study, we obtained a curled leaf mutant, , from an ethyl methanesulfonate (EMS)-induced mutagenesis population. It was designated the locus.
