PheneBank: a literature-based database of phenotypes.

Bioinformatics

Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, UK.

Published: January 2022

Motivation: Significant effort has been spent by curators to create coding systems for phenotypes such as the Human Phenotype Ontology, as well as disease-phenotype annotations. We aim to support the discovery of literature-based phenotypes and integrate them into the knowledge discovery process.

Results: PheneBank is a Web-portal for retrieving human phenotype-disease associations that have been text-mined from the whole of Medline. Our approach exploits state-of-the-art machine learning for concept identification by utilizing an expert-annotated rare disease corpus from the PMC Text Mining subset. Entity recognition is evaluated on a gold-standard corpus of rare disease sentences, and associations are evaluated against the Monarch initiative data.

Availability and Implementation: The PheneBank Web-portal is freely available at http://www.phenebank.org. Annotated Medline data are available from Zenodo at DOI: 10.5281/zenodo.1408800. Semantic annotation software is freely available for non-commercial use at GitHub: https://github.com/pilehvar/phenebank.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8796364
DOI: http://dx.doi.org/10.1093/bioinformatics/btab740
