214 results match your criteria: "National Center for Biotechnology Information NCBI[Affiliation]"

DIGNiFI: Discovering causative genes for orphan diseases using protein-protein interaction networks.

BMC Syst Biol

March 2017

National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health, Bethesda, MD 20894, USA.

Background: An orphan disease is any disease that affects a small percentage of the population. Orphan diseases are a great burden to patients and society, and most of them are genetic in origin. Unfortunately, our current understanding of the genes responsible for inherited orphan diseases is still quite limited.

Conserved signatures indicate HIV-1 transmission is under strong selection and thus is not a "stochastic" process.

Retrovirology

February 2017

Statistical Computational Biology Group, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, USA.

Recently, Oberle et al. published a paper in Retrovirology evaluating the question of whether selection plays a role in HIV transmission. The Oberle study found no obvious genotypic or phenotypic differences between donors and recipients of epidemiologically linked pairs from the Swiss cohort.

Diversity and evolution of class 2 CRISPR-Cas systems.

Nat Rev Microbiol

March 2017

National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.

Class 2 CRISPR-Cas systems are characterized by effector modules that consist of a single multidomain protein, such as Cas9 or Cpf1. We designed a computational pipeline for the discovery of novel class 2 variants and used it to identify six new CRISPR-Cas subtypes. The diverse properties of these new systems provide potential for the development of versatile tools for genome editing and regulation.

Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine.

PLoS Comput Biol

November 2016

National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland, United States of America.

The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature.

View Article and Find Full Text PDF

The key question of precision medicine is whether it is possible to find clinically actionable granularity in diagnosing disease and classifying patient risk. The advent of next-generation sequencing and the widespread adoption of electronic health records (EHRs) have provided clinicians and researchers a wealth of data and made possible the precise characterization of individual patient genotypes and phenotypes. Unstructured text, found in biomedical publications and clinical notes, is an important component of genotype and phenotype knowledge.

Overview of the interactive task in BioCreative V.

Database (Oxford)

November 2017

Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, 19711, USA; Department of Computer and Information Sciences, University of Delaware, Newark, DE, 19711, USA.

Fully automated text mining (TM) systems promote efficient literature searching, retrieval, and review but are not sufficient to produce ready-to-consume curated documents. These systems are not meant to replace biocurators, but instead to assist them in one or more literature curation steps. To do so, the user interface is an important aspect that needs to be considered for tool adoption.

View Article and Find Full Text PDF

The significant amount of medicinal chemistry information contained in patents makes them an attractive target for text mining. In this manuscript, we describe systems for named entity recognition (NER) of chemicals and genes/proteins in patents, using the CEMP (for chemicals) and GPRO (for genes/proteins) corpora provided by the CHEMDNER task at BioCreative V. Our chemical NER system is an ensemble of five open systems, including both versions of tmChem, our previous work on chemical NER.

View Article and Find Full Text PDF

In genomics, bioinformatics and other areas of data science, gaps exist between extant public datasets and the open-source software tools built by the community to analyze similar data types. The purpose of biological data science hackathons is to assemble groups of genomics or bioinformatics professionals and software developers to rapidly prototype software to address these gaps. The only two rules for the NCBI-assisted hackathons run so far are that 1) data either must be housed in public data repositories or be deposited to such repositories shortly after the hackathon's conclusion, and 2) all software comprising the final pipeline must be open-source or open-use.

Text mining for precision medicine: automating disease-mutation relationship extraction from biomedical literature.

J Am Med Inform Assoc

July 2016

National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health, Bethesda, MD, USA

Objective: Identifying disease-mutation relationships is a significant challenge in the advancement of precision medicine. The aim of this work is to design a tool that automates the extraction of disease-related mutations from biomedical text to advance database curation for the support of precision medicine.

Materials and Methods: We developed a machine-learning (ML) based method to automatically identify the mutations mentioned in the biomedical literature related to a particular disease.

Point Mutations in Exon 1B of APC Reveal Gastric Adenocarcinoma and Proximal Polyposis of the Stomach as a Familial Adenomatous Polyposis Variant.

Am J Hum Genet

May 2016

Department of Genetics and Computational Biology, QIMR Berghofer, Herston, QLD 4029, Australia.

Gastric adenocarcinoma and proximal polyposis of the stomach (GAPPS) is an autosomal-dominant cancer-predisposition syndrome with a significant risk of gastric, but not colorectal, adenocarcinoma. We mapped the gene to 5q22 and found loss of the wild-type allele on 5q in fundic gland polyps from affected individuals. Whole-exome and -genome sequencing failed to find causal mutations but, through Sanger sequencing, we identified point mutations in APC promoter 1B that co-segregated with disease in all six families.

Discovering biomedical semantic relations in PubMed queries for information retrieval and database curation.

Database (Oxford)

October 2016

National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA

Identifying relevant papers from the literature is a common task in biocuration. Most current biomedical literature search systems primarily rely on matching user keywords. Semantic search, on the other hand, seeks to improve search accuracy by understanding the entities and contextual relations in user keywords.

View Article and Find Full Text PDF

The biomedical literature is a knowledge-rich resource and an important foundation for future research. With over 24 million articles in PubMed and an increasing growth rate, research in automated text processing is becoming increasingly important. We report here our recently developed web-based text mining services for biomedical concept recognition and normalization.

Reorganization and expansion of the nidoviral family Arteriviridae.

Arch Virol

March 2016

Department of Pathobiological Sciences, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, WI, 53706, USA.

The family Arteriviridae presently includes a single genus Arterivirus. This genus includes four species as the taxonomic homes for equine arteritis virus (EAV), lactate dehydrogenase-elevating virus (LDV), porcine respiratory and reproductive syndrome virus (PRRSV), and simian hemorrhagic fever virus (SHFV), respectively. A revision of this classification is urgently needed to accommodate the recent description of eleven highly divergent simian arteriviruses in diverse African nonhuman primates, one novel arterivirus in an African forest giant pouched rat, and a novel arterivirus in common brushtails in New Zealand.

View Article and Find Full Text PDF

The automatic recognition of gene names and their associated database identifiers from biomedical text has been widely studied in recent years, as these tasks play an important role in many downstream text-mining applications. Despite significant previous research, only a small number of tools are publicly available and these tools are typically restricted to detecting only mention level gene names or only document level gene identifiers. In this work, we report GNormPlus: an end-to-end and open source system that handles both gene mention and identifier detection.

Secondary structural entropy in RNA switch (Riboswitch) identification.

BMC Bioinformatics

April 2015

Institute of Bioinformatics, University of Georgia, Davison Life Sciences Bldg, Room B118B, 120 Green St, Athens, 30602, USA.

Background: RNA regulatory elements play a significant role in gene regulation. Riboswitches, a widespread group of regulatory RNAs, are vital components of many bacterial genomes. These regulatory elements generally function by forming a ligand-induced alternative fold that controls access to ribosome binding sites or other regulatory sites in RNA.

The CHEMDNER corpus of chemicals and drugs and its annotation principles.

J Cheminform

March 2015

Structural Computational Biology Group, Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre, Madrid, Spain.

The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents.
