Publications by James S Song | LitMetric

Publications by authors named "James S Song"

The conserved domain database in 2023.

Jiyao Wang, Farideh Chitsaz, Myra K Derbyshire, Noreen R Gonzales, Marc Gwadz, James S Song

Nucleic Acids Res

January 2023

NLM's conserved domain database (CDD) is a collection of protein domain and protein family models constructed as multiple sequence alignments. Its main purpose is to provide annotation for protein and translated nucleotide sequences with the location of domain footprints and associated functional sites, and to define protein domain architecture as a basis for assigning gene product names and putative/predicted function. CDD has been available publicly for over 20 years and has grown substantially during that time.

RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation.

Wenjun Li, Kathleen R O'Neill, Daniel H Haft, Michael DiCuccio, Vyacheslav Chetvernin, James S Song

Nucleic Acids Res

January 2021

The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) contains nearly 200 000 bacterial and archaeal genomes and 150 million proteins with up-to-date annotation. Changes in the Prokaryotic Genome Annotation Pipeline (PGAP) since 2018 have resulted in a substantial reduction in spurious annotation. The hierarchical collection of protein family models (PFMs) used by PGAP as evidence for structural and functional annotation was expanded to over 35 000 protein profile hidden Markov models (HMMs), 12 300 BlastRules and 36 000 curated CDD architectures.

CDD/SPARCLE: the conserved domain database in 2020.

Shennan Lu, Jiyao Wang, Farideh Chitsaz, Myra K Derbyshire, Renata C Geer, James S Song

Nucleic Acids Res

January 2020

As NLM's Conserved Domain Database (CDD) enters its 20th year of operations as a publicly available resource, CDD curation staff continues to develop hierarchical classifications of widely distributed protein domain families, and to record conserved sites associated with molecular function, so that they can be mapped onto user queries in support of hypothesis-driven biomolecular research. CDD offers both an archive of pre-computed domain annotations as well as live search services for both single protein or nucleotide queries and larger sets of protein query sequences. CDD staff has continued to characterize protein families via conserved domain architectures and has built up a significant corpus of curated domain architectures in support of naming bacterial proteins in RefSeq.

RefSeq: an update on prokaryotic genome annotation and curation.

Daniel H Haft, Michael DiCuccio, Azat Badretdin, Vyacheslav Brover, Vyacheslav Chetvernin, James S Song

Nucleic Acids Res

January 2018

The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) provides annotation for over 95 000 prokaryotic genomes that meet standards for sequence quality, completeness, and freedom from contamination. Genomes are annotated by a single Prokaryotic Genome Annotation Pipeline (PGAP) to provide users with a resource that is as consistent and accurate as possible. Notable recent changes include the development of a hierarchical evidence scheme, a new focus on curating annotation evidence sources, the addition and curation of protein profile hidden Markov models (HMMs), release of an updated pipeline (PGAP-4), and comprehensive re-annotation of RefSeq prokaryotic genomes.

CDD/SPARCLE: functional classification of proteins via subfamily domain architectures.

Aron Marchler-Bauer, Yu Bo, Lianyi Han, Jane He, Christopher J Lanczycki, James S Song

Nucleic Acids Res

January 2017

NCBI's Conserved Domain Database (CDD) aims at annotating biomolecular sequences with the location of evolutionarily conserved protein domain footprints, and functional sites inferred from such footprints. An archive of pre-computed domain annotation is maintained for proteins tracked by NCBI's Entrez database, and live search services are offered as well. CDD curation staff supplements a comprehensive collection of protein domain and protein family models, which have been imported from external providers, with representations of selected domain families that are curated in-house and organized into hierarchical classifications of functionally distinct families and sub-families.

CDD: NCBI's conserved domain database.

Aron Marchler-Bauer, Myra K Derbyshire, Noreen R Gonzales, Shennan Lu, Farideh Chitsaz, James S Song

Nucleic Acids Res

January 2015

NCBI's CDD, the Conserved Domain Database, enters its 15th year as a public resource for the annotation of proteins with the location of conserved domain footprints. Going forward, we strive to improve the coverage and consistency of domain annotation provided by CDD. We maintain a live search system as well as an archive of pre-computed domain annotation for sequences tracked in NCBI's Entrez protein database, which can be retrieved for single sequences or in bulk.

CDD: conserved domains and protein three-dimensional structure.

Aron Marchler-Bauer, Chanjuan Zheng, Farideh Chitsaz, Myra K Derbyshire, Lewis Y Geer, James S Song

Nucleic Acids Res

January 2013

CDD, the Conserved Domain Database, is part of NCBI's Entrez query and retrieval system and is also accessible via http://www.ncbi.nlm.

CDD: a Conserved Domain Database for the functional annotation of proteins.

Aron Marchler-Bauer, Shennan Lu, John B Anderson, Farideh Chitsaz, Myra K Derbyshire, James S Song

Nucleic Acids Res

January 2011

NCBI's Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent.

CDD: specific functional annotation with the Conserved Domain Database.

Aron Marchler-Bauer, John B Anderson, Farideh Chitsaz, Myra K Derbyshire, Carol DeWeese-Scott, James S Song

Nucleic Acids Res

January 2009

NCBI's Conserved Domain Database (CDD) is a collection of multiple sequence alignments and derived database search models, which represent protein domains conserved in molecular evolution. The collection can be accessed at http://www.ncbi.

CDD: a conserved domain database for interactive domain family analysis.

Aron Marchler-Bauer, John B Anderson, Myra K Derbyshire, Carol DeWeese-Scott, Noreen R Gonzales, James S Song

Nucleic Acids Res

January 2007

The conserved domain database (CDD) is part of NCBI's Entrez database system and serves as a primary resource for the annotation of conserved domain footprints on protein sequences in Entrez. Entrez's global query interface can be accessed at http://www.ncbi.

CDD: a Conserved Domain Database for protein classification.

Aron Marchler-Bauer, John B Anderson, Praveen F Cherukuri, Carol DeWeese-Scott, Lewis Y Geer, James S Song

Nucleic Acids Res

January 2005

The Conserved Domain Database (CDD) is the protein classification component of NCBI's Entrez query and retrieval system. CDD is linked to other Entrez databases such as Proteins, Taxonomy and PubMed, and can be accessed at http://www.ncbi.

Inhibition of tumor angiogenesis in vivo by a monoclonal antibody targeted to domain 5 of high molecular weight kininogen.

James S Song, Irma M Sainz, Stephen C Cosenza, Irma Isordia-Salas, Abdel Bior

Blood

October 2004

We have shown that human high molecular weight kininogen is proangiogenic due to release of bradykinin. We have now determined the ability of a murine monoclonal antibody to the light chain of high molecular weight kininogen, C11C1, to inhibit tumor growth compared with isotype-matched murine IgG. Monoclonal antibody C11C1 efficiently blocks binding of high molecular weight kininogen to endothelial cells in a concentration-dependent manner.

MMDB: Entrez's 3D-structure database.

Jie Chen, John B Anderson, Carol DeWeese-Scott, Natalie D Fedorova, Lewis Y Geer, James S Song

Nucleic Acids Res

January 2003

Three-dimensional structures are now known within most protein families and it is likely, when searching a sequence database, that one will identify a homolog of known structure. The goal of Entrez's 3D-structure database is to make structure information and the functional annotation it can provide easily accessible to molecular biologists. To this end, Entrez's search engine provides several powerful features: (i) links between databases, for example between a protein's sequence and structure; (ii) pre-computed sequence and structure neighbors; and (iii) structure and sequence/structure alignment visualization.

CDD: a curated Entrez database of conserved domain alignments.

Aron Marchler-Bauer, John B Anderson, Carol DeWeese-Scott, Natalie D Fedorova, Lewis Y Geer, James S Song

Nucleic Acids Res

January 2003

The Conserved Domain Database (CDD) is now indexed as a separate database within the Entrez system and linked to other Entrez databases such as MEDLINE®. This allows users to search for domain types by name, for example, or to view the domain architecture of any protein in Entrez's sequence database. CDD can be accessed on the World Wide Web at http://www.

MMDB: Entrez's 3D-structure database.

Yanli Wang, John B Anderson, Jie Chen, Lewis Y Geer, Siqian He, James S Song

Nucleic Acids Res

January 2002

Three-dimensional structures are now known within many protein families and it is quite likely, in searching a sequence database, that one will encounter a homolog with known structure. The goal of Entrez's 3D-structure database is to make this information, and the functional annotation it can provide, easily accessible to molecular biologists. To this end Entrez's search engine provides three powerful features.
