Many datasets are being produced by consortia that seek to characterize healthy and disease tissues at single-cell resolution. While biospecimen and experimental information is often captured, detailed metadata standards related to data matrices and analysis workflows are currently lacking. To address this, we develop the matrix and analysis metadata standards (MAMS) to serve as a resource for data centers, repositories, and tool developers.
A large number of genomic and imaging datasets are being produced by consortia that seek to characterize healthy and disease tissues at single-cell resolution. While much effort has been devoted to capturing biospecimen information and experimental procedures, the metadata standards that describe data matrices and the analysis workflows that produced them are relatively lacking. Detailed metadata schemas related to data analysis are needed to facilitate sharing and interoperability across groups and to promote data provenance for reproducibility.
The Common Fund Data Ecosystem (CFDE) has created a flexible system of data federation that enables researchers to discover datasets from across the US National Institutes of Health Common Fund without requiring that data owners move, reformat, or rehost those data. This system is centered on a catalog that integrates detailed descriptions of biomedical datasets from individual Common Fund Programs' Data Coordination Centers (DCCs) into a uniform metadata model that can then be indexed and searched from a centralized portal. This Crosscut Metadata Model (C2M2) supports the wide variety of data types and metadata terms used by individual DCCs and can readily describe nearly all forms of biomedical research data.
Scalable technologies to sequence the transcriptomes and epigenomes of single cells are transforming our understanding of cell types and cell states. The Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative Cell Census Network (BICCN) is applying these technologies at unprecedented scale to map the cell types in the mammalian brain. In an effort to increase data FAIRness (Findable, Accessible, Interoperable, Reusable), the NIH has established repositories to make data generated by the BICCN and related BRAIN Initiative projects accessible to the broader research community.
The Human Microbiome Project (HMP) explored microbial communities of the human body in both healthy and disease states. Two phases of the HMP (HMP and iHMP) together generated >48TB of data (public and controlled access) from multiple, varied omics studies of both the microbiome and associated hosts. The Human Microbiome Project Data Coordination Center (HMPDACC) was established to provide a portal to access data and resources produced by the HMP.
Inactivation of the ataxia-telangiectasia mutated (ATM) gene results in an increased risk of developing cancer. We show that ATM deficiency in diffuse large B-cell lymphoma (DLBCL) significantly induces mitochondrial deacetylase sirtuin-3 (SIRT3) activity, disrupts mitochondrial structure, decreases mitochondrial respiration, and compromises TCA flux compared with DLBCL cells expressing wild-type (WT) ATM. This corresponded to enrichment of glutamate receptor and glutamine pathways in the ATM-deficient background compared with WT-ATM DLBCL cells.
The characterization of baseline microbial and functional diversity in the human microbiome has enabled studies of microbiome-related disease, diversity, biogeography, and molecular function. The National Institutes of Health Human Microbiome Project has provided one of the broadest such characterizations so far. Here we introduce a second wave of data from the study, comprising 1,631 new metagenomes (2,355 total) targeting diverse body sites with multiple time points in 265 individuals.
Variation is a central trait of the polymorphic membrane protein (Pmp) family. The number of pmp coding sequences differs between Chlamydia species, but it is unknown whether the number of pmp coding sequences is constant within a Chlamydia species. The level of conservation of the Pmp proteins has previously only been determined for Chlamydia trachomatis.
The recently introduced bacterial species Chlamydia gallinacea is known to occur in domestic poultry and other birds. Its potential as an avian pathogen and zoonotic agent is under investigation. The whole-genome sequence of its type strain, 08-1274/3, consists of a 1,059,583-bp chromosome with 914 protein-coding sequences (CDSs) and a plasmid (p1274) comprising 7,619 bp with 9 CDSs.
It is not currently possible to predict whether a woman with a chlamydial genital infection will develop pelvic inflammatory disease (PID). To determine if specific biomarkers may be associated with distinct chlamydial pathotypes, we utilized two Chlamydia muridarum variants (C. muridarum Var001 [CmVar001] and CmVar004) that differ in their abilities to elicit upper genital tract pathology in a mouse model.
The family Chlamydiaceae with the recombined single genus Chlamydia currently comprises nine species, all of which are obligate intracellular organisms distinguished by a unique biphasic developmental cycle. Anecdotal evidence from epidemiological surveys in flocks of poultry, pigeons and psittacine birds has indicated the presence of non-classified chlamydial strains, some of which may act as pathogens. In the present study, phylogenetic analyses of ribosomal RNA and ompA genes, as well as multi-locus sequence analysis of 11 field isolates, were conducted.
Bacillus megaterium is deep-rooted in the Bacillus phylogeny, making it an evolutionarily key species and of particular importance in understanding genome evolution, dynamics, and plasticity in the bacilli. B. megaterium is a commercially available, nonpathogenic host for the biotechnological production of several substances, including vitamin B12, penicillin acylase, and amylases.
The Institute for Genome Sciences (IGS) has developed a prokaryotic annotation pipeline that is used for coding gene/RNA prediction and functional annotation of Bacteria and Archaea. The fully automated pipeline accepts one or many genomic sequences as input and produces output in a variety of standard formats. Functional annotation is primarily based on similarity searches and motif finding combined with a hierarchical rule-based annotation system.
Chlamydia psittaci is a highly prevalent avian pathogen and the cause of a potentially lethal zoonosis, causing life-threatening pneumonia in humans. We report the genome sequences of C. psittaci 6BC, the prototype strain of the species, and C.
The human microbiome refers to the community of microorganisms, including prokaryotes, viruses, and microbial eukaryotes, that populate the human body. The National Institutes of Health launched an initiative that focuses on describing the diversity of microbial species that are associated with health and disease. The first phase of this initiative includes the sequencing of hundreds of microbial reference genomes, coupled to metagenomic sequencing from multiple body sites.