Reference sequences and annotations serve as the foundation for many lines of research today, from organism and sequence identification to providing a core description of the genes, transcripts and proteins found in an organism's genome. Interpretation of data including transcriptomics, proteomics, sequence variation and comparative analyses based on reference gene annotations informs our understanding of gene function and possible disease mechanisms, leading to new biomedical discoveries. The Reference Sequence (RefSeq) resource created at the National Center for Biotechnology Information (NCBI) leverages both automatic processes and expert curation to create a robust set of reference sequences of genomic, transcript and protein data spanning the tree of life.
Comprehensive genome annotation is essential to understand the impact of clinically relevant variants. However, the absence of a standard for clinical reporting and browser display complicates the process of consistent interpretation and reporting. To address these challenges, Ensembl/GENCODE and RefSeq launched a joint initiative, the Matched Annotation from NCBI and EMBL-EBI (MANE) collaboration, to converge on human gene and transcript annotation and to jointly define a high-value set of transcripts and corresponding proteins.
Female Aedes aegypti mosquitoes infect more than 400 million people each year with dangerous viral pathogens including dengue, yellow fever, Zika and chikungunya. Progress in understanding the biology of mosquitoes and developing the tools to fight them has been slowed by the lack of a high-quality genome assembly. Here we combine diverse technologies to produce the markedly improved, fully re-annotated AaegL5 genome assembly, and demonstrate how it accelerates mosquito science.
Aspergillus flavus is a saprophytic fungus that infects corn, peanuts, tree nuts and other agriculturally important crops. Once the crop is infected the fungus has the potential to secrete one or more mycotoxins, the most carcinogenic of which is aflatoxin. Aflatoxin contaminated crops are deemed unfit for human or animal consumption, which results in both food and economic losses.
The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID).
Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases.
The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.
Complete and accurate annotation of the mouse genome is critical to the advancement of research conducted on this important model organism. The National Center for Biotechnology Information (NCBI) develops and maintains many useful resources to assist the mouse research community. In particular, the reference sequence (RefSeq) database provides high-quality annotation of multiple mouse genome assemblies using a combinatorial approach that leverages computation, manual curation, and collaboration.
Invasive aspergillosis (IA) due to Aspergillus fumigatus is a major cause of mortality in immunocompromised patients. The discovery of highly fertile strains of A. fumigatus opened the possibility to merge classical and contemporary genetics to address key questions about this pathogen.
The soil fungus Rhizoctonia solani is a pathogen of agricultural crops. Here, we report on the 51,705,945 bp draft consensus genome sequence of R. solani strain Rhs1AP.
We utilized RNAseq analysis of the Aspergillus fumigatus response to early hypoxic condition exposure. The results show that more than 89% of the A. fumigatus genome is expressed under normoxic and hypoxic conditions.
The soil fungus Rhizoctonia solani is an economically important pathogen of agricultural and forestry crops. Here, we present the complete sequence and analysis of the mitochondrial genome of R. solani, field isolate Rhs1AP.
Background: The genera Aspergillus and Penicillium include some of the most beneficial as well as the most harmful fungal species such as the penicillin-producer Penicillium chrysogenum and the human pathogen Aspergillus fumigatus, respectively. Their mitochondrial genomic sequences may hold vital clues into the mechanisms of their evolution, population genetics, and biology, yet only a handful of these genomes have been fully sequenced and annotated.
Results: Here we report the complete sequence and annotation of the mitochondrial genomes of six Aspergillus and three Penicillium species: A.
We present the draft genome for the Rickettsia endosymbiont of Ixodes scapularis (REIS), a symbiont of the deer tick vector of Lyme disease in North America. Among Rickettsia species (Alphaproteobacteria: Rickettsiales), REIS has the largest genome sequenced to date (>2 Mb) and contains 2,309 genes across the chromosome and four plasmids (pREIS1 to pREIS4). The most remarkable finding within the REIS genome is the extraordinary proliferation of mobile genetic elements (MGEs), which contributes to a limited synteny with other Rickettsia genomes.
Background: Ichthyophthirius multifiliis, commonly known as Ich, is a highly pathogenic ciliate responsible for 'white spot', a disease causing significant economic losses to the global aquaculture industry. Options for disease control are extremely limited, and Ich's obligate parasitic lifestyle makes experimental studies challenging. Unlike most well-studied protozoan parasites, Ich belongs to a phylum composed primarily of free-living members.
As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb.
The identification and annotation of protein-coding genes is one of the primary goals of whole-genome sequencing projects, and the accuracy of predicting the primary protein products of gene expression is vital to the interpretation of the available data and the design of downstream functional applications. Nevertheless, the comprehensive annotation of eukaryotic genomes remains a considerable challenge. Many genomes submitted to public databases, including those of major model organisms, contain significant numbers of wrong and incomplete gene predictions.
Industrial penicillin production with the filamentous fungus Penicillium chrysogenum is based on an unprecedented effort in microbial strain improvement. To gain more insight into penicillin synthesis, we sequenced the 32.19 Mb genome of P.
Understanding the nature of species' boundaries is a fundamental question in evolutionary biology. The availability of genomes from several species of the genus Aspergillus allows us for the first time to examine the demarcation of fungal species at the whole-genome level. Here, we examine four case studies, two of which involve intraspecific comparisons, whereas the other two deal with interspecific genomic comparisons between closely related species.
We present the genome sequences of a new clinical isolate of the important human pathogen, Aspergillus fumigatus, A1163, and two closely related but rarely pathogenic species, Neosartorya fischeri NRRL181 and Aspergillus clavatus NRRL1. Comparative genomic analysis of A1163 with the recently sequenced A. fumigatus isolate Af293 has identified core, variable and up to 2% unique genes in each genome.
The ability of Pseudomonas syringae pv. phaseolicola to cause halo blight of bean is dependent on its ability to translocate effector proteins into host cells via the hypersensitive response and pathogenicity (Hrp) type III secretion system (T3SS). To identify genes encoding type III effectors and other potential virulence factors that are regulated by the HrpL alternative sigma factor, we used a hidden Markov model, weight matrix model, and type III targeting-associated patterns to search the genome of P.
The availability of the genome sequences of multiple Aspergillus spp. presents the research community with an unprecedented opportunity for discovery. The genomes of Neosartorya fischeri and Aspergillus clavatus have been sequenced in order to extend our knowledge of Aspergillus fumigatus, the primary cause of invasive aspergillosis.
Many plant pathogens suppress antimicrobial defenses using virulence factors that modulate endogenous host defenses. The Pseudomonas syringae phytotoxin coronatine (COR) is believed to promote virulence by acting as a jasmonate analog, because COR-insensitive 1 (coi1) Arabidopsis thaliana and tomato mutants are impaired in jasmonate signaling and exhibit reduced susceptibility to P. syringae.
Cyanobacterium Nostoc commune can tolerate the simultaneous stresses of desiccation, UV irradiation, and oxidation. Acidic WspA, of approximately 33.6 kDa, is secreted to the three-dimensional extracellular matrix and accounts for greater than 70% of the total soluble protein.
Pseudomonas syringae pv. phaseolicola, a gram-negative bacterial plant pathogen, is the causal agent of halo blight of bean. In this study, we report on the genome sequence of P.