Publications by Wratko Hlavina

NCBI RefSeq: reference sequence standards through 25 years of curation and annotation.

Tamara Goldfarb, Vamsi K Kodali, Shashikant Pujar, Vyacheslav Brover, Barbara Robbertse, Wratko Hlavina

Nucleic Acids Res

January 2025

Reference sequences and annotations serve as the foundation for many lines of research today, from organism and sequence identification to providing a core description of the genes, transcripts and proteins found in an organism's genome. Interpretation of data including transcriptomics, proteomics, sequence variation and comparative analyses based on reference gene annotations informs our understanding of gene function and possible disease mechanisms, leading to new biomedical discoveries. The Reference Sequence (RefSeq) resource created at the National Center for Biotechnology Information (NCBI) leverages both automatic processes and expert curation to create a robust set of reference sequences of genomic, transcript and protein data spanning the tree of life.

Exploring and retrieving sequence and metadata for species across the tree of life with NCBI Datasets.

Nuala A O'Leary, Eric Cox, J Bradley Holmes, W Ray Anderson, Robert Falk, Wratko Hlavina

Sci Data

July 2024

To explore complex biological questions, it is often necessary to access various data types from public data repositories. As the volume and complexity of biological sequence data grow, public repositories face significant challenges in ensuring that the data is easily discoverable and usable by the biological research community. To address these challenges, the National Center for Biotechnology Information (NCBI) has created NCBI Datasets.

Rapid and sensitive detection of genome contamination at scale with FCS-GX.

Alexander Astashyn, Eric S Tvedte, Deacon Sweeney, Victor Sapojnikov, Nathan Bouk, Wratko Hlavina

Genome Biol

February 2024

Assembled genome sequences are being generated at an exponential rate. Here we present FCS-GX, part of NCBI's Foreign Contamination Screen (FCS) tool suite, optimized to identify and remove contaminant sequences in new genomes. FCS-GX screens most genomes in 0.

Rapid and sensitive detection of genome contamination at scale with FCS-GX.

Alexander Astashyn, Eric S Tvedte, Deacon Sweeney, Victor Sapojnikov, Nathan Bouk, Wratko Hlavina

bioRxiv

June 2023

Article Synopsis

- FCS-GX is a tool developed by NCBI to quickly identify and remove contamination from genomic sequences.
- It screens genomes in 0.1-10 minutes and detects various contaminant species with high sensitivity (>95%) and specificity (>99.93%).
- The tool was used to analyze 1.6 million GenBank assemblies, uncovering 36.8 Gbp of contamination, which led to improved genome accuracy in NCBI's databases.

Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation.

Nuala A O'Leary, Mathew W Wright, J Rodney Brister, Stacy Ciufo, Diana Haddad, Wratko Hlavina

Nucleic Acids Res

January 2016

The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.

Lineage-specific biology revealed by a finished genome assembly of the mouse.

Deanna M Church, Leo Goodstadt, Ladeana W Hillier, Michael C Zody, Steve Goldstein, Wratko Hlavina

PLoS Biol

May 2009

The mouse (Mus musculus) is the premier animal model for understanding human disease and development. Here we show that a comprehensive understanding of mouse biology is only possible with the availability of a finished, high-quality genome assembly. The finished clone-based assembly of the mouse strain C57BL/6J reported here has over 175,000 fewer gaps and over 139 Mb more of novel sequence, compared with the earlier MGSCv3 draft genome assembly.

The genome sequence of taurine cattle: a window to ruminant biology and evolution.

Science

April 2009

Article Synopsis

- The cattle genome was sequenced to enhance the understanding of ruminant biology and evolution; it contains at least 22,000 genes, with 14,345 orthologs shared across seven mammal species.
- Certain regions in the cattle genome have a higher density of segmental duplications, indicating unique evolutionary changes, particularly in genes linked to lactation and immune responses.
- This genome sequence serves as a valuable resource for studying mammalian evolution and improving livestock genetics, which can lead to better milk and meat production.

The genome of the model beetle and pest Tribolium castaneum.

Nature

April 2008

Tribolium castaneum is a member of the most species-rich eukaryotic order, a powerful model organism for the study of generalized insect development, and an important pest of stored agricultural products. We describe its genome sequence here. This omnivorous beetle has evolved the ability to interact with a diverse chemical environment, as shown by large expansions in odorant and gustatory receptors, as well as P450 and other detoxification enzymes.

The genome of the sea urchin Strongylocentrotus purpuratus.

Science

November 2006

We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome.

Initial sequencing and comparative analysis of the mouse genome.

Nature

December 2002

The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences.
