Assembling short reads from large plant genomes remains a challenge in computational biology despite major advances in next-generation sequencing. Several draft assemblies of sequenced plant genomes have been reported recently. The reported draft assemblies differ in genome completeness and contain large numbers of repeats, gaps, and segmental duplications. Draft assemblies that are missing portions of the genome are shorter than the original reference genome. They also have low map accuracy, which hampers downstream functional annotation and the prediction of gene components that crop researchers need. Genome coverage, i.e., the number of sequenced raw reads mapped onto a given location of the genome, is an important indicator of completeness and assembly quality in draft assemblies. The present work aimed to improve the coverage of two reported draft genomes (GCA_000340665.1 and GCA_000230855.2) of pigeonpea, a legume widely cultivated in India. The two recently sequenced assemblies, A1 and A2, covered 72% and 75% of the estimated genome, respectively. We used an assembly reconciliation approach to compare and merge the draft assemblies, filling gaps with an algorithm that sorts the mate-pair libraries by size, to generate a high-quality, near-complete assembly with enhanced contiguity. The majority of gaps within scaffolds were filled with right-sized mate-pair reads. The improved assembly contained fewer gaps than the reported draft assemblies, raising genome coverage to 82.4%. Map accuracy of the improved assembly was evaluated using various quality metrics and by checking for the presence of specific trait-related functional genes. The paired-end and mate-pair local libraries helped reduce gaps, repeats, and other sequence errors, yielding longer scaffolds than in the two draft assemblies.
We report putative host resistance genes against wilt disease, predicted on the basis of their performance and evaluated under both wet-laboratory and field phenotypic conditions.
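The quality indicators mentioned in the abstract (gap counts, scaffold contiguity, and the share of the estimated genome that an assembly covers) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy scaffolds, and the estimated genome size are hypothetical, a gap is assumed to be a run of `N` bases, and the percentage is assumed to be resolved (non-`N`) bases divided by the estimated genome size.

```python
import re
from typing import Dict, List

# Assumption: a gap is modelled as an uninterrupted run of N bases.
GAP_RUN = re.compile(r"[Nn]+")


def count_gaps(scaffold: str) -> int:
    """Return the number of N-runs (gaps) in one scaffold."""
    return len(GAP_RUN.findall(scaffold))


def n50(lengths: List[int]) -> int:
    """Return the largest length L such that scaffolds of length >= L
    together hold at least half of all assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def assembly_summary(scaffolds: List[str], estimated_genome_size: int) -> Dict[str, float]:
    """Summarise gap count, contiguity, and share of the estimated genome."""
    lengths = [len(s) for s in scaffolds]
    # Resolved bases exclude the N placeholders inside gaps.
    resolved = sum(
        len(s) - sum(len(g) for g in GAP_RUN.findall(s)) for s in scaffolds
    )
    return {
        "scaffolds": len(scaffolds),
        "gaps": sum(count_gaps(s) for s in scaffolds),
        "n50": n50(lengths),
        "percent_of_estimate": 100.0 * resolved / estimated_genome_size,
    }


# Hypothetical toy data, far smaller than any real plant genome.
toy_scaffolds = ["ACGTNNNNACGT", "ACGTAC", "ACNNGT"]
summary = assembly_summary(toy_scaffolds, estimated_genome_size=24)
print(summary)
```

Running the same summary on two draft assemblies and on a merged assembly would give a like-for-like comparison of gap counts and contiguity, under the assumptions stated above.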
Full-text link | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7770131 | PMC |
http://dx.doi.org/10.3389/fgene.2020.607432 | DOI |
Microbiol Resour Announc
December 2024
Entheome Foundation, Oakland, California, USA.
is a genus of mushroom-forming fungi with ecological, ethnomycological, and clinical importance due to psilocybin production by most species. We present five genomes that enable deeper discovery and analysis of the psilocybin gene cluster and increase taxonomic resolution within : , nom. prov.
Microbiol Resour Announc
December 2024
Wal-Yan Respiratory Research Centre, Telethon Kids Institute, Perth, Western Australia, Australia.
We present lung virome data recovered through shotgun metagenomics in bronchoalveolar lavage fluid from an infant with cystic fibrosis, who tested positive for infection. Using a bioinformatic pipeline for virus characterization in shotgun metagenomic data, we identified five viral contigs representing Pseudomonas phages classified as Caudoviricetes.
Microbiol Resour Announc
December 2024
NGS Core Facility, Kyungpook National University, Daegu, South Korea.
The draft genome sequence of CHEE98, isolated from cheese, is presented here. The whole-genome assembly is 2.7 Mbp in size, with a GC content of 45.
Sci Data
December 2024
Key Laboratory of Wetland Ecology and Environment & Heilongjiang Xingkai Lake Wetland Ecosystem National Observation and Research Station, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China.
Given the aggressive nature and robust survival capabilities of the alligator gar (Atractosteus spatula), it could cause significant disruption to an ecosystem if it became established there as an invasive species. Because the continuity and completeness of the existing draft genome were not optimal, this study produced an updated, high-quality, chromosome-level genome of the alligator gar, assembled using Oxford Nanopore Technology and chromatin interaction mapping (Hi-C) sequencing techniques. In summary, the alligator gar genome in this study was 1.
Diabetologia
December 2024
Key Laboratory of Human Functional Genomics of Jiangsu Province, Nanjing Medical University, Nanjing, Jiangsu, China.
Aims/hypothesis: The key pancreatic beta cell transcription factor v-maf musculoaponeurotic fibrosarcoma oncogene homologue A (MafA) is critical for the maintenance of mature beta cell function and phenotype. The expression levels and/or activities of MafA are reduced when beta cells are chronically exposed to diabetogenic stress, such as hyperglycaemia (i.e.