Whole-genome sequencing of non-model organisms is now widely accessible and has allowed a range of questions in molecular ecology to be investigated with greater power. However, some genomic regions of high biological interest remain problematic for assembly and data handling. Three such regions are the major histocompatibility complex (MHC), sex-determining regions (SDRs) and the plant self-incompatibility locus (S-locus). Using these as examples, we illustrate the challenges of both assembling and resequencing these highly polymorphic regions, and show how bioinformatic and technological developments are enabling new approaches to their study. Mapping short-read sequences against multiple alternative references improves genotyping comprehensiveness at the S-locus, thereby contributing to more accurate assessments of allelic frequencies. Long-read sequencing, which produces reads of several tens to hundreds of kilobase pairs in length, facilitates the assembly of such regions because single reads can span the multiple duplicated gene copies of the MHC region and sequence through repetitive stretches and translocations in SDRs and S-locus haplotypes. These advances are adding value to short-read genome resequencing approaches by allowing, for example, more accurate haplotype phasing across longer regions. Finally, we assess further technical improvements, such as nanopore adaptive sequencing and bioinformatic tools using pangenomes, which have the potential to expand our knowledge of a number of genomic regions that remain challenging to study with classical resequencing approaches.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9290700
DOI: http://dx.doi.org/10.1111/mec.16020

Similar Publications

Aim: Romania is currently facing a prolonged measles outbreak. The aim of the study was to analyse the circulating human measles virus (HMV) strains by combining whole genome sequencing (WGS) with phylogenetic analysis, with a focus on the haemagglutinin gene.

Methods: We conducted an observational study in the first five months of 2024, in which 168 patients diagnosed with measles were randomly included.

Genomic imprinting is an epigenetic process that results in parent-of-origin effects on mammalian development and growth. Research on genomic imprinting in domesticated animals has lagged due to a primary focus on orthologs of mouse and human imprinted genes. This emphasis has limited the discovery of imprinted genes specific to livestock.

Importance: Treatment to lower high levels of low-density lipoprotein cholesterol (LDL-C) reduces incident coronary artery disease (CAD) risk but modestly increases the risk for incident type 2 diabetes (T2D). The extent to which genetic factors across the cholesterol spectrum are associated with incident T2D is not well understood.

Objective: To investigate the association of genetic predisposition to increased LDL-C levels with incident T2D risk.

Decontamination of DNA sequences from a Streptomyces genome for optimal genome mining.

Braz J Microbiol

January 2025

Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo (USP), São Paulo, SP, 05508-900, Brazil.

Despite meticulous precautions, contamination of genomic DNA samples is not uncommon, which can significantly compromise the analysis of microorganisms' whole-genome sequencing data, thus affecting all subsequent analyses. Thanks to advancements in software and bioinformatics techniques, it is now possible to address this issue and prevent the loss of the entire dataset obtained in a contaminated whole-genome sequencing, where the DNA of another bacterium is present. In this study, it was observed that the sequencing reads from Streptomyces sp.

Reduced function or hypomorphic variants in recombination-activating genes (RAG) 1 or 2 result in a broad clinical phenotype including common variable immunodeficiency (CVID) and even adult-onset disease. Milder RAG variants are less characterized. Here we describe the longitudinal course of a milder combined RAG deficiency in 3 of 7 siblings sharing the same RAG2 mutations over a 50-year study.
