Background: Reduced metagenomic sequencing (RMS) holds promise as a middle ground between targeted amplicon sequencing and whole metagenome sequencing, but it has not been widely adopted. A major barrier to adoption is the lack of read simulation software built to handle the characteristic features of these novel approaches. RMS produces unique patterns of fragmentation per genome that are sensitive to restriction enzyme choice, and the non-uniform size selection of these fragments may introduce novel challenges to taxonomic assignment as well as relative abundance estimates.
Results: Through the development and application of the simulation software readsynth, we compare simulated metagenomic sequencing libraries with existing RMS data to assess the influence of multiple library preparation and sequencing steps on downstream analytical results. Based on read depth per position, readsynth achieved a Pearson's correlation of 0.79 and a Spearman's correlation of 0.94 with these benchmarks. Application of a novel estimation approach, fixed length taxonomic ratios, improved quantification accuracy of simulated human gut microbial communities when compared to estimates of mean or median coverage.
Conclusions: We investigate the possible strengths and weaknesses of applying the RMS technique to profiling microbial communities via simulations with readsynth. The choice of restriction enzymes and size selection steps in library prep are non-trivial decisions that bias downstream profiling and quantification. The simulations investigated in this study illustrate the possible limits of preparing metagenomic libraries with a reduced representation sequencing approach, but also allow for the development of strategies for generating and handling the sequence data from this promising application.
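To illustrate why enzyme choice and size selection shape which parts of a genome are sequenced, the following is a minimal Python sketch of an in silico digest. It is not readsynth's implementation: the function names (`cut_sites`, `digest`), the toy genome, and the hard size window are assumptions made for illustration, and a hard window is a simplification of the non-uniform size selection described above. Only the EcoRI (G^AATTC) and MseI (T^TAA) recognition sites are taken as known facts.

```python
import re

# Recognition motif and cut offset within the motif for two common enzymes:
# EcoRI cuts G^AATTC, MseI cuts T^TAA.
ENZYMES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}


def cut_sites(genome, enzyme):
    """Return the sorted cut positions of one enzyme on the forward strand."""
    motif, offset = ENZYMES[enzyme]
    # A lookahead pattern finds overlapping motif occurrences as well.
    return [m.start() + offset for m in re.finditer(f"(?={motif})", genome)]


def digest(genome, enzymes, min_len, max_len):
    """Return fragments between adjacent cut sites that pass a hard size window.

    Assumption: terminal pieces (genome end to nearest cut) are discarded,
    and size selection is modeled as a simple inclusive length window.
    """
    cuts = sorted({pos for e in enzymes for pos in cut_sites(genome, e)})
    fragments = [genome[a:b] for a, b in zip(cuts, cuts[1:])]
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Hypothetical toy genome with two EcoRI sites and one MseI site.
genome = ("ACGT" * 5 + "GAATTC" + "ACGTACGTAC" + "TTAA"
          + "CCGGCCGG" + "GAATTC" + "ACGT" * 3)

single = digest(genome, ["EcoRI"], min_len=10, max_len=40)
double = digest(genome, ["EcoRI", "MseI"], min_len=10, max_len=40)
narrow = digest(genome, ["EcoRI", "MseI"], min_len=14, max_len=40)

print([len(f) for f in single])
print([len(f) for f in double])
print([len(f) for f in narrow])
```

On this toy sequence, adding a second enzyme splits the single EcoRI fragment in two, and tightening the size window then drops the shorter piece. The same genome therefore contributes a different number of sequenceable fragments depending on both choices, which is the kind of bias in profiling and quantification that the simulations are meant to expose.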
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11095026 | PMC |
| http://dx.doi.org/10.1186/s12859-024-05809-3 | DOI Listing |