Reduced representation sequencing (RRS) has proven to be a cost-effective solution for sequencing subsets of the genome in non-model species for large-scale studies. However, the targeted nature of RRS approaches commonly introduces large amounts of missing data, leading to reduced statistical power and biased estimates in downstream analyses. Genotype imputation, the statistical inference of missing sites across the genome, is a powerful approach for overcoming the caveats associated with missing sites. Typically, genotype imputation requires a reference panel of haplotypes; however, such a panel is not always available for non-model species. In this issue of Molecular Ecology Resources, Mora-Márquez et al. (2024) develop gtImputation, an unsupervised machine learning imputation tool with an interactive GUI, which leverages information from the underlying data structure itself, without the need for a reference panel. They show that their method performs as well as, and can even surpass, existing haplotype-clustering and unsupervised machine learning algorithms, particularly for sites with low minor allele frequency (MAF) and for data sets with strong underlying population structure. This innovative framework adds to the ongoing efforts to expand the applicability of imputation to non-model species, making it possible to apply varied types of analyses that require dense sets of markers while keeping sequencing costs low.
DOI: http://dx.doi.org/10.1111/1755-0998.14066
Mol Ecol Resour
January 2025
Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Copenhagen, Denmark.
Nat Commun
January 2025
Genetics, Bioinformatics, and Computational Biology, Virginia Tech, Blacksburg, VA, USA.
Single-cell RNA sequencing (scRNA-seq) is widely used in plant biology and is a powerful tool for studying cell identity and differentiation. However, the scarcity of known cell-type marker genes and the divergence of marker expression patterns limit the accuracy of cell-type identification and our capacity to investigate cell-type conservation in many species. To tackle this challenge, we devise a novel computational strategy called Orthologous Marker Gene Groups (OMGs), which can identify cell types in both model and non-model plant species and allows for rapid comparison of cell types across many published single-cell maps.
Sci Rep
January 2025
Key Laboratory of Grassland Ecosystem (Ministry of Education), Pratacultural College, Gansu Agricultural University, Lanzhou, 730070, China.
Microsatellite markers are cost-effective, rapid, and efficient, and they offer clear advantages in large-sample kinship analysis and population structure studies. However, microsatellite loci are seriously underdeveloped in non-model organisms. The plateau zokor (Eospalax baileyi) is a key subterranean species of the Tibetan Plateau, and its effective management has long been challenging.
Int J Biol Macromol
December 2024
Department of Food Science and Engineering, Moutai Institute, Renhuai 564507, China.
Plant microRNAs (miRNAs) and phasiRNAs are small non-coding RNAs that regulate gene expression at the post-transcriptional level. However, identifying miRNAs, phasiRNAs and their target genes from large volumes of raw sequencing data requires multiple software tools and command-line operations, which is time-consuming and labor-intensive for non-model plants. Therefore, we present CsMPDB (miRNAs and phasiRNAs database of Camellia sinensis), an interactive web application with multiple analysis modules developed to visualize and explore miRNAs and phasiRNAs in tea plants based on 259 sRNA-seq samples and 24 degradome-seq samples in NCBI.
BMC Res Notes
December 2024
Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA.
Objective: Extracting DNA is essential in wildlife genetic studies, and numerous methods are available. However, the process is costly and time-consuming for non-model organisms, including most wildlife species. Therefore, we optimized a cost-efficient protocol to extract DNA from the muscle tissue of White-tailed Deer using the DNAdvance kit (Beckman Coulter), a magnetic-bead-based approach.