Motivation: Accurate detection, genotyping and downstream analysis of genomic variants from high-throughput sequencing data are fundamental features in modern production pipelines for genetic-based diagnosis in medicine or genomic selection in plant and animal breeding. Our research group maintains the Next-Generation Sequencing Experience Platform (NGSEP) as a precise, efficient and easy-to-use software solution for these features.
Results: Understanding that incorrect alignments around short tandem repeats are an important source of genotyping errors, we implemented in NGSEP new algorithms for realignment and haplotype clustering of reads spanning indels and short tandem repeats. We performed extensive benchmark experiments comparing NGSEP to state-of-the-art software using real data from three sequencing protocols and four species with different distributions of repetitive elements. NGSEP consistently shows comparable accuracy and better efficiency than the existing solutions. We expect that this work will contribute to the continuous improvement of quality in variant calling needed for modern applications in medicine and agriculture.
Availability and Implementation: NGSEP is available as open source software at http://ngsep.sf.net.
Supplementary Information: Supplementary data are available at Bioinformatics online.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6853766 | PMC |
http://dx.doi.org/10.1093/bioinformatics/btz275 | DOI Listing |
PLoS One
January 2025
School of Government, Adolfo Ibanez University, Santiago, Chile.
This study demonstrates the use of GPT-4 and variants, advanced language models readily accessible to many social scientists, in extracting political networks from text. This approach showcases the novel integration of GPT-4's capabilities in entity recognition, relation extraction, entity linking, and sentiment analysis into a single cohesive process. Based on a corpus of 1009 Chilean political news articles, the study validates the graph extraction method using 'legislative agreement', i.
Circ Heart Fail
January 2025
Department of Cardiology, Angiology and Intensive Care Medicine, Deutsches Herzzentrum der Charité, Berlin, Germany (M.L.M., U.L., B.H., D.M., A.B., I.M., S.S.).
Background: Despite previous histopathologic evidence for its presence, the role of myocardial inflammation in the development and progression of cardiac transthyretin amyloidosis (ATTR-CA) remains insufficiently understood. Thus, this study sought to characterize the prevalence and potential prognostic implications of myocardial inflammation in ATTR-CA.
Methods: A retrospective observational study including patients with ATTR-CA diagnosed by endomyocardial biopsy was conducted.
Am J Hum Genet
January 2025
UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA. Electronic address:
More than 50% of families with suspected rare monogenic diseases remain unsolved after whole-genome analysis by short-read sequencing (SRS). Long-read sequencing (LRS) could help bridge this diagnostic gap by capturing variants inaccessible to SRS, facilitating long-range mapping and phasing and providing haplotype-resolved methylation profiling. To evaluate LRS's additional diagnostic yield, we sequenced a rare-disease cohort of 98 samples from 41 families, using nanopore sequencing, achieving per sample ∼36× average coverage and 32-kb read N50 from a single flow cell.
Anim Genet
February 2025
College of Animal Science and Technology, Southwest University, Chongqing, China.
Goats typically have double coats, with the outermost coarse hairs providing protection against mechanical and radiation damage. While much attention has been paid to cashmere due to its status as a high-end textile material, there is limited information available on coarse hair. This study aimed to identify genomic variants, such as single nucleotide polymorphisms (SNPs) and insertion/deletions (indels), associated with coarse hair diameter using a genome-wide association study (GWAS).
Methods
January 2025
Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada. Electronic address:
This paper proposes a detailed process for structural variant (SV) calling that permits a data-driven assessment of multiple SV callers using both genome assemblies and long reads. The process is implemented as a software pipeline named Structural Variant - Jaccard Index Measure, or SV-JIM, using the Snakemake [20] workflow management system. Like most state-of-the-art SV callers, SV-JIM detects variations between pairs of genomes, but it streamlines the numerous SV calling stages into a single process for user convenience and evaluates the multiple SV sets produced using the Jaccard index measure to identify those with the highest consistency among the included SV callers.
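As a rough illustration of the Jaccard index measure named in this abstract, the sketch below scores the agreement between pairs of SV call sets as |A ∩ B| / |A ∪ B|. It is not SV-JIM's implementation: the caller names, the calls, and the rule that two calls match only when chromosome, start, end, and type are identical are all assumptions made for the example. Real comparisons usually tolerate some breakpoint offset.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections; 1.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def pairwise_jaccard(call_sets):
    """Map each unordered pair of caller names to the Jaccard index of their calls."""
    return {
        (x, y): jaccard(call_sets[x], call_sets[y])
        for x, y in combinations(sorted(call_sets), 2)
    }


# Hypothetical calls, keyed as (chromosome, start, end, SV type).
# Exact-key matching is a simplifying assumption for this sketch.
calls = {
    "caller_a": {("chr1", 100, 500, "DEL"), ("chr1", 900, 1200, "INS"), ("chr2", 50, 80, "DUP")},
    "caller_b": {("chr1", 100, 500, "DEL"), ("chr2", 50, 80, "DUP")},
    "caller_c": {("chr1", 100, 500, "DEL"), ("chr3", 10, 40, "INV")},
}

scores = pairwise_jaccard(calls)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

Pairs with a higher score share a larger fraction of their combined calls, which is the sense in which the abstract speaks of consistency among callers.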