Background: In genotyping-by-sequencing (GBS) and restriction site-associated DNA sequencing (RAD-seq), read depth is important for assessing the quality of genotype calls and estimating allele dosage in polyploids. However, existing pipelines for GBS and RAD-seq do not provide read counts in formats that are both accurate and easy to access. Additionally, although existing pipelines allow previously mined SNPs to be genotyped on new samples, they do not allow the user to manually specify a subset of loci to examine. Pipelines that do not use a reference genome assign arbitrary names to SNPs, making meta-analysis across projects difficult.
Results: We created the software TagDigger, which includes three programs for analyzing GBS and RAD-seq data. The first program, tagdigger_interactive.py, rapidly extracts read counts and genotypes from FASTQ files using user-supplied sets of barcodes and tags. Input and output are in CSV format so that they can be opened in spreadsheet software. Tag sequences can also be imported from the Stacks, TASSEL-GBSv2, TASSEL-UNEAK, or pyRAD pipelines, and a separate file listing the names of markers to retain can be imported. The second program, tag_manager.py, consolidates marker names and sequences across multiple projects. The third program, barcode_splitter.py, assists with preparing FASTQ data for deposit in a public archive by splitting FASTQ files by barcode and generating MD5 checksums for the resulting files.
Conclusions: TagDigger is open-source and freely available software written in Python 3. It uses a scalable, rapid search algorithm that can process over 100 million FASTQ reads per hour. TagDigger will run on a laptop with any operating system, does not consume hard drive space with intermediate files, and does not require programming skill to use.
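As an illustration of the kind of task described above, the following minimal Python 3 sketch counts FASTQ reads for each combination of barcode (sample) and tag (marker allele). It is not TagDigger's code or its search algorithm: the function name `count_tags`, the example barcodes, tag sequences, and marker names are all hypothetical, and the simple prefix scan shown here is an assumption chosen for readability rather than speed.

```python
import io
from collections import defaultdict


def count_tags(fastq_handle, barcodes, tags):
    """Count reads per (sample, tag name).

    barcodes: dict mapping barcode sequence -> sample name (hypothetical data)
    tags:     dict mapping tag sequence -> tag name (hypothetical data)
    A read is counted when it begins with a known barcode that is
    immediately followed by a known tag sequence.
    """
    counts = defaultdict(int)
    # Try longer barcodes first so that a short barcode cannot mask a long one.
    barcode_lengths = sorted({len(b) for b in barcodes}, reverse=True)
    for i, line in enumerate(fastq_handle):
        if i % 4 != 1:  # the sequence is the second line of each 4-line record
            continue
        read = line.strip().upper()
        for blen in barcode_lengths:
            sample = barcodes.get(read[:blen])
            if sample is None:
                continue
            rest = read[blen:]
            # Naive linear scan over tags; a real tool would use a faster index.
            for tag_seq, tag_name in tags.items():
                if rest.startswith(tag_seq):
                    counts[(sample, tag_name)] += 1
                    break
            break  # stop after the first (longest) matching barcode
    return dict(counts)


# Hypothetical barcodes and tags, for illustration only.
barcodes = {"ACGT": "sample1", "TTGCA": "sample2"}
tags = {"TGCAGAAAC": "locus1_0", "TGCAGAAAT": "locus1_1"}

sequences = [
    "ACGTTGCAGAAACGG",   # sample1, locus1_0
    "ACGTTGCAGAAATGG",   # sample1, locus1_1
    "TTGCATGCAGAAACTT",  # sample2, locus1_0
    "ACGTTGCAGAAACAA",   # sample1, locus1_0
    "GGGGTGCAGAAAC",     # no known barcode, ignored
]
# Build a small in-memory FASTQ file with placeholder quality strings.
fastq_text = "".join(
    "@read{}\n{}\n+\n{}\n".format(n, seq, "I" * len(seq))
    for n, seq in enumerate(sequences, start=1)
)

counts = count_tags(io.StringIO(fastq_text), barcodes, tags)
for (sample, tag_name), n in sorted(counts.items()):
    print(sample, tag_name, n)
```

In a sketch like this, the per-sample, per-tag counts could then be written to a CSV file with Python's `csv` module so that they open in spreadsheet software.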
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4940913 | PMC |
| http://dx.doi.org/10.1186/s13029-016-0057-7 | DOI Listing |
Sci Data
January 2025
Department of Molecular Science and Technology, Ajou University, Suwon, 16499, Republic of Korea.
Chinese hamster ovary (CHO) cells play a pivotal role in the production of recombinant therapeutics. In the present study, we conducted a genome-scale pooled CRISPR knockout (KO) screening using a virus-free, recombinase-mediated cassette exchange-based platform in CHO-K1 host and CHO-K1 derived recombinant cells. Genome-wide guide RNA (gRNA) amplicon sequencing data were generated from cell libraries, as well as short- and long-term KO libraries, and validated through phenotypic assessment and gRNA read count distribution.
IJID Reg
March 2025
SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Human Genetics, Stellenbosch University, Cape Town, South Africa.
Objectives: Nontuberculous mycobacteria (NTM) are increasingly recognized opportunistic pathogens found ubiquitously in the environment. The presence of multiple NTM species at the site of disease complicates diagnosis and treatment.
Case and Management: A 40-year-old patient who tested positive for HIV, with an absolute cluster of differentiation 4+ (CD4+) T-cell count of 3 cells/µl and cryptococcaemia, presented with hemoptysis, productive cough, and weight loss.
Microrna
January 2025
Department of Pathology, All India Institute of Medical Sciences, New Delhi, India.
Introduction: Micro ribonucleic acids (miRNAs) are small non-coding RNAs that modulate the expression of various genes. They have an important role in cancer pathogenesis. Differential expression of multiple miRNAs has been used as a potential diagnostic and prognostic marker.
Can Med Educ J
December 2024
Department of Ophthalmology, Queen's University, Ontario, Canada.
Background: The purpose of this study was to investigate the effect of word choice on the quality of narrative feedback in ophthalmology resident trainee assessments following the introduction of competency-based medical education at Queen's University.
Methods: Assessment data from July 2017-December 2020 were retrieved from Elentra (Integrated Teaching and Learning Platform) and anonymized. Written feedback was assigned a Quality of Assessment for Learning (QuAL) score out of five based on this previously validated rubric.
Mol Biol Evol
January 2025
Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany.
Plant cells have two major organelles with their own genomes: chloroplasts and mitochondria. While chloroplast genomes tend to be structurally conserved, the mitochondrial genomes of plants, which are much larger than those of animals, are characterized by complex structural variation. We introduce TIPPo, a user-friendly, reference-free assembly tool that uses PacBio high-fidelity long-read data and that does not rely on genomes from related species or nuclear genome information for the assembly of organellar genomes.