Motivation: Whole-genome shotgun-based next-generation transcriptomics and metagenomics studies often generate 100-1000 GB of sequence data derived from tens of thousands of different genes or microbial species. Assembling these data sets requires a tradeoff between scalability and accuracy: current assembly methods optimized for scalability often sacrifice accuracy, and vice versa. An ideal solution would both scale and produce optimal accuracy for individual genes or genomes.
Results: Here we describe an Apache Spark-based scalable sequence clustering application, SparkReadClust (SpaRC), that partitions reads based on their molecule of origin to enable downstream assembly optimization. SpaRC achieves high clustering performance on transcriptomes and metagenomes from both short- and long-read sequencing technologies. It scales near-linearly with input data size and number of compute nodes. SpaRC runs in both cloud computing and HPC environments without modification while delivering similar performance. Our results demonstrate that SpaRC provides a scalable solution for clustering billions of reads from next-generation sequencing experiments, and that Apache Spark represents a cost-effective solution with rapid development/deployment cycles for similar large-scale sequence data analysis problems.
Availability and implementation: https://bitbucket.org/berkeleylab/jgi-sparc.
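The abstract does not spell out how reads are grouped by molecule of origin, so the following is only a toy, single-machine sketch of one common idea: treat reads that share k-mers as likely coming from the same molecule, then take connected components of the resulting read graph. The function names (`kmers`, `cluster_reads`), the parameters `k` and `min_shared`, and the example reads are all assumptions made for illustration. This is not SpaRC's implementation, it ignores reverse complements and sequencing errors, and it does not use Spark.

```python
from collections import defaultdict


def kmers(read, k):
    """Return the set of all length-k substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def cluster_reads(reads, k=5, min_shared=1):
    """Group reads into clusters of indices (hypothetical helper).

    Two reads are linked when they share at least `min_shared` distinct
    k-mers; clusters are the connected components of that link graph.
    """
    parent = list(range(len(reads)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Inverted index: k-mer -> indices of the reads that contain it.
    index = defaultdict(list)
    for i, read in enumerate(reads):
        for km in kmers(read, k):
            index[km].append(i)

    # Count the distinct k-mers shared by each pair of reads.
    shared = defaultdict(int)
    for ids in index.values():
        for a in range(len(ids)):
            for b in range(a + 1, len(ids)):
                shared[(ids[a], ids[b])] += 1

    # Link pairs with enough shared k-mers.
    for (a, b), count in shared.items():
        if count >= min_shared:
            union(a, b)

    clusters = defaultdict(list)
    for i in range(len(reads)):
        clusters[find(i)].append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    # Made-up reads: the first two overlap, as do the last two.
    example = ["ACGTACGTAA", "GTACGTAACC", "TTTTGGGGCC", "TGGGGCCAAT"]
    print(cluster_reads(example, k=5))
```

The pairwise counting step is quadratic in the number of reads sharing a k-mer, which is exactly the kind of work a distributed framework would need to partition at the data sizes the abstract mentions; the sketch only shows the logic on a handful of reads.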
DOI: http://dx.doi.org/10.1093/bioinformatics/bty733