Traditional methods for identifying "hit" molecules from a large collection of potential drug-like candidates rely on biophysical theory to compute approximations to the Gibbs free energy of the binding interaction between the drug and its protein target. These approaches have a significant limitation in that they require exceptional computing capabilities for even relatively small collections of molecules. Increasingly large and complex state-of-the-art deep learning approaches have gained popularity with the promise to improve the productivity of drug design, notorious for its numerous failures.
Motivation: Genomic distance estimation is a critical workload, since exact computation of whole-genome similarity metrics such as Average Nucleotide Identity (ANI) incurs prohibitive runtime overhead. Genome sketching is a fast and memory-efficient way to estimate ANI similarity by distilling representative k-mers from the original sequences. In this work, we present HyperGen, which improves accuracy, runtime performance, and memory efficiency for large-scale ANI estimation.
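To make the idea of sketching with representative k-mers concrete, here is a minimal Python sketch of a generic bottom-s MinHash estimator. It is not HyperGen's method, which the abstract does not detail; the function names, the k-mer length, the sketch size, and the choice of hash are all illustrative assumptions. It estimates the Jaccard similarity of two k-mer sets, the quantity that sketching tools typically convert into an ANI estimate.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer):
    """Map a k-mer to a 64-bit integer (hash choice is an assumption)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=5, size=64):
    """Bottom-s sketch: keep the `size` smallest k-mer hashes."""
    return sorted(hash64(km) for km in kmers(seq, k))[:size]


def jaccard_estimate(sketch_a, sketch_b, size=64):
    """Estimate Jaccard similarity from two bottom-s sketches."""
    set_a, set_b = set(sketch_a), set(sketch_b)
    # The smallest hashes of the union approximate a sketch of the union.
    union = sorted(set_a | set_b)[:size]
    if not union:
        return 0.0
    shared = sum(1 for h in union if h in set_a and h in set_b)
    return shared / len(union)


if __name__ == "__main__":
    a = sketch("ACGTACGTTGCAACGTTAGC")
    b = sketch("ACGTACGTTGCAACGTTAGG")
    print(jaccard_estimate(a, b))
```

Only the fixed-size sketches need to be stored and compared, which is where the runtime and memory savings over exact whole-genome comparison come from.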
Traditional systems for indoor pressure sensing and human activity recognition (HAR) rely on costly, high-resolution mats and computationally intensive neural network-based (NN-based) models that are prone to noise. In contrast, we design a cost-effective and noise-resilient pressure mat system for HAR, leveraging Velostat for intelligent pressure sensing and a novel hyperdimensional computing (HDC) classifier that is lightweight and highly noise resilient. To measure the performance of our system, we collected two datasets, capturing the static and continuous nature of human movements.
Motivation: Driven by technological advances, the throughput and cost of mass spectrometry (MS) proteomics experiments have improved by orders of magnitude in recent decades. Spectral library searching is a common approach to annotating experimental mass spectra by matching them against large libraries of reference spectra corresponding to known peptides. An important disadvantage, however, is that only peptides included in the spectral library can be found, whereas novel peptides, such as those with unexpected post-translational modifications (PTMs), will remain unknown.
As current shotgun proteomics experiments can produce gigabytes of mass spectrometry data per hour, processing these massive data volumes has become progressively more challenging. Spectral clustering is an effective approach to speed up downstream data processing by merging highly similar spectra to minimize data redundancy. However, because state-of-the-art spectral clustering tools fail to achieve optimal runtimes, this simply moves the processing bottleneck.
Background: Schools are high-risk settings for SARS-CoV-2 transmission, but necessary for children's educational and social-emotional wellbeing. Previous research suggests that wastewater monitoring can detect SARS-CoV-2 infections in controlled residential settings with high levels of accuracy. However, its effective accuracy, cost, and feasibility in non-residential community settings are unknown.
IEEE J Biomed Health Inform
January 2023
Recent years have seen growing interest in leveraging deep learning models for monitoring epilepsy patients based on electroencephalographic (EEG) signals. However, these approaches often exhibit poor generalization when applied outside of the setting in which training data was collected. Furthermore, manual labeling of EEG signals is a time-consuming process requiring expert analysis, making fine-tuning patient-specific models to new settings a costly proposition.
Atomistic Molecular Dynamics (MD) simulations give researchers the ability to model biomolecular structures such as proteins, and their interactions with drug-like small molecules, at greater spatiotemporal resolution than is otherwise possible using experimental methods. MD simulations are notoriously expensive computational endeavors that have traditionally required massive investment in specialized hardware to access biologically relevant spatiotemporal scales. Our goal is to summarize the fundamental algorithms employed in the literature and then highlight the challenges that have affected accelerator implementations in practice.
Brain-inspired Hyper-dimensional (HD) computing is a novel and efficient computing paradigm. However, highly parallel architectures such as Processing-in-Memory (PIM) are bottlenecked by the reduction operations it requires, such as accumulation. To reduce this bottleneck of HD computing in PIM, we present Stochastic-HD, which combines the simplicity of operations in Stochastic Computing (SC) with the complex task-solving capabilities of the latest HD computing algorithms.
Increasing data volumes on high-throughput sequencing instruments such as the NovaSeq 6000 lead to long computational bottlenecks for common metagenomics data preprocessing tasks such as adaptor and primer trimming and host removal. Here, we test whether faster, recently developed computational tools (Fastp and Minimap2) can replace widely used choices (Atropos and Bowtie2), obtaining dramatic accelerations with additional sensitivity and minimal loss of specificity for these tasks. Furthermore, the taxonomic tables resulting from downstream processing provide biologically comparable results.
Annu Int Conf IEEE Eng Med Biol Soc
July 2020
Recent years have seen growing interest in the development of non-invasive devices that can detect seizures and be worn in everyday life. Such devices must be lightweight and unobtrusive, which severely limits their on-board computing power and battery life. In this paper, we propose a novel technique based on hyperdimensional (HD) computing to detect epileptic seizures from 2-channel surface EEG recordings.