Publications by Oliver Rubel | LitMetric

Publications by authors named "Oliver Rubel"

Page 1 of 2

Distributed Augmentation, Hypersweeps, and Branch Decomposition of Contour Trees for Scientific Exploration.

Mingzhe Li Hamish Carr Oliver Rubel Bei Wang Gunther H Weber

IEEE Trans Vis Comput Graph

September 2024

Contour trees describe the topology of level sets in scalar fields and are widely used in topological data analysis and visualization. A main challenge of utilizing contour trees for large-scale scientific data is their computation at scale using highperformance computing. To address this challenge, recent work has introduced distributed hierarchical contour trees for distributed computation and storage of contour trees.

View Article and Find Full Text PDF

Spyglass: a framework for reproducible and shareable neuroscience research.

Kyu Hyun Lee Eric L Denovellis Ryan Ly Jeremy Magland Jeff Soules

bioRxiv

April 2024

Scientific progress depends on reliable and reproducible results. Progress can also be accelerated when data are shared and re-analyzed to address new questions. Current approaches to storing and analyzing neural data typically involve bespoke formats and software that make replication, as well as the subsequent reuse of data, difficult if not impossible.

View Article and Find Full Text PDF

Structured behavioral data format: An NWB extension standard for task-based behavioral neuroscience experiments.

Ryan Ly Matthew Avaylon Michael Wulf Adam Kepecs Oliver Rübel

bioRxiv

January 2024

Understanding brain function necessitates linking neural activity with corresponding behavior. Structured behavioral experiments are crucial for probing the neural computations and dynamics underlying behavior; however, adequately representing their complex data is a significant challenge. Currently, a comprehensive data standard that fully encapsulates task-based experiments, integrating neural activity with the richness of behavioral context, is lacking.

View Article and Find Full Text PDF

FAIR for AI: An interdisciplinary and international community building perspective.

E A Huerta Ben Blaiszik L Catherine Brinson Kristofer E Bouchard Daniel Diaz

Sci Data

July 2023

A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles were proposed in 2016 as prerequisites for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to include the software, tools, algorithms, and workflows that produce data. FAIR principles are now being adapted in the context of AI models and datasets.

View Article and Find Full Text PDF

The Neurodata Without Borders ecosystem for neurophysiological data science.

Oliver Rübel Andrew Tritt Ryan Ly Benjamin K Dichter Satrajit Ghosh

Elife

October 2022

The neurophysiology of cells and tissues are monitored electrophysiologically and optically in diverse experiments and species, ranging from flies to humans. Understanding the brain requires integration of data across this diversity, and thus these data must be findable, accessible, interoperable, and reusable (FAIR). This requires a standard language for data and metadata that can coevolve with neuroscience.

View Article and Find Full Text PDF

Rotational dynamics and transition mechanisms of surface-adsorbed proteins.

Shuai Zhang Robbie Sadre Benjamin A Legg Harley Pyles Talita Perciano

Proc Natl Acad Sci U S A

April 2022

Assembly of biomolecules at solid–water interfaces requires molecules to traverse complex orientation-dependent energy landscapes through processes that are poorly understood, largely due to the dearth of in situ single-molecule measurements and statistical analyses of the rotational dynamics that define directional selection. Emerging capabilities in high-speed atomic force microscopy and machine learning have allowed us to directly determine the orientational energy landscape and observe and quantify the rotational dynamics for protein nanorods on the surface of muscovite mica under a variety of conditions. Comparisons with kinetic Monte Carlo simulations show that the transition rates between adjacent orientation-specific energetic minima can largely be understood through traditional models of in-plane Brownian rotation across a biased energy landscape, with resulting transition rates that are exponential in the energy barriers between states.

View Article and Find Full Text PDF

Optimization and Augmentation for Data Parallel Contour Trees.

Hamish A Carr Oliver Rubel Gunther H Weber James P Ahrens

IEEE Trans Vis Comput Graph

October 2022

Contour trees are used for topological data analysis in scientific visualization. While originally computed with serial algorithms, recent work has introduced a vector-parallel algorithm. However, this algorithm is relatively slow for fully augmented contour trees which are needed for many practical data analysis tasks.

View Article and Find Full Text PDF

Predicting and Comparing the Performance of Array Management Libraries.

Donghe Kang Oliver Rübel Suren Byna Spyros Blanas

Proc IPDPS (Conf)

May 2020

Many applications are increasingly becoming I/O-bound. To improve scalability, analytical models of parallel I/O performance are often consulted to determine possible I/O optimizations. However, I/O performance modeling has predominantly focused on applications that directly issue I/O requests to a parallel file system or a local storage device.

View Article and Find Full Text PDF

HDMF: Hierarchical Data Modeling Framework for Modern Science Data Standards.

Andrew J Tritt Oliver Rübel Benjamin Dichter Ryan Ly Donghe Kang

Proc IEEE Int Conf Big Data

December 2019

A ubiquitous problem in aggregating data across different experimental and observational data sources is a lack of software infrastructure that enables flexible and extensible standardization of data and metadata. To address this challenge, we developed HDMF, a hierarchical data modeling framework for modern science data standards. With HDMF, we separate the process of data standardization into three main components: (1) data modeling and specification, (2) data I/O and storage, and (3) data interaction and data APIs.

View Article and Find Full Text PDF

Scalable Contour Tree Computation by Data Parallel Peak Pruning.

Hamish A Carr Gunther H Weber Christopher M Sewell Oliver Rubel Patricia Fasel

IEEE Trans Vis Comput Graph

April 2021

As data sets grow to exascale, automated data analysis and visualization are increasingly important, to intermediate human understanding and to reduce demands on disk storage via in situ analysis. Trends in architecture of high performance computing systems necessitate analysis algorithms to make effective use of combinations of massively multicore and distributed systems. One of the principal analytic tools is the contour tree, which analyses relationships between contours to identify features of more than local importance.

View Article and Find Full Text PDF

MAGI: A Method for Metabolite Annotation and Gene Integration.

Onur Erbilgin Oliver Rübel Katherine B Louie Matthew Trinh Markus de Raad

ACS Chem Biol

April 2019

Metabolomics is a widely used technology for obtaining direct measures of metabolic activities from diverse biological systems. However, ambiguous metabolite identifications are a common challenge and biochemical interpretation is often limited by incomplete and inaccurate genome-based predictions of enzyme activities (that is, gene annotations). Metabolite Annotation and Gene Integration (MAGI) generates a metabolite-gene association score using a biochemical reaction network.

View Article and Find Full Text PDF

BASTet: Shareable and Reproducible Analysis and Visualization of Mass Spectrometry Imaging Data via OpenMSI.

Oliver Rubel Benjamin P Bowen

IEEE Trans Vis Comput Graph

January 2018

Mass spectrometry imaging (MSI) is a transformative imaging method that supports the untargeted, quantitative measurement of the chemical composition and spatial heterogeneity of complex samples with broad applications in life sciences, bioenergy, and health. While MSI data can be routinely collected, its broad application is currently limited by the lack of easily accessible analysis methods that can process data of the size, volume, diversity, and complexity generated by MSI experiments. The development and application of cutting-edge analytical methods is a core driver in MSI research for new scientific discoveries, medical diagnostics, and commercial-innovation.

View Article and Find Full Text PDF

OpenMSI Arrayed Analysis Toolkit: Analyzing Spatially Defined Samples Using Mass Spectrometry Imaging.

Markus de Raad Tristan de Rond Oliver Rübel Jay D Keasling Trent R Northen

Anal Chem

June 2017

Mass spectrometry imaging (MSI) has primarily been applied in localizing biomolecules within biological matrices. Although well-suited, the application of MSI for comparing thousands of spatially defined spotted samples has been limited. One reason for this is a lack of suitable and accessible data processing tools for the analysis of large arrayed MSI sample sets.

View Article and Find Full Text PDF

WarpIV: In Situ Visualization and Analysis of Ion Accelerator Simulations.

Oliver Rubel Burlen Loring Jean-Luc Vay David P Grote Remi Lehe

IEEE Comput Graph Appl

May 2017

The generation of short pulses of ion beams through the interaction of an intense laser with a plasma sheath offers the possibility of compact and cheaper ion sources for many applications--from fast ignition and radiography of dense targets to hadron therapy and injection into conventional accelerators. To enable the efficient analysis of large-scale, high-fidelity particle accelerator simulations using the Warp simulation suite, the authors introduce the Warp In situ Visualization Toolkit (WarpIV). WarpIV integrates state-of-the-art in situ visualization and analysis using VisIt with Warp, supports management and control of complex in situ visualization and analysis workflows, and implements integrated analytics to facilitate query- and feature-based data analytics and efficient large-scale data analysis.

View Article and Find Full Text PDF

Methods for Specifying Scientific Data Standards and Modeling Relationships with Applications to Neuroscience.

Oliver Rübel Max Dougherty Prabhat Peter Denes David Conant

Front Neuroinform

November 2016

Neuroscience continues to experience a tremendous growth in data; in terms of the volume and variety of data, the velocity at which data is acquired, and in turn the veracity of data. These challenges are a serious impediment to sharing of data, analyses, and tools within and across labs. Here, we introduce BRAINformat, a novel data standardization framework for the design and management of scientific data formats.

View Article and Find Full Text PDF

Dirigent Protein-Mediated Lignan and Cyanogenic Glucoside Formation in Flax Seed: Integrated Omics and MALDI Mass Spectrometry Imaging.

Doralyn S Dalisay Kye Won Kim Choonseok Lee Hong Yang Oliver Rübel

J Nat Prod

June 2015

An integrated omics approach using genomics, transcriptomics, metabolomics (MALDI mass spectrometry imaging, MSI), and bioinformatics was employed to study spatiotemporal formation and deposition of health-protecting polymeric lignans and plant defense cyanogenic glucosides. Intact flax (Linum usitatissimum) capsules and seed tissues at different development stages were analyzed. Transcriptome analyses indicated distinct expression patterns of dirigent protein (DP) gene family members encoding (-)- and (+)-pinoresinol-forming DPs and their associated downstream metabolic processes, respectively, with the former expressed at early seed coat development stages.

View Article and Find Full Text PDF

Identifying important ions and positions in mass spectrometry imaging data using CUR matrix decompositions.

Jiyan Yang Oliver Rübel Prabhat Michael W Mahoney Benjamin P Bowen

Anal Chem

November 2015

Mass spectrometry imaging enables label-free, high-resolution spatial mapping of the chemical composition of complex, biological samples. Typical experiments require selecting ions and/or positions from the images: ions for fragmentation studies to identify keystone compounds and positions for follow up validation measurements using microdissection or other orthogonal techniques. Unfortunately, with modern imaging machines, these must be selected from an overwhelming amount of raw data.

View Article and Find Full Text PDF

A collaborative visual analytics suite for protein folding research.

William Harvey In-Hee Park Oliver Rübel Valerio Pascucci Peer-Timo Bremer

J Mol Graph Model

September 2014

Molecular dynamics (MD) simulation is a crucial tool for understanding principles behind important biochemical processes such as protein folding and molecular interaction. With the rapidly increasing power of modern computers, large-scale MD simulation experiments can be performed regularly, generating huge amounts of MD data. An important question is how to analyze and interpret such massive and complex data.

View Article and Find Full Text PDF

Feature-based analysis of plasma-based particle acceleration data.

Oliver Rübel Cameron G R Geddes Min Chen Estelle Cormier-Michel E Wes Bethel

IEEE Trans Vis Comput Graph

February 2014

Plasma-based particle accelerators can produce and sustain thousands of times stronger acceleration fields than conventional particle accelerators, providing a potential solution to the problem of the growing size and cost of conventional particle accelerators. To facilitate scientific knowledge discovery from the ever growing collections of accelerator simulation data generated by accelerator physicists to investigate next-generation plasma-based particle accelerator designs, we describe a novel approach for automatic detection and classification of particle beams and beam substructures due to temporal differences in the acceleration process, here called acceleration features. The automatic feature detection in combination with a novel visualization tool for fast, intuitive, query-based exploration of acceleration features enables an effective top-down data exploration process, starting from a high-level, feature-based view down to the level of individual particles.

View Article and Find Full Text PDF

OpenMSI: a high-performance web-based platform for mass spectrometry imaging.

Oliver Rübel Annette Greiner Shreyas Cholia Katherine Louie E Wes Bethel

Anal Chem

November 2013

Mass spectrometry imaging (MSI) enables researchers to directly probe endogenous molecules directly within the architecture of the biological matrix. Unfortunately, efficient access, management, and analysis of the data generated by MSI approaches remain major challenges to this rapidly developing field. Despite the availability of numerous dedicated file formats and software packages, it is a widely held viewpoint that the biggest challenge is simply opening, sharing, and analyzing a file without loss of information.

View Article and Find Full Text PDF

Morse-Smale Regression.

Samuel Gerber Oliver Rübel Peer-Timo Bremer Valerio Pascucci Ross T Whitaker

J Comput Graph Stat

January 2013

This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduce a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation.

View Article and Find Full Text PDF

High-Throughput Characterization of Porous Materials Using Graphics Processing Units.

Jihan Kim Richard L Martin Oliver Rübel Maciej Haranczyk Berend Smit

J Chem Theory Comput

May 2012

We have developed a high-throughput graphics processing unit (GPU) code that can characterize a large database of crystalline porous materials. In our algorithm, the GPU is utilized to accelerate energy grid calculations, where the grid values represent interactions (i.e.

View Article and Find Full Text PDF

Coupling visualization and data analysis for knowledge discovery from multi-dimensional scientific data.

Oliver Rübel Sean Ahern E Wes Bethel Mark D Biggin Hank Childs

Procedia Comput Sci

May 2010

Knowledge discovery from large and complex scientific data is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the growing number of data dimensions and data objects presents tremendous challenges for effective data analysis and data exploration methods and tools. The combination and close integration of methods from scientific visualization, information visualization, automated data analysis, and other enabling technologies -such as efficient data management- supports knowledge discovery from multi-dimensional scientific data.

View Article and Find Full Text PDF

Integrating data clustering and visualization for the analysis of 3D gene expression data.

Oliver Rübel Gunther H Weber Min-Yu Huang E Wes Bethel Mark D Biggin

IEEE/ACM Trans Comput Biol Bioinform

May 2010

The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex data sets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible.

View Article and Find Full Text PDF

Visual exploration of three-dimensional gene expression using physical views and linked abstract views.

Gunther H Weber Oliver Rübel Min-Yu Huang Angela H DePace Charless C Fowlkes

IEEE/ACM Trans Comput Biol Bioinform

August 2009

During animal development, complex patterns of gene expression provide positional information within the embryo. To better understand the underlying gene regulatory networks, the Berkeley Drosophila Transcription Network Project (BDTNP) has developed methods that support quantitative computational analysis of three-dimensional (3D) gene expression in early Drosophila embryos at cellular resolution. We introduce PointCloudXplore (PCX), an interactive visualization tool that supports visual exploration of relationships between different genes' expression using a combination of established visualization techniques.

View Article and Find Full Text PDF