Deep Learning on Big, Sparse, Behavioral Data.

Big Data

Technology, Operations & Statistics Department, NYU Stern School of Business, New York, New York.

Published: December 2019

The outstanding performance of deep learning (DL) for computer vision and natural language processing has fueled increased interest in applying these algorithms more broadly in both research and practice. This study investigates the application of DL techniques to classification of large, sparse behavioral data, which has become ubiquitous in the age of big data collection. We report on an extensive search through DL architecture variants and compare the predictive performance of DL with that of carefully regularized logistic regression (LR), which previously (and repeatedly) has been found to be the most accurate machine learning technique generally for sparse behavioral data. At a high level, we demonstrate that by following recommendations from the literature, researchers and practitioners who are not DL experts can achieve world-class performance using DL. More specifically, we report several findings. As a main result, applying DL on 39 big sparse behavioral classification tasks demonstrates a significant performance improvement compared with LR. A follow-up result suggests that if one were to choose the best shallow technique (rather than just LR), there still would often be an improvement from using DL, but that in this case the magnitude of the improvement might not justify the high cost. Investigating when DL performs better, we find that worse performance is obtained for data sets with low signal-from-noise separability, in line with prior results comparing linear and nonlinear classifiers. Exploring why the deep architectures work well, we show that using the first-layer features learned by DL yields better generalization performance for a linear model than do unsupervised feature-reduction methods (e.g., singular-value decomposition). However, to do well enough to beat well-regularized LR with the original sparse representation, more layers from the deep distributed architecture are needed.
With respect to how deep models come to their decisions, we demonstrate how the neurons on the lowest layer of the deep architecture capture nuances from the raw fine-grained features and allow intuitive interpretation. Looking forward, we propose the use of instance-level counterfactual explanations to gain insight into why deep models classify individual data instances the way they do.
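
The instance-level counterfactual explanations mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a black-box scoring function over the set of active (nonzero) features of one instance and greedily removes features until the predicted class flips; the removed set is the explanation. The names `counterfactual`, `toy_score`, `WEIGHTS`, `BIAS`, and the `page_*` features are hypothetical, and the linear toy model merely stands in for a trained deep model.

```python
from typing import Callable, List, Optional, Set

def counterfactual(
    active: Set[str],
    score_fn: Callable[[Set[str]], float],
    threshold: float = 0.0,
    max_removals: int = 10,
) -> Optional[List[str]]:
    """Greedy search for a set of active features whose removal flips
    a positive prediction (score > threshold) to a negative one.

    Returns the removed features in order, [] if the instance is not
    classified positive to begin with, or None if no flip is found
    within max_removals steps.
    """
    remaining = set(active)
    removed: List[str] = []
    if score_fn(remaining) <= threshold:
        return removed
    while remaining and len(removed) < max_removals:
        # Pick the feature whose removal lowers the score the most;
        # sorting first makes tie-breaking deterministic.
        best = min(sorted(remaining), key=lambda f: score_fn(remaining - {f}))
        remaining.discard(best)
        removed.append(best)
        if score_fn(remaining) <= threshold:
            return removed
    return None

# Hypothetical toy model: a linear score over binary behavioral features.
WEIGHTS = {"page_a": 1.5, "page_b": 0.8, "page_c": -0.4, "page_d": 0.3}
BIAS = -1.0

def toy_score(features: Set[str]) -> float:
    return BIAS + sum(WEIGHTS.get(f, 0.0) for f in features)

# One instance with three active features, classified positive.
explanation = counterfactual({"page_a", "page_b", "page_c"}, toy_score)
print(explanation)
```

Because the search only calls `score_fn`, the same loop could in principle wrap any classifier's scoring function. The greedy strategy is a design choice for brevity and is not guaranteed to find the smallest explanation.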


Source
http://dx.doi.org/10.1089/big.2019.0095

Publication Analysis

Top Keywords (keyword: count)

sparse behavioral: 16
deep learning: 8
big sparse: 8
behavioral data: 8
deep models: 8
deep: 7
performance: 6
sparse: 5
data: 5
learning big: 4

Similar Publications

Priority Clinical Actions for Outpatient Management of Nonhospitalized Traumatic Brain Injury.

J Neurotrauma

January 2025

Zuckerberg San Francisco General Hospital and Trauma Center, University of California, San Francisco, San Francisco, California, USA.

Outpatient care following nonhospitalized traumatic brain injury (TBI) is variable, and often sparse. The National Academies of Sciences, Engineering, and Medicine's 2022 report on highlighted the need to improve the consistency and quality of TBI care in the community. In response, the present study aimed to identify existing evidence-based guidance and specific clinical actions over the days to months following nonhospitalized TBI that should be prioritized for implementation in primary care.


Technology-critical elements (TCEs), essential in emerging technologies, are increasingly finding their way into our environment, raising concerns about their sparsely studied behavior and toxicity. To contribute insights into the toxicological aspects, we employed bioassays to investigate the possible cytotoxic effects in four representative cell lines (AR-EcoScreen GR-KO-M1, DR-EcoScreen, MCF7AREc32, VM7Luc4E2) and the potential to induce oxidative stress via the nuclear factor erythroid 2-related factor 2 (Nrf2) pathway for a number of these elements. Nine TCEs, three rare-earth elements (REEs: Gd, Nd, Yb) and six less-studied TCEs (LSTCEs: Ga, Ge, In, Ta, Te, Tl), were selected for this study, along with three well-studied traditional metal contaminants (TMCs: As, Cd, Pb) for comparison.


We present a Spiking Neural Network (SNN) model that incorporates learnable synaptic delays through two approaches: per-synapse delay learning via Dilated Convolutions with Learnable Spacings (DCLS) and a dynamic pruning strategy that also serves as a form of delay learning. In the latter approach, the network dynamically selects and prunes connections, optimizing the delays in sparse connectivity settings. We evaluate both approaches on the Raw Heidelberg Digits keyword spotting benchmark using Backpropagation Through Time with surrogate gradients.


Deep learning can help uncover patterns in resting-state functional Magnetic Resonance Imaging (rs-fMRI) associated with psychiatric disorders and personal traits. Yet the problem of interpreting deep learning findings is rarely more evident than in fMRI analyses, as the data is sensitive to scanning effects and inherently difficult to visualize. We propose a simple approach to mitigate these challenges grounded on sparsification and self-supervision.


The scientific establishment of the Ecological Security Pattern (ESP) is crucial for fostering the synergistic development of ecological and recreational functions, thereby enhancing urban ecological protection, recreational development, and sustainable growth. This study aimed to propose a novel method of constructing ESP considering both ecological and recreational functions, and to reconstruct ESP by weighing the relationship between ecological protection and recreational development. Utilizing Fuzhou City as a case study, a comprehensive application of methodologies including Morphological Spatial Pattern Analysis (MSPA), landscape connectivity analysis, ArcGIS spatial analysis, social network analysis (SNA), and circuit theory is employed to develop both the ESP and the Recreational Spatial Pattern (RSP).

