Sound Can Help Us See More Clearly.

Sensors (Basel)

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China.

Published: January 2022

In video action classification, existing network frameworks often use only video frames as input. When the object involved in the action does not appear prominently in the frame, such a network cannot classify the action accurately. We introduce a new neural network structure that uses sound to assist with these cases. The raw sound wave is converted into a sound texture, which serves as the input to the network. To exploit both modalities present in a video (images and sound), we designed a two-stream framework. In this work, we hypothesize that sound data can be used to solve action recognition tasks. To test this, we designed a neural network based on sound texture to perform video action classification. We then fuse this network with a deep neural network that takes consecutive video frames as input, forming a two-stream network that we call A-IN. Finally, we compare the proposed A-IN with an image-only network on the Kinetics dataset. The experimental results show that the recognition accuracy of the two-stream model, which uses sound data features, is 7.6% higher than that of the network using video frames alone. This shows that making sensible use of the rich information in a video can improve classification performance.
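The abstract does not say how the two streams are combined, so the following is only a minimal sketch of one common option, late fusion by weighted averaging of per-class probabilities. It is not the authors' A-IN implementation. The function names, the `audio_weight` parameter, and the example scores are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(video_logits, audio_logits, audio_weight=0.5):
    """Weighted average of the class probabilities from the two streams.

    Assumption: simple late fusion; the paper may fuse differently.
    """
    if len(video_logits) != len(audio_logits):
        raise ValueError("both streams must score the same classes")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    p_video = softmax(video_logits)
    p_audio = softmax(audio_logits)
    return [
        (1.0 - audio_weight) * v + audio_weight * a
        for v, a in zip(p_video, p_audio)
    ]


def predict(video_logits, audio_logits, audio_weight=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_streams(video_logits, audio_logits, audio_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical scores for three action classes.
video_logits = [1.0, 1.2, 0.1]  # frame stream weakly prefers class 1
audio_logits = [3.0, 0.2, 0.1]  # sound stream strongly prefers class 0

frames_only = predict(video_logits, audio_logits, audio_weight=0.0)
two_stream = predict(video_logits, audio_logits, audio_weight=0.5)
```

In this made-up example the frame stream alone is uncertain and picks class 1, while the fused prediction follows the more confident sound stream and picks class 0. This mirrors the situation the abstract describes, where the object is not prominent in the frame.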


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8778024
DOI: http://dx.doi.org/10.3390/s22020599

Publication Analysis

Top Keywords

neural network: 16
video frames: 12
network: 11
sound: 8
video: 8
video action: 8
action classification: 8
sound texture: 8
sound data: 8
sound help: 4

