In video action classification, existing network frameworks often use only video frames as input. When the object involved in the action does not appear prominently in the frame, such networks cannot classify the action accurately. We introduce a new neural network structure that uses sound to assist with this task. The raw sound wave is converted into a sound texture, which serves as the network input. To exploit the rich multimodal information (images and sound) in video, we design a two-stream framework. We hypothesize that sound data can help solve action recognition tasks. To test this, we design a neural network based on sound texture to perform video action classification, and then fuse it with a deep neural network that takes consecutive video frames as input, yielding a two-stream network called A-IN. Finally, we compare the proposed A-IN with an image-only network on the Kinetics dataset. The experimental results show that the recognition accuracy of the two-stream model using sound features is 7.6% higher than that of the network using video frames alone. This shows that making sensible use of the rich information in video can improve classification performance.
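The idea of combining a frame-based stream with a sound-based stream can be illustrated with a minimal sketch of score-level (late) fusion. The abstract does not say how A-IN fuses its two streams, so the weighted average of softmax scores used here is an assumption, and the names `fuse_scores`, `predict`, and `audio_weight` are hypothetical, chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(video_logits, audio_logits, audio_weight=0.5):
    """Weighted average of per-class probabilities from the two streams.

    ASSUMPTION: late fusion by weighted averaging; the source abstract
    does not specify the fusion method used in A-IN.
    """
    if len(video_logits) != len(audio_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    p_video = softmax(video_logits)
    p_audio = softmax(audio_logits)
    return [
        (1.0 - audio_weight) * pv + audio_weight * pa
        for pv, pa in zip(p_video, p_audio)
    ]


def predict(video_logits, audio_logits, audio_weight=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_scores(video_logits, audio_logits, audio_weight)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Made-up scores for three classes: the frame stream is nearly tied
    # between class 0 and class 1, while the sound stream favors class 0.
    video = [1.0, 1.1, 0.2]
    audio = [3.0, 0.5, 0.1]
    print("frames only:", predict(video, audio, audio_weight=0.0))
    print("fused:", predict(video, audio, audio_weight=0.5))
```

With `audio_weight=0.0` the sketch reduces to the image-only prediction, which makes it easy to compare the two settings on the same inputs. The scores in the usage example are invented and are not results from the paper.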
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8778024 | PMC |
| http://dx.doi.org/10.3390/s22020599 | DOI Listing |
J Am Chem Soc
January 2025
Department of Polymer Science and Engineering, University of Massachusetts, Amherst, Massachusetts 01003, United States.
Direct translocation of RNA with secondary structures using single-molecule electrophoresis through protein nanopores shows significant fluctuations in the measured ionic current, in contrast to unstructured single-stranded RNA or DNA. We developed a multiscale model combining the oxRNA model for RNA with the 3-dimensional Poisson-Nernst-Planck formalism for electric fields within protein pores, aiming to map RNA conformations to ionic currents as RNA translocates through three protein nanopores: α-hemolysin, CsgG, and MspA. Our findings reveal three distinct stages of translocation (pseudoknot, melting, and molten globule) based on contact maps and current values.
J Chem Theory Comput
January 2025
Advanced Artificial Intelligence Theoretical and Computational Chemistry Laboratory, School of Chemistry, University of Hyderabad, Hyderabad, Telangana 500046, India.
We present a directed electrostatics strategy integrated as a graph neural network (DESIGNN) approach for predicting stable nanocluster structures on their potential energy surfaces (PESs). The DESIGNN approach is a graph neural network (GNN)-based model for building structures of large atomic clusters with specific sizes and point-group symmetry. This model assists in the structure building of atomic metal clusters by predicting molecular electrostatic potential (MESP) topography minima on their structural evolution paths.
iScience
January 2025
School of Mathematics and Statistics, Zhengzhou University, Zhengzhou 450001, China.
This study introduces a hybrid network model for phase classification, integrating quantum networks and complex-valued neural networks. This architecture uses elemental composition as its only input, eliminating complex feature engineering. Parameterized quantum networks handle sparse elemental data and convert data from real to complex domains, increasing information dimensionality.
iScience
January 2025
European Brain Research Institute (EBRI), Fondazione Rita Levi-Montalcini, Viale Regina Elena 295, 00161 Rome, Italy.
Proper polarization of newly generated neurons is a critical process for neural network formation and brain development. The pan-neurotrophin p75 receptor plays a key role in this process, localizing asymmetrically in one of the differentiating neurites and specifying its axonal identity in response to neurotrophins. During axonal specification, p75 levels are transiently modulated, yet the molecular mechanisms underlying this process are not known.
iScience
January 2025
Division of Optometry, Health Sciences, City University of London, London EC1V 0HB, UK.
A key property of our environment is the mirror symmetry of many objects, although symmetry is an abstract global property with no definable shape template, making symmetry identification a challenge for standard template-matching algorithms. We therefore ask whether Deep Neural Networks (DNNs) trained on typical natural environmental images develop a selectivity for symmetry similar to that of the human brain. We tested a DNN trained on such typical natural images with object-free random-dot images of 1, 2, and 4 symmetry axes.