To discriminate and recognize sound sources in a noisy, reverberant environment, listeners need to perceptually integrate the direct wave of each sound source with its reflections. Perceptual fusion between the direct and reflected waves of a speech sound has been shown to help listeners recognize that speech sound in a simulated reverberant environment with disrupting sound sources. When the delay between a direct sound wave and its reflected wave is sufficiently short, the two waves are perceptually fused into a single sound image that appears to come from the source location. Interestingly, compared with nonspeech sounds such as clicks and noise bursts, speech sounds have a much larger perceptual fusion tendency. This study investigated why the fusion tendency for speech sounds is so large. Here we show that when the temporal amplitude fluctuation of speech was artificially time reversed, the large perceptual fusion tendency of speech sounds disappeared, regardless of whether the speech acoustic carrier was in normal or reversed temporal order. Moreover, perceptual fusion of normal-order speech, but not that of time-reversed speech, was accompanied by increased coactivation of attention-control-related, spatial-processing-related, and speech-processing-related cortical areas. Thus, speech-like acoustic carriers modulated by speech amplitude fluctuation selectively activate a cortical network for top-down modulation of speech processing, leading to enhanced perceptual fusion of speech sounds. This mechanism represents a perceptual-grouping strategy for unmasking speech under adverse conditions.
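The manipulation described above, reversing the amplitude fluctuation independently of the carrier, can be illustrated with a minimal sketch. It assumes a single broadband Hilbert envelope and uses a synthetic amplitude-modulated tone in place of speech; the study's actual signal processing is not given here and may differ (for example, envelopes extracted per frequency band). All function names are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real 1-D array, computed with the FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                       # keep the DC component once
    if n % 2 == 0:
        weights[n // 2] = 1.0              # keep the Nyquist component once
        weights[1:n // 2] = 2.0            # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def split_envelope_carrier(x, eps=1e-12):
    """Split x into an amplitude envelope and a unit-amplitude carrier."""
    z = analytic_signal(x)
    envelope = np.abs(z)
    carrier = np.real(z) / (envelope + eps)  # cosine of the instantaneous phase
    return envelope, carrier

def recombine(x, reverse_envelope, reverse_carrier):
    """Rebuild a signal with envelope and/or carrier reversed in time."""
    envelope, carrier = split_envelope_carrier(x)
    if reverse_envelope:
        envelope = envelope[::-1]
    if reverse_carrier:
        carrier = carrier[::-1]
    return envelope * carrier

# Toy stand-in for speech (an assumption): a 500 Hz tone with a rising ramp
# envelope, so that time reversal of the envelope is clearly visible.
fs = 16000
t = np.arange(1600) / fs
x = (t / t[-1]) * np.sin(2 * np.pi * 500 * t)

# Reversed envelope on a normal-order carrier, one of the conditions
# contrasted in the abstract.
hybrid = recombine(x, reverse_envelope=True, reverse_carrier=False)
```

With neither part reversed, the recombination reproduces the input; with both reversed, it yields the fully time-reversed signal. The two mixed cases are the hybrid stimuli of interest.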
DOI: http://dx.doi.org/10.1162/jocn.2010.21470
Sci Rep
January 2025
School of Electronic Information and Engineering, Changchun University of Science and Technology, Changchun, China.
Detecting ship targets in remote sensing images within complex scenarios faces numerous challenges. The limited feature information of small-scale targets and their random orientation angles often result in missed and false detections. To address these issues, this paper proposes a Multi-Scale Rotated Detection Network (MSRO-Net) for detecting rotated ship targets in remote sensing images.
PLoS One
January 2025
College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, China.
Target tracking techniques in the UAV perspective use UAV cameras to capture video streams and to identify and track specific targets in real time. Deep learning UAV target tracking methods based on the Siamese family have achieved significant results but still struggle to balance accuracy and speed. In this study, to refine the feature representation and reduce computational effort, and thereby improve the efficiency of the tracker, we perform feature fusion in deep inter-correlation operations and introduce a global attention mechanism that enlarges the model's field of view and strengthens its feature refinement, improving tracking performance for small targets.
Cogn Neurodyn
December 2025
Image Processing Laboratory, University of Valencia, Valencia, Spain.
In recent years, substantial strides have been made in the field of visual image reconstruction, particularly in its capacity to generate high-quality visual representations from human brain activity while considering semantic information. This advancement not only enables the recreation of visual content but also provides valuable insights into the intricate processes occurring within high-order functional brain regions, contributing to a deeper understanding of brain function. However, fusing semantics when reconstructing visual images from brain activity relies on semantic-to-image guided reconstruction and may ignore the underlying neural computational mechanisms, so it does not represent true reconstruction from brain activity.
Sensors (Basel)
December 2024
Faculty of Information Science and Technology, Beijing University of Technology, Beijing 100124, China.
With the increasing complexity of urban roads and rising traffic flow, traffic safety has become a critical societal concern. Current research primarily addresses drivers' attention, reaction speed, and perceptual abilities, but comprehensive assessments of cognitive abilities in complex traffic environments are lacking. This study, grounded in cognitive science and neuropsychology, identifies and quantitatively evaluates ten cognitive components related to driving decision-making, execution, and psychological states by analyzing video footage of drivers' actions.
Sci Rep
January 2025
The Higher Educational Key Laboratory for Flexible Manufacturing Equipment Integration of Fujian Province (Xiamen Institute of Technology), Xiamen, 361021, China.
With ongoing social progress, three-dimensional (3D) video is becoming increasingly prevalent in everyday life. As a key component of 3D video technology, depth video plays a crucial role by providing information about the distance and spatial distribution of objects within a scene. This study focuses on depth video encoding and proposes an efficient encoding method that integrates the Convolutional Neural Network (CNN) with a hyperautomation mechanism.