Neuromorphic audio-visual sensor fusion on a sound-localizing robot.

Front Neurosci

School of Electrical and Information Engineering, The University of Sydney, Sydney, NSW, Australia.

Published: October 2012

This paper presents the first robotic system featuring audio-visual (AV) sensor fusion with neuromorphic sensors. We combine a pair of silicon cochleae and a silicon retina on a robotic platform so that the robot can learn sound localization through self-motion and visual feedback, using an adaptive sound localization algorithm based on interaural time difference (ITD). After training, the robot can localize sound sources (white or pink noise) in a reverberant environment with an RMS azimuth error of 4-5°. We also investigate the AV source-binding problem, conducting an experiment to test whether an audio event can be matched to its corresponding visual event based on onset time alone. Despite the simplicity of this method and the large number of false visual events in the background, a correct match was made 75% of the time during the experiment.
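
The abstract names the cue (ITD) but not the computation, so the following minimal sketch shows a conventional way to turn a two-microphone recording into an azimuth estimate: cross-correlate the channels, convert the peak lag to an ITD, and apply a far-field geometric model. The parameters (48 kHz sample rate, 0.15 m microphone spacing) are illustrative assumptions; the paper's adaptive algorithm instead operates on silicon-cochlea spike output and learns its ITD-to-azimuth mapping from self-motion and visual feedback rather than assuming this geometry.

    import numpy as np

    def estimate_azimuth_deg(left, right, fs, mic_distance=0.15, c=343.0):
        """Classic ITD localization sketch (not the paper's adaptive method)."""
        # Cross-correlate the channels; the peak lag is the ITD in samples.
        corr = np.correlate(left, right, mode="full")
        lag = np.argmax(corr) - (len(right) - 1)  # > 0: left lags, source to the right
        itd = lag / fs  # seconds
        # Far-field model: itd = mic_distance * sin(theta) / c
        sin_theta = np.clip(itd * c / mic_distance, -1.0, 1.0)
        return np.degrees(np.arcsin(sin_theta))

    # Toy check: delaying the left channel by 5 samples at 48 kHz
    # corresponds to a source roughly 14 degrees to the right.
    fs, sig = 48000, np.random.randn(48000)
    print(estimate_azimuth_deg(np.roll(sig, 5), sig, fs))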

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3274764
DOI: http://dx.doi.org/10.3389/fnins.2012.00021

Similar Publications

The pig industry is central to global meat production, with pork accounting for 34% of total meat consumption, yet respiratory diseases in pigs can cause substantial economic losses to pig farms. To alleviate this issue, we propose an advanced audio-visual monitoring system for the early detection of coughing, a key symptom of respiratory disease in pigs, to enhance disease management and animal welfare. The proposed system is structured into three key modules: the cough sound detection (CSD) module, which detects coughing sounds in audio data; the pig object detection (POD) module, which identifies individual pigs in video footage; and the coughing pig detection (CPD) module, which pinpoints which of the detected pigs are coughing.
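
As a reading aid, here is how the three modules could compose at inference time. The interfaces and names below are hypothetical glue, not taken from the paper:

    def detect_coughing_pigs(audio_clip, video_frame, csd, pod, cpd):
        """Hypothetical composition of the CSD/POD/CPD pipeline described above."""
        cough_events = csd(audio_clip)       # CSD: cough sounds detected in the audio
        if not cough_events:
            return []                        # no cough heard, nothing to attribute
        pig_boxes = pod(video_frame)         # POD: bounding boxes of individual pigs
        return cpd(cough_events, pig_boxes)  # CPD: which detected pigs are coughing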

Rationale and Challenges for a New Instrument for Remote Measurement of Negative Symptoms.

Schizophr Bull Open

January 2024

Semel Institute for Neuroscience at UCLA and the VA Desert Pacific Mental Illness Research, Education and Clinical Center, Los Angeles, CA, USA.

There is a broad consensus that the commonly used clinician-administered rating scales for assessment of negative symptoms share significant limitations, including (1) reliance upon accurate self-report and recall from the patient and caregiver; (2) potential for sampling bias and thus being unrepresentative of daily-life experiences; and (3) subjectivity of the symptom scoring process and limited sensitivity to change. These limitations led a work group from the International Society of CNS Clinical Trials and Methodology (ISCTM) to initiate the development of a multimodal negative symptom instrument. Experts from academia and industry reviewed the current methods of assessing the domains of negative symptoms, including diminished (1) affect; (2) sociality; (3) verbal communication; (4) goal-directed behavior; and (5) hedonic drives.

AVaTER: Fusing Audio, Visual, and Textual Modalities Using Cross-Modal Attention for Emotion Recognition.

Sensors (Basel)

September 2024

School of Computing and Information Systems, Faculty of Science and Technology, Athabasca University, Athabasca, AB T9S 3A3, Canada.

Multimodal emotion classification (MEC) involves analyzing and identifying human emotions by integrating data from multiple sources, such as audio, video, and text. This approach leverages the complementary strengths of each modality to enhance the accuracy and robustness of emotion recognition systems. However, one significant challenge is effectively integrating these diverse data sources, each with unique characteristics and levels of noise.
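
The abstract names cross-modal attention as the fusion mechanism; the sketch below shows the generic scaled dot-product form of that idea, in which tokens of one modality attend to tokens of another. It is a textbook illustration under stated assumptions (shared feature dimension, learned projections omitted), not AVaTER's actual architecture:

    import numpy as np

    def cross_modal_attention(query_tokens, context_tokens):
        # One modality queries another: scaled dot-product scores,
        # softmax-normalized per query token.
        d_k = query_tokens.shape[-1]
        scores = query_tokens @ context_tokens.T / np.sqrt(d_k)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ context_tokens  # context features re-weighted per query

    # e.g., 10 audio tokens attending to 20 video tokens, 64-dim features:
    # fused = cross_modal_attention(np.random.randn(10, 64), np.random.randn(20, 64))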

This work presents the hardware (HW) design of the optoelectronic interfaces for a visible-light communication (VLC) system that can be employed in several use cases. Potential applications include the transmission of ultra-high-definition (UHD) streaming video through existing reading lamps installed in passenger vans. In this use case, visible light is employed for the downlink, while infrared light is used for the uplink channel, acting as a remote controller.

Urban environments are undergoing significant transformations, with pedestrian areas emerging as complex hubs of diverse mobility modes. This shift demands a more nuanced approach to urban planning and navigation technologies, highlighting the limitations of traditional, road-centric datasets in capturing the detailed dynamics of pedestrian spaces. In response, we introduce the DELTA dataset, designed to improve the analysis and mapping of pedestrian zones, thereby filling the critical need for sidewalk-centric multimodal datasets.
