This paper presents the first robotic system featuring audio-visual (AV) sensor fusion with neuromorphic sensors. We combine a pair of silicon cochleae and a silicon retina on a robotic platform, allowing the robot to learn sound localization through self-motion and visual feedback using an adaptive ITD-based sound localization algorithm. After training, the robot can localize sound sources (white or pink noise) in a reverberant environment with an RMS azimuth error of 4-5°. We also investigate the AV source binding problem and conduct an experiment to test the effectiveness of matching an audio event with its corresponding visual event based on onset time. Despite the simplicity of this method and a large number of spurious visual events in the background, the correct match is made 75% of the time.
Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3274764
DOI: http://dx.doi.org/10.3389/fnins.2012.00021
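The ITD-based localization summarized above rests on standard far-field geometry: a source at azimuth θ produces an interaural time difference ITD = d·sin(θ)/c for ear spacing d and speed of sound c. Below is a minimal Python sketch of that closed-form mapping; the spacing value is a placeholder, and the paper itself learns the ITD-to-azimuth mapping adaptively from self-motion and visual feedback rather than using this formula.

```python
import numpy as np

def itd_to_azimuth(itd_s: float, mic_spacing_m: float = 0.15,
                   speed_of_sound: float = 343.0) -> float:
    """Map an interaural time difference (seconds) to azimuth (degrees).

    Far-field free-field model: ITD = d * sin(theta) / c, hence
    theta = arcsin(c * ITD / d). The spacing and the closed form are
    illustrative assumptions; the paper instead learns this mapping
    from self-motion and visual feedback.
    """
    x = np.clip(speed_of_sound * itd_s / mic_spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(x)))

# Example: a 200 us lead at one ear with 15 cm spacing
print(itd_to_azimuth(200e-6))  # ~27 degrees
```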
Sensors (Basel)
November 2024
Department of Computer Convergence Software, Korea University, Sejong 30019, Republic of Korea.
Pork is central to global meat consumption, accounting for 34% of the total, yet respiratory diseases in pigs can cause substantial economic losses to pig farms. To alleviate this issue, we propose an advanced audio-visual monitoring system for the early detection of coughing, a key symptom of respiratory disease in pigs, thereby enhancing disease management and animal welfare. The proposed system is structured into three key modules: the cough sound detection (CSD) module, which detects coughing sounds in audio data; the pig object detection (POD) module, which identifies individual pigs in video footage; and the coughing pig detection (CPD) module, which pinpoints which of the detected pigs are coughing.
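As a rough illustration of how the three modules compose, the sketch below wires hypothetical CSD and POD outputs into a toy CPD stage. All type and function names are assumptions; the abstract does not specify the modules' interfaces or the audio-visual association logic.

```python
from dataclasses import dataclass

@dataclass
class CoughEvent:
    """Hypothetical CSD output: one detected cough sound."""
    timestamp_s: float

@dataclass
class PigBox:
    """Hypothetical POD output: one pig detected in a video frame."""
    pig_id: int
    bbox: tuple  # (x, y, w, h) in frame coordinates

def detect_coughing_pigs(cough_events, pig_detections):
    """Toy CPD stage: pair each detected cough with the pigs visible
    around that moment. The real association logic is not described in
    the abstract; this only shows the CSD -> POD -> CPD data flow."""
    matches = []
    for cough in cough_events:
        visible = pig_detections.get(round(cough.timestamp_s), [])
        matches.append((cough, visible))
    return matches

# Usage with fabricated example data
events = [CoughEvent(timestamp_s=12.3)]
dets = {12: [PigBox(pig_id=3, bbox=(40, 60, 80, 50))]}
print(detect_coughing_pigs(events, dets))
```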
Schizophr Bull Open
January 2024
Semel Institute for Neuroscience at UCLA and the VA Desert Pacific Mental Illness Research, Education and Clinical Center, Los Angeles, CA, USA.
There is a broad consensus that the commonly used clinician-administered rating scales for assessment of negative symptoms share significant limitations, including (1) reliance upon accurate self-report and recall from the patient and caregiver; (2) potential for sampling bias and thus being unrepresentative of daily-life experiences; and (3) subjectivity of the symptom scoring process and limited sensitivity to change. These limitations led a work group from the International Society of CNS Clinical Trials and Methodology (ISCTM) to initiate the development of a multimodal negative symptom instrument. Experts from academia and industry reviewed the current methods of assessing the domains of negative symptoms, including diminished (1) affect, (2) sociality, (3) verbal communication, (4) goal-directed behavior, and (5) hedonic drives.
Sensors (Basel)
September 2024
School of Computing and Information Systems, Faculty of Science and Technology, Athabasca University, Athabasca, AB T9S 3A3, Canada.
Multimodal emotion classification (MEC) involves analyzing and identifying human emotions by integrating data from multiple sources, such as audio, video, and text. This approach leverages the complementary strengths of each modality to enhance the accuracy and robustness of emotion recognition systems. However, one significant challenge is effectively integrating these diverse data sources, each with unique characteristics and levels of noise.
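The integration challenge named above is often handled with late fusion as a baseline: classify each modality separately, then combine the per-modality posteriors. The abstract does not say which fusion scheme the authors use, so the weighted-average sketch below is purely illustrative.

```python
import numpy as np

def late_fusion(p_audio, p_video, p_text, weights=(0.3, 0.4, 0.3)):
    """Weighted late fusion of per-modality emotion posteriors.
    The weights are arbitrary placeholders; in practice they would be
    tuned to each modality's reliability and noise level."""
    stacked = np.stack([p_audio, p_video, p_text])
    fused = np.average(stacked, axis=0, weights=weights)
    return int(np.argmax(fused)), fused

# Example with three emotion classes (e.g., happy, sad, neutral)
pa = np.array([0.6, 0.3, 0.1])   # audio classifier output
pv = np.array([0.5, 0.4, 0.1])   # video classifier output
pt = np.array([0.2, 0.6, 0.2])   # text classifier output
print(late_fusion(pa, pv, pt))   # class 0 wins with fused p ~= 0.44
```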
Sensors (Basel)
September 2024
The Photonic Displays and Applications Group (GDAF), Electronic Technology Department, University Carlos III of Madrid, Calle Butarque 15, Leganés, 28911 Madrid, Spain.
This work presents the hardware (HW) design of the optoelectronic interfaces for a visible-light communication (VLC) system that can be employed in several use cases. Potential applications include the transmission of ultra-high-definition (UHD) streaming video through existing reading lamps installed in passenger vans. In this use case, visible light is employed for the downlink, while infrared light is used for the uplink channel, acting as a remote control.
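For context on the intensity-modulated downlink such a system implies, the sketch below Manchester-encodes a bit stream, a common flicker-free line code for visible-light links. This is an assumption for illustration; the abstract covers the optoelectronic hardware and does not state the modulation or line coding actually used.

```python
def manchester_encode(bits):
    """Manchester-encode bits for an intensity-modulated (on-off keyed)
    VLC downlink. Uses the IEEE 802.3 convention: a logical 1 is a
    low-to-high transition at mid-bit, a 0 is high-to-low. The 50% duty
    cycle keeps average lamp brightness constant (flicker-free)."""
    symbols = []
    for b in bits:
        symbols += [0, 1] if b else [1, 0]
    return symbols

print(manchester_encode([1, 0, 1, 1]))  # [0, 1, 1, 0, 0, 1, 0, 1]
```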
Sensors (Basel)
June 2024
ITS Lab, Institute of Computer Science, University of Tartu, 51009 Tartu, Estonia.
Urban environments are undergoing significant transformations, with pedestrian areas emerging as complex hubs of diverse mobility modes. This shift demands a more nuanced approach to urban planning and navigation technologies and highlights the limitations of traditional, road-centric datasets in capturing the detailed dynamics of pedestrian spaces. In response, we introduce the DELTA dataset, designed to improve the analysis and mapping of pedestrian zones and thereby address the critical need for sidewalk-centric multimodal datasets.