Robot gaining robust pouring skills through fusing vision and audio.

ISA Trans

School of Control Science and Engineering, Shandong University, 17923 Jingshi Road, Jinan, China.

Published: April 2023

In pouring tasks for service robots, robust and accurate estimation of the liquid height is a crucial step. However, neither vision nor audio alone is sufficient for robust and accurate liquid height estimation. We therefore propose a visual-audio information fusion network that equips robots with good pouring skills, using visual and audio information as its sources. First, visual features are extracted by a residual network based on an attention model. Second, the Fourier characteristic matrix of the audio signal is obtained by fast Fourier transform, and the audio features are then extracted by a long short-term memory (LSTM) network. Third, the visual and audio features are fused by a fully connected network that outputs the liquid height and the state of the cup. Finally, a sinusoidal and transient fusion control method is proposed; it takes the liquid height and cup state as inputs and outputs the angle of the gripper, providing an implementation method for the pouring task. Experiments evaluate the performance of the multimodal information fusion method and verify the effectiveness of the algorithm for pouring tasks of service robots.
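As an illustration of the audio step, the following minimal sketch shows one plausible way to build a "Fourier characteristic matrix": the signal is cut into overlapping frames, each frame is transformed by FFT, and the magnitude spectra are stacked as rows, giving a time-by-frequency matrix that a sequence model such as an LSTM could consume. The abstract does not specify the framing, so the function names `fft` and `fourier_matrix`, the frame length, the hop size, and the Hann window are all assumptions made here for illustration, not the authors' settings. The FFT is written out in pure Python only to keep the example self-contained.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fourier_matrix(samples, frame_len=64, hop=32):
    """Return one magnitude spectrum per frame.

    Rows are time steps, columns are frequency bins 0..frame_len/2.
    frame_len, hop, and the Hann window are illustrative assumptions.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    rows = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        windowed = [s * w for s, w in zip(frame, window)]
        spectrum = fft(windowed)
        # Keep only the non-redundant half of the spectrum of a real signal.
        rows.append([abs(c) for c in spectrum[:frame_len // 2 + 1]])
    return rows
```

Each row of the returned matrix would correspond to one time step of the sequence model's input. The fusion network and the control method are not sketched here because the abstract does not give enough detail to do so without guessing.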

DOI: http://dx.doi.org/10.1016/j.isatra.2022.09.022

