Making a robot dance to diverse musical genres in noisy environments

In this paper we address the problem of musical genre recognition for a dancing robot with embedded microphones, capable of distinguishing the genre of a musical piece while moving in a real-world scenario. To this end, we assess and compare two state-of-the-art musical genre recognition systems, based on Support Vector Machines and on Markov Models, across different real-world acoustic environments. In addition, we compare different robot audition preprocessing variants (a single channel versus a signal separated from multiple channels) and test different acoustic models, learned a priori, to tackle noise conditions of increasing complexity in the presence of noises of different natures (e.g., robot motion, speech). The results with six musical genres show improvements of up to 43.6 percentage points in the most complex conditions when applying Sound Source Separation and using acoustic models trained in conditions similar to the testing scenarios. A robot dance demonstration session confirms the applicability of the proposed integration for genre-adaptive dancing robots in real-world noisy environments.
