Robust temporal and spectral modeling for query by melody

Query by melody is the problem of retrieving musical performances from melodies. Retrieval of real performances is complicated by the large number of variations in performing a melody and by the presence of colored accompaniment noise. We describe a simple yet effective probabilistic approach to this task: a generative model that is rich enough to capture the spectral and temporal variations of musical performances while still allowing tractable melody retrieval. While most previous studies on music retrieval from melodies used either symbolic (e.g., MIDI) data or monophonic (single-instrument) performances, we performed experiments in retrieving live and studio recordings of operas that contain a leading vocalist and rich instrumental accompaniment. Our results show that the probabilistic approach we propose is effective and can be scaled to massive datasets.
