Query by Phrase (QBP) (Music Information Retrieval, MIR) QBH QBP / [1, 2] [3, 4] Query-by-Humming (QBH) QBP MIDI [5, 6] [8 10] [7]

Query by Phrase: a 2 2 Query by Phrase QBP QBP GaP-NMF GaP-NMF GaP-NMF QBP. Music Information Retrieval MIR [ 2] [3 4]Query-by-Humming QBH MIDI [5 6] [7] Waseda University 2 National Institute of Advanced Industrial Science and Technology AIST a masutaro@suou.waseda.jp Query by Phrase QBP QBH QBP / QBP [8 0] CD c 204 Information Processing Society of Japan

Query phrase Similarity Location of the query Musical piece Time Query-by-Phrase. QBP GaP-NMF GaP-NMF GaP-NMF 2 NMF 3 2 4 QBP 5 2. NMF 2. X R M Nx Y R M Ny 3 X NMF W x H x Y W x W f NMF W f W x W f H y H f H x H y NMF 2 X Y NMF 2 H y H f W x W f Y W f Y Y W x 2.2 NMF NMF GaP-NMF [] X θ R Kx 2 W x R M Kx H x R Kx Nx X : K x X mn θ W x m Hx. = θ W x m m H x n W x H x c 204 Information Processing Society of Japan 2

2.3 NMF Y NMF W x W Kirchhoff [2] W K x Y W x NMF H y H f H y H f 2.4 NMF H x H y n H x H y rn : rn = K x N x h i = h = N x K x = T h x hx h y hy h x hx h y hy 2 [ H i H i+n x ] T 3 N x j= H +j [ ]T. 4 : rn > µ + 4σ. 5 µ σ rn 2.5 GaP-NMF V R M N NMF Hoffman [] θ R K W R M K H R K N : pw m = Gamma a W b W ph = Gamma a H b H 6 α pθ = Gamma K αc. α K c V c = MN m n V mn GIG : qw m = GIG qh = GIG qθ = GIG γ W m ρw γ H ρh γ θ ρθ m τ W m τ H τ θ. 7 ϕ mn ω mn 2 ϕ mn = E q [ ω mn = θ W m H ] 8 E q [θ W m H ]. 9 ϕ mn ω mn GIG γ W m = aw ρ W m = bw + E q [θ ] n [ ] = E q τ W m θ n E q [H ] ω mn V mn ϕ 2 mne q [ H γ H = ah ρ H = bh + E q [θ ] m [ ] τ H = E q θ m ] 0 E q [W m ] ω mn V mn ϕ 2 mne q [ W m γ θ = α K ρθ = αc + m n τ θ = [ V mn ϕ 2 mne q m n ] E q [W m H ] ω mn W m H ]. 2 8 9 W H θ c 204 Information Processing Society of Japan 3

K + E q [θ ] 0 E q [θ ] E q[θ ] 60 db K + 3. 2 3. IS 3.2 3. MFCC MFCC Auditory Toolbox Version 2 [3] 2 2 m s m s 3.2 DP X Y IS IS KL IS IS DP D R Nx Ny D Di j X i Y j IS i N x j N y Di j : Di j = D IS X i Y j = m log X mi + X mi. Y mj Y mj 3 m E R Nx Ny E j E j = 0 i Ei = Ei j : Ei j 2 + 2Di j Ei j = min 2 Ei j + Di j 3 Ei 2 j + 2Di j +Di j. 4 j EN x j C R Nx Ny 3 3 : Ci j 2 + 3 Ci j = 2 Ci j + 2 3 Ci 2 j + 3 C 0 5 2 ENxj CN xj M S/5 M S S 0 6 ENxj CN xj 03 0 4 4. 2 3. 3.2 Query-by-Phrase QBP 4. 2 3 : c 204 Information Processing Society of Japan 4

Precision % Recall % F-measure % MFCC 24.8 35.0 29.0 DP 0 0 0 43.2 88.0 57.9 2 Precision % Recall % F-measure % MFCC 0 0 0 DP 2. 2.7 3.9 26.9 56.7 36.5 3 20% Precision % Recall % F-measure % MFCC 0 0 0 DP 0 0 0 5.8 45.0 23.4 exact-match 2 3 RWC [4] 4 No. No.9 No.42 No.77 50 : 4 0 2 30 3 20% 0 4 9 6 Hz 0 ms STFT 3.75 ms 60 0 cent 900 0720 cent NMF α = K = 00 a W x = b W x = a Hx = 0. b Hx = c NMF a W x = b W x = a W f = b W f = 0. a Hy = 0 a Hf = 0.0 b Hy = b Hf = c b H 2 rn RWC-MDB- P-200 No.42. a. b. c 20%. a QBP F precision rate recall rate P R F F F = 2P R P +R 4.2 3 3 QBP 2 rn MFCC exact-match DP IS c 204 Information Processing Society of Japan 5

3 * 5. Query-by-Phrase QBP NMF NMF JST CREST OngaCREST [] Li T. and Ogihara M.: Content-based Music Similarity Search and Emotion Detection International Conference on Acoustics Speech and Signal Processing I- CASSP 5:705 708 2004. [2] Logan B. and Salamon A.: A Music Similarity Function Based on Signal Analysis International Conference on Multimedia and Expo ICME pp. 745 748 200. [3] Ramona M. and Peeters G: AudioPrint: An efficient audio fingerprint system based on a novel cost-less synchronization scheme International Conference on Acoustics Speech and Signal Processing ICASSP pp. 88 822 203. [4] Haitsma J. and Kaler T.: A Highly Robust Audio Fingerprinting System International Conference on Music Information Retrieval ISMIR pp. 07 5 2002. [5] McNab R. J. Smith L. A. Witten I. H. Henderson C. L. and Cunningham S. J.: Towards the Digital Music Library: Tune Retrieval from Acoustic Input in Proceedings of the first ACM international conference on Digital libraries pp. 8 996. [6] Ghias A. Logan J. Chamberlin D. and Smith B. C.: Query By Humming: Musical Information Retrieval in an Audio Database Proceedings of the third ACM international conference on Multimedia pp. 23 236 995. [7] Nishimura T. Hashiguchi H. Taita J. Zhang J. X. Goto M. and Oa R.: Music Signal Spotting Retrieval by a Humming Query Using Start Frame Feature Dependent Continuous Dynamic Programming International Conference on Music Information Retrieval ISMIR pp. 2 28 200. [8] Kameoa H. Ochiai K. Naano M. Tsuchiya M. and Sagayama S.: Context-free 2D Tree Structure Model of Musical Notes for Bayesian Modeling of Polyphonic Spectrograms International Conference on Music Information Retrieval ISMIR pp. 307 32 202. [9] Grindlay G. and Ellis D. P. W.: A Probabilistic Subspace Model for Multi-instrument Polyphonic Transcription International Conference on Music Information Retrieval ISMIR pp. 2 26 200. [0] Ryynänen M. and Klapuri A.: Automatic Bass Line Transcription from Streaming Polyphonic Audio International Conference on Acoustics Speech and Signal Processing ICASSP 4:437 440 2007. [] Hoffman M. D. Blei D. M. and Coo P. R.: Bayesian Nonparametric Matrix Factorization for Recorded Music International Conference on Machine Learning ICML pp. 439 446 200. [2] Kirchhoff H. Dixon S. and Klapuri A.: Shift-variant Non-negative Matrix Deconvolution for Music Transcription International Conference on Acoustics Speech and Signal Processing ICASSP pp. 25 28 202. [3] Slaney M.: Auditory Toolbox Version 2 Technical Report #998-00 Interval Research Corporation 998. [4] Goto M. Hashiguchi H. Nishimura T. and Oa R.: R- WC Music Database: Popular Classical and Jazz Music Databases International Conference on Music Information Retrieval ISMIR pp. 287 288 2002. A. Gammax a b = Γx: ba Γa xa e bx GIGx a b c = b/c a 2 x a 2K a 2 bc e bx+ c x K ν x: 2 A. A.2 * c 204 Information Processing Society of Japan 6