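Keywords: Product of Experts (PoE)

1. Introduction

Two established approaches to analyzing music spectrograms are Harmonic-Temporal Clustering (HTC) [1], [2] and Non-negative Matrix Factorization (NMF) [3]. Related resources and prior work include MIREX [4], user-guided source separation [5], [6], and score-informed source separation [7], [8].

Throughout, \mathcal{N}, Dir, and Pois denote the normal, Dirichlet, and Poisson distributions, respectively.

1 Graduate School of Information Science and Technology, The University of Tokyo
2 NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation

© 2014 Information Processing Society of Japan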
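2. Spectrogram model

2.1 Wavelet transform of a pseudo-periodic signal model [1]

Consider the signal of the k-th source whose n-th harmonic partial has instantaneous phase n\theta_k(u) + \varphi_{k,n} and amplitude a_{k,n}(u):

\[ f_k(u) = \sum_{n=1}^{N} a_{k,n}(u)\, e^{j(n\theta_k(u) + \varphi_{k,n})} \tag{1} \]

where u denotes time and \varphi_{k,n} the initial phase. The wavelet transform W_k(x, t) of f_k(u) is defined with the basis functions

\[ \psi_{\alpha,t}(u) = \frac{1}{\sqrt{2\pi\alpha}}\, \psi\!\left(\frac{u - t}{\alpha}\right) \tag{2} \]

where \alpha is the scale parameter, t the shift parameter, and \psi(u) the mother wavelet:

\[ W_k\!\left(\log\tfrac{1}{\alpha}, t\right) = \langle f_k(u), \psi_{\alpha,t}(u) \rangle \tag{3} \]

Let F_k(\omega) and \Psi_{\alpha,t}(\omega) be the Fourier transforms of f_k(u) and \psi_{\alpha,t}(u). By Parseval's theorem,

\[ \langle f_k(u), \psi_{\alpha,t}(u) \rangle = \langle F_k(\omega), \Psi_{\alpha,t}(\omega) \rangle \tag{4} \]

Writing \Psi(\omega) for the Fourier transform of \psi(u), the Fourier transform of the basis function is

\[ \Psi_{\alpha,t}(\omega) = \Psi(\alpha\omega)\, e^{-j\omega t} \tag{5} \]

so that (4) becomes

\[ W_k\!\left(\log\tfrac{1}{\alpha}, t\right) = \int F_k(\omega)\, \Psi^{*}(\alpha\omega)\, e^{j\omega t}\, d\omega \tag{6} \]

Assume that, locally, the amplitude a_{k,n}(u) is a constant \tilde{a}_{k,n} and the phase is linear, \theta_k(u) = \mu_k u, where \mu_k is the instantaneous fundamental frequency. The Fourier transform of f_k(u) is then

\[ F_k(\omega) = \frac{1}{\sqrt{2\pi}} \int \sum_{n=1}^{N} \tilde{a}_{k,n}\, e^{j(n\mu_k u + \varphi_{k,n})}\, e^{-j\omega u}\, du = \sqrt{2\pi} \sum_{n=1}^{N} \tilde{a}_{k,n}\, e^{j\varphi_{k,n}}\, \delta(\omega - n\mu_k) \tag{7} \]

Substituting (7) into (6), restoring the time dependence of the amplitude and frequency, and setting x = \log(1/\alpha) gives, up to a constant factor,

\[ W_k(x, t) = \sum_{n} a_{k,n}(t)\, \Psi^{*}\!\left(n e^{-x} \mu_k(t)\right) e^{j(n\theta_k(t) + \varphi_{k,n})} \tag{8} \]

For the mother wavelet we take the log-normal form

\[ \Psi(\omega) = \begin{cases} \exp\!\left(-\dfrac{(\ln\omega)^2}{4\sigma^2}\right) & (\omega > 0) \\ 0 & (\omega \le 0) \end{cases} \tag{9} \]

With \Omega_k(t) = \ln\mu_k(t), W_k(x, t) becomes

\[ W_k(x, t) = \sum_{n} a_{k,n}(t)\, \exp\!\left(-\frac{(x - \Omega_k(t) - \ln n)^2}{4\sigma^2}\right) e^{j(n\theta_k(t) + \varphi_{k,n})} \tag{10} \]

If \sigma is small enough that the overlap between partials n and n' (n \ne n') is negligible, the power spectrum is approximately

\[ |W_k(x, t)|^2 \approx \sum_{n} |a_{k,n}(t)|^2 \exp\!\left(-\frac{(x - \Omega_k(t) - \ln n)^2}{2\sigma^2}\right) \tag{11} \]

As in HTC, this is a Gaussian mixture model (GMM) along the log-frequency axis. To connect the model with NMF, we assume that the power of each partial factorizes as

\[ |a_{k,n}(t)|^2 = \frac{w_{k,n}\, U_k(t)}{\sqrt{2\pi}\,\sigma} \tag{12} \]

so that

\[ |W_k(x, t)|^2 = H_k(x, t)\, U_k(t) \tag{13} \]

\[ H_k(x, t) := \sum_{n} \frac{w_{k,n}}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x - \Omega_k(t) - \ln n)^2}{2\sigma^2}\right) \tag{14} \]

H_k(x, t) is a GMM-shaped spectral template of source k at time t, and U_k(t) is its power. To remove the scale indeterminacy between w_{k,n} and U_k(t), we impose

\[ \sum_{n} w_{k,n} = 1 \tag{15} \]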
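As a numerical illustration of Eq. (14), the following Python sketch evaluates the harmonic template H_k(x, t) for a single source at a single time and checks that it integrates to approximately 1 over log-frequency when the weights satisfy Eq. (15). The function name `harmonic_template`, the weights, the value of Ω, σ, and the grid are hypothetical choices made for illustration; they are not the settings of the experiments.

```python
import math

def harmonic_template(x, omega, w, sigma):
    """Evaluate H_k(x, t) of Eq. (14) at one log-frequency point x.

    omega : log fundamental frequency Omega_k(t)
    w     : list of partial weights w_{k,1..N}, assumed to sum to 1 (Eq. (15))
    sigma : width of each Gaussian partial on the log-frequency axis
    """
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    total = 0.0
    for n, w_n in enumerate(w, start=1):
        d = x - omega - math.log(n)   # the n-th partial sits at Omega + ln n
        total += w_n * norm * math.exp(-d * d / (2.0 * sigma ** 2))
    return total

# Hypothetical example values (not taken from the paper).
w = [0.5, 0.25, 0.125, 0.125]       # four partials, weights sum to 1
omega = math.log(220.0)
sigma = 0.02
dx = 0.001
grid = [omega - 1.0 + dx * j for j in range(4000)]

# Riemann-sum approximation of the integral of H_k over x.
area = sum(harmonic_template(x, omega, w, sigma) for x in grid) * dx
print(area)
```

Because each partial is a normalized Gaussian, the template behaves like a probability density over log-frequency, which is what lets U_k(t) carry all of the power in Eq. (13).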
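Relation to HTC and NMF. NMF with harmonically constrained basis spectra H_k has been studied in [10], [11], and spectral models of the form (14), parameterized by \Omega_k(t), are related to [12], [13]. If U_k(t) is modeled by a GMM, the model corresponds to HTC [1], [2]. The present model can therefore be regarded as connecting NMF and HTC.

2.3 Observation model

Let Y(x_l, t_i) be the observed spectrogram at log-frequency x_l (l = 1, ..., L) and time t_i (i = 1, ..., I). Following HTC and NMF, and using (13), the model spectrogram is the sum of the source spectrograms:

\[ X(x_l, t_i) = \sum_{k} H_k(x_l, t_i)\, U_k(t_i) \tag{16} \]

Given X(x_l, t_i), we assume that Y(x_l, t_i) is generated independently for each (l, i) as

\[ Y(x_l, t_i) \sim \mathrm{Pois}\big(Y(x_l, t_i);\, X(x_l, t_i)\big) \tag{17} \]

Maximizing this likelihood with respect to X(x_l, t_i) is equivalent to minimizing the I-divergence between Y and X.

2.4 Relation to NMF

If H_k(x_l, t_i) is not constrained to the parametric form (14) but treated as a free parameter for every k, l, i, the model (16) is related to NMF with time-varying spectra [9]. If, in addition, H_k(x_l, t_i) does not depend on t_i, (16) is the standard NMF model [3].

3. Incorporating prior information

3.1 Prior information is useful in music analysis tasks such as those evaluated in MIREX [4]. Here, several priors are combined in the Product of Experts (PoE) framework [14].

3.2 Prior distributions

3.2.1 Prior on the fundamental frequency

The log fundamental frequency \Omega_k(t_i) of source k is described by two distributions, a global one q_g and a local one q_c:

\[ q_g(\Omega_k(t_i)) = \mathcal{N}\big(\Omega_k(t_i);\, m_k, \nu_k^2\big) \tag{18} \]

\[ q_c(\Omega_k(t_i) \mid \Omega_k(t_{i-1})) = \mathcal{N}\big(\Omega_k(t_i);\, \Omega_k(t_{i-1}), \tau_k^2\big) \tag{19} \]

Here m_k and \nu_k are the mean and standard deviation of the log fundamental frequency of source k, and \tau_k is the standard deviation of q_c.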
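The link between the Poisson observation model of Eq. (17) and the I-divergence can be checked numerically. The sketch below builds X from Eq. (16) for a toy problem and confirms that the Poisson log-likelihood and the negative I-divergence differ only by a term that does not depend on X. All array values and function names are hypothetical, and `math.lgamma` is used so that non-integer Y is handled by the usual continuous extension of the factorial.

```python
import math

def model_spectrogram(H, U):
    """X[l][i] = sum_k H[k][l][i] * U[k][i], as in Eq. (16)."""
    K, L, I = len(H), len(H[0]), len(H[0][0])
    return [[sum(H[k][l][i] * U[k][i] for k in range(K)) for i in range(I)]
            for l in range(L)]

def poisson_loglik(Y, X):
    """Sum over (l, i) of ln Pois(Y; X), as in Eq. (17)."""
    return sum(y * math.log(x) - x - math.lgamma(y + 1.0)
               for y_row, x_row in zip(Y, X) for y, x in zip(y_row, x_row))

def i_divergence(Y, X):
    """I-divergence between Y and X (all entries of Y assumed positive)."""
    return sum(y * math.log(y / x) - y + x
               for y_row, x_row in zip(Y, X) for y, x in zip(y_row, x_row))

# Toy data: K = 2 sources, L = 3 frequency bins, I = 2 frames (hypothetical).
H = [[[0.6, 0.5], [0.3, 0.3], [0.1, 0.2]],
     [[0.1, 0.1], [0.2, 0.3], [0.7, 0.6]]]
Y = [[3.0, 1.0], [2.0, 2.0], [1.0, 4.0]]
X_a = model_spectrogram(H, [[4.0, 1.0], [2.0, 5.0]])
X_b = model_spectrogram(H, [[1.0, 1.0], [1.0, 1.0]])

# The two differences cancel because loglik + I-divergence is constant in X.
gap = (poisson_loglik(Y, X_a) - poisson_loglik(Y, X_b)) \
    + (i_divergence(Y, X_a) - i_divergence(Y, X_b))
print(gap)
```

Any X that increases the likelihood therefore decreases the I-divergence by the same amount, so the two criteria share their optimum.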
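q_g keeps the fundamental frequency close to m_k, and q_c keeps it continuous in time. Combining q_g and q_c as a Product of Experts, the prior on \Omega_k(t_i) is

\[ p(\Omega_k(t_i) \mid \Omega_k(t_{i-1})) \propto q_g(\Omega_k(t_i))^{\alpha_g}\; q_c(\Omega_k(t_i) \mid \Omega_k(t_{i-1}))^{\alpha_c} \tag{20} \]

where \alpha_g and \alpha_c are weights that control the contributions of q_g and q_c.

3.2.2 Prior on the power

The power U_k(t_i) is decomposed into the total energy C, the temporal activation B_k(t_i) (with \sum_i B_k(t_i) = 1), and the energy ratio A_k of source k (with \sum_k A_k = 1):

\[ U_k(t_i) = C A_k B_k(t_i) \tag{21} \]

\[ A := [A_1, \ldots, A_K] \sim \mathrm{Dir}(A; \beta) \tag{22} \]

\[ B_k := [B_k(t_1), \ldots, B_k(t_I)] \sim \mathrm{Dir}(B_k; \gamma_k) \tag{23} \]

where \beta and \gamma_k are the hyperparameters of the priors on A and B_k.

4. Parameter estimation

4.1 Objective function

Given the observed spectrogram Y := \{Y(x_l, t_i)\}_{l,i}, we seek the MAP estimate of the parameters \Theta = \{w_{k,n}, \Omega_k(t_i), A_k, B_k(t_i), C\}:

\[ \operatorname*{argmax}_{\Theta} \ln p(\Theta \mid Y) = \operatorname*{argmax}_{\Theta} \big( \ln p(Y \mid \Theta) + \ln p(\Theta) \big) \tag{24} \]

Up to terms that do not depend on \Theta (denoted =^c), the log-likelihood and the log-prior are

\[ \ln p(Y \mid \Theta) \overset{c}{=} \sum_{l,i} \Big( Y(x_l, t_i) \ln \sum_{k} H_k(x_l, t_i) U_k(t_i) - \sum_{k} H_k(x_l, t_i) U_k(t_i) \Big) \tag{25} \]

\[ \ln p(\Theta) \overset{c}{=} \sum_{k,i} \big( \alpha_g \ln q_g(\Omega_k(t_i)) + \alpha_c \ln q_c(\Omega_k(t_i) \mid \Omega_k(t_{i-1})) \big) + \ln \mathrm{Dir}(A; \beta) + \sum_{k} \ln \mathrm{Dir}(B_k; \gamma_k) \tag{26} \]

The second term of (25) is

\[ \sum_{l,i,k} H_k(x_l, t_i) U_k(t_i) = \sum_{l,i,k,n} U_k(t_i)\, \frac{w_{k,n}}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x_l - \Omega_k(t_i) - \ln n)^2}{2\sigma^2}\right) \tag{27} \]

Since

\[ \int \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x - \Omega_k(t_i) - \ln n)^2}{2\sigma^2}\right) dx = 1 \tag{28} \]

approximating the sum over l by this integral and using (15) gives

\[ \sum_{l,n} U_k(t_i)\, \frac{w_{k,n}}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x_l - \Omega_k(t_i) - \ln n)^2}{2\sigma^2}\right) \simeq U_k(t_i) \tag{29} \]

so the second term of (25) is approximately X_0 := \sum_{k,i} U_k(t_i).

Let J(\Theta) denote the objective in (24). Maximizing J(\Theta) directly with respect to \Theta is difficult because the first term of (25) contains the logarithm of a sum over k and n. We therefore use the auxiliary function approach [15]: we construct a lower bound of J(\Theta) and maximize it iteratively. Writing

\[ \phi_{k,n}(x_l, t_i) := \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x_l - \Omega_k(t_i) - \ln n)^2}{2\sigma^2}\right) \]

and applying Jensen's inequality to the first term of (25),

\[ Y(x_l, t_i) \ln \sum_{k,n} w_{k,n}\, \phi_{k,n}(x_l, t_i)\, U_k(t_i) \;\ge\; Y(x_l, t_i) \sum_{k,n} \lambda_{k,n,l,i} \ln \frac{w_{k,n}\, \phi_{k,n}(x_l, t_i)\, U_k(t_i)}{\lambda_{k,n,l,i}} \tag{30} \]

where \lambda_{k,n,l,i} \in [0, 1] and \sum_{k,n} \lambda_{k,n,l,i} = 1 for all i, l. Equality holds when

\[ \lambda_{k,n,l,i} = \frac{U_k(t_i)\, w_{k,n}\, \phi_{k,n}(x_l, t_i)}{\sum_{n',k'} U_{k'}(t_i)\, w_{k',n'}\, \phi_{k',n'}(x_l, t_i)} \tag{31} \]

The resulting lower bound of J(\Theta) is

\[ J^{+}(\Theta, \{\lambda_{k,n,l,i}\}) \overset{c}{=} \sum_{l,i} Y(x_l, t_i) \sum_{k,n} \lambda_{k,n,l,i} \ln \frac{w_{k,n}\, \phi_{k,n}(x_l, t_i)\, U_k(t_i)}{\lambda_{k,n,l,i}} - X_0 + \sum_{k,i} \big( \alpha_g \ln q_g(\Omega_k(t_i)) + \alpha_c \ln q_c(\Omega_k(t_i) \mid \Omega_k(t_{i-1})) \big) + \ln \mathrm{Dir}(A; \beta) + \sum_{k} \ln \mathrm{Dir}(B_k; \gamma_k) \tag{32} \]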
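The role of the auxiliary variable λ in Eqs. (30) and (31) can be seen on a single time-frequency bin. In the sketch below, the list `c` stands for the products w·φ·U over all (k, n) pairs at one (l, i); its values and the function names are hypothetical. The sketch compares the Jensen lower bound for an arbitrary λ with the bound at the λ of Eq. (31), where it should meet the exact log of the sum.

```python
import math

def responsibilities(c):
    """Eq. (31) for one (l, i) bin: lambda_j = c_j / sum_j' c_j'."""
    s = sum(c)
    return [c_j / s for c_j in c]

def jensen_lower_bound(c, lam):
    """Right-hand side of Eq. (30) without the factor Y(x_l, t_i)."""
    return sum(l_j * math.log(c_j / l_j)
               for c_j, l_j in zip(c, lam) if l_j > 0.0)

# Hypothetical values of w_{k,n} * phi_{k,n}(x_l, t_i) * U_k(t_i).
c = [0.8, 0.15, 0.05, 1.2]

exact = math.log(sum(c))                              # left-hand side of (30)
tight = jensen_lower_bound(c, responsibilities(c))    # lambda from Eq. (31)
loose = jensen_lower_bound(c, [0.25, 0.25, 0.25, 0.25])  # arbitrary lambda

print(exact, tight, loose)
```

Setting λ by Eq. (31) makes the bound touch the objective at the current parameters, which is what guarantees that maximizing J+ cannot decrease J.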
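The first two terms of (32) have the same form as the auxiliary function used for NMF with the I-divergence.

4.2 Update rules

The auxiliary variables \lambda_{k,n,l,i} are updated by (31). Setting the partial derivatives of J^{+}(\Theta, \{\lambda_{k,n,l,i}\}) with respect to each parameter to 0 gives the following updates. C is the total energy of Y, and C = X_0:

\[ C \leftarrow \sum_{l,i} Y(x_l, t_i) \tag{33} \]

\[ w_{k,n} \leftarrow \frac{\sum_{l,i} Y(x_l, t_i)\, \lambda_{k,n,l,i}}{\sum_{l,i,n} Y(x_l, t_i)\, \lambda_{k,n,l,i}} \tag{34} \]

\[ \Omega_k \leftarrow \Big( \frac{\alpha_c}{\tau_k^2} D^{\top} D + \frac{\alpha_g}{\nu_k^2} E_I + \sum_{l,n} \mathrm{diag}(p_{l,k,n}) \Big)^{-1} \Big( \frac{\alpha_g m_k}{\nu_k^2} 1_I + \sum_{l,n} (x_l - \ln n)\, p_{l,k,n} \Big) \tag{35} \]

\[ A_k \leftarrow \frac{\sum_{l,i,n} Y(x_l, t_i)\, \lambda_{k,n,l,i} + \beta_k - 1}{\sum_{k} \big( \sum_{l,i,n} Y(x_l, t_i)\, \lambda_{k,n,l,i} + \beta_k - 1 \big)} \tag{36} \]

\[ B_k(t_i) \leftarrow \frac{\sum_{l,n} Y(x_l, t_i)\, \lambda_{k,n,l,i} + \gamma_{i,k} - 1}{\sum_{i} \big( \sum_{l,n} Y(x_l, t_i)\, \lambda_{k,n,l,i} + \gamma_{i,k} - 1 \big)} \tag{37} \]

Here diag(·) is the diagonal matrix whose diagonal is the given vector, 1_I is the I-dimensional vector of ones, E_I is the I × I identity matrix, and \Omega_k and p_{l,k,n} are the I-dimensional vectors

\[ \Omega_k := [\Omega_k(t_1), \ldots, \Omega_k(t_I)]^{\top} \tag{38} \]

\[ p_{l,k,n,i} = \frac{Y(x_l, t_i)\, \lambda_{k,n,l,i}}{\sigma^2} \tag{39} \]

with p_{l,k,n} := [p_{l,k,n,1}, \ldots, p_{l,k,n,I}]^{\top}. D is the I × I first-order difference matrix

\[ D = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & -1 & 1 \end{bmatrix} \tag{40} \]

5. Experiments

The algorithm was implemented in MATLAB and applied to a signal containing three notes (D, F, A) to examine the estimated \Omega_k(t_i).

5.1 Experimental conditions

The audio was taken from the RWC music database [16] (notes D, F, and A) and sampled at 16 kHz. NMF with the I-divergence was also used. The spectrogram was computed with the wavelet transform [17] with t_i − t_{i−1} = 16 ms [1], x_1 = ln(55), and x_l − x_{l−1} = ln)/10. The parameters were (N, K, τ, ν, σ, α_g, α_c) = 8, 73, 10 4, 1.5, 10 4, 0.0, 1, 1), with \gamma_k = (1 − 3.96 × 10^{-6}) 1_I and \beta = (1 − 2.4 × 10^{-3}) 1_K.

5.2 Results

[Fig. 3: estimation result for the three-note signal; caption not recoverable.] Results for the note D were compared between \beta = (1 − 2.4 × 10^{-3}) 1_K and \beta = (1 − 3.0 × 10^{-3}) 1_K under the conditions of Section 5.1. [Fig. 4: result for the note D; caption not recoverable.]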
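The update (35) for Ω_k is a linear system whose matrix is tridiagonal, because DᵀD for the difference matrix of Eq. (40) has 1, 2, ..., 2, 1 on its diagonal and −1 on its off-diagonals. The sketch below solves that system with the Thomas algorithm rather than an explicit inverse; this is an implementation choice made here for illustration, not something stated in the text. The inputs `P[i]` (the sum over l, n of p_{l,k,n,i}) and `r[i]` (the sum over l, n of (x_l − ln n) p_{l,k,n,i}) are assumed to be precomputed, and all names and numbers are hypothetical.

```python
def update_omega(P, r, m, alpha_g, alpha_c, nu, tau):
    """Solve Eq. (35) for one source k.

    P[i] : sum over (l, n) of p_{l,k,n,i}             (data precision at frame i)
    r[i] : sum over (l, n) of (x_l - ln n) p_{l,k,n,i}
    m, nu : mean and standard deviation of q_g; tau : standard deviation of q_c
    """
    I = len(P)
    a = alpha_c / tau ** 2          # weight of the continuity term D^T D
    b = alpha_g / nu ** 2           # weight of the global term E_I
    # Diagonal of a*D^T D + b*E_I + diag(P); D^T D has diagonal 1, 2, ..., 2, 1.
    diag = [a * ((i > 0) + (i < I - 1)) + b + P[i] for i in range(I)]
    off = -a                        # off-diagonal entries of a*D^T D
    rhs = [b * m + r[i] for i in range(I)]
    # Thomas algorithm: forward elimination ...
    for i in range(1, I):
        f = off / diag[i - 1]
        diag[i] -= f * off
        rhs[i] -= f * rhs[i - 1]
    # ... followed by back substitution.
    omega = [0.0] * I
    omega[-1] = rhs[-1] / diag[-1]
    for i in range(I - 2, -1, -1):
        omega[i] = (rhs[i] - off * omega[i + 1]) / diag[i]
    return omega

# Hypothetical three-frame example: the data favour 5.1 and 4.9 in the first
# and last frames and say nothing about the middle frame.
P = [2.0, 0.0, 3.0]
r = [2.0 * 5.1, 0.0, 3.0 * 4.9]
omega_smooth = update_omega(P, r, m=5.0, alpha_g=1.0, alpha_c=1.0, nu=1.0, tau=0.1)
omega_free = update_omega(P, r, m=5.0, alpha_g=1.0, alpha_c=0.0, nu=1.0, tau=0.1)
print(omega_smooth, omega_free)
```

With α_c = 0 each frame is estimated independently as a precision-weighted average of the data and m_k; a positive α_c pulls neighbouring frames towards each other, which is the continuity effect that q_c is meant to provide.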
6. Conclusion

This report presented a spectrogram model that connects NMF and HTC, with prior information incorporated through the Product of Experts framework.

Acknowledgments: This work was supported by JSPS grant 26730100.

References

[1] Hirokazu Kameoka, Statistical approach to multipitch analysis, Ph.D. thesis, University of Tokyo, 2007.
[2] Hirokazu Kameoka, Takuya Nishimoto, and Shigeki Sagayama, A Multipitch Analyzer Based on Harmonic Temporal Structured Clustering, IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, No. 3, pp. 982-994, Mar. 2007.
[3] Paris Smaragdis and Judith C. Brown, Non-Negative Matrix Factorization for Polyphonic Music Transcription, In Proc. the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New York, USA, 2003.
[4] http://www.music-ir.org/mirex/wiki/mirex_home
[5] Paris Smaragdis and Gautham J. Mysore, Separation by "humming": User-guided sound extraction from monophonic mixtures, In Proc. the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 69-72, 2009.
[6] Alexey Ozerov, Cédric Févotte, Raphaël Blouet, and Jean-Louis Durrieu, Multichannel nonnegative tensor factorization with structured constraints for user-guided audio source separation, In Proc. the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 257-260, 2011.
[7] Romain Hennequin, Bertrand David, and Roland Badeau, Score informed audio source separation using a parametric model of non-negative spectrogram, In Proc. the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 45-48, 2011.
[8] Umut Simsekli and A. Taylan Cemgil, Score guided musical source separation using generalized coupled tensor factorization, In Proc. European Signal Processing Conference, pp. 2639-2643, 2012.
[9] Masahiro Nakano, Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, and Shigeki Sagayama, Infinite-State Spectrum Model for Music Signal Analysis, In Proc. the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2011.
[10] Stanislaw A. Raczyński, Nobutaka Ono, and Shigeki Sagayama, Multipitch analysis with harmonic nonnegative matrix approximation, In Proc. the International Society for Music Information Retrieval, pp. 381-386, 2007.
[11] Emmanuel Vincent, Nancy Bertin, and Roland Badeau, Harmonic and inharmonic nonnegative matrix factorization for polyphonic pitch transcription, In Proc. the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 109-112, 2008.
[12] Kazuyoshi Yoshii and Masataka Goto, Infinite Latent Harmonic Allocation: A Nonparametric Bayesian Approach to Multipitch Analysis, In Proc. the International Society for Music Information Retrieval, pp. 309-314, 2010.
[13] , 75, 4T-8, 2013.
[14] Geoffrey E. Hinton, Training products of experts by minimizing contrastive divergence, Neural Computation, 14(8), pp. 1771-1800, 2002.
[15] , , , , , 2006-MUS-66-13, pp. 77-84, Aug. 2006.
[16] Masataka Goto, Development of the RWC Music Database, In Proc. of the 18th International Congress on Acoustics (ICA 2004), pp. I-553-556, April 2004.
[17] , , , , , 008-81898, 11, 2008.