Query by Phrase (QBP) (Music Information Retrieval, MIR) QBH QBP / [1, 2] [3, 4] Query-by-Humming (QBH) QBP MIDI [5, 6] [8 10] [7]

Σχετικά έγγραφα
MIDI [8] MIDI. [9] Hsu [1], [2] [10] Salamon [11] [5] Song [6] Sony, Minato, Tokyo , Japan a) b)

Fourier transform, STFT 5. Continuous wavelet transform, CWT STFT STFT STFT STFT [1] CWT CWT CWT STFT [2 5] CWT STFT STFT CWT CWT. Griffin [8] CWT CWT

Buried Markov Model Pairwise

Non-negative Matrix Factorization, NMF [5] NMF. [1 3] Bregman [4] Harmonic-Temporal Clustering, HTC [2,3] 1,2,b) NTT

CSJ. Speaker clustering based on non-negative matrix factorization using i-vector-based speaker similarity

3: A convolution-pooling layer in PS-CNN 1: Partially Shared Deep Neural Network 2.2 Partially Shared Convolutional Neural Network 2: A hidden layer o

( ) (Harmonic-Temporal Clustering; HTC) [1], [2] ( ) ( ) [4] HTC. (Non-negative Matrix Factorization; NMF) [3] [5], [6] [7], [8]

ER-Tree (Extended R*-Tree)

Singing Information Processing: Music Information Processing for Singing Voices

Κεφάλαιο 1 Πραγματικοί Αριθμοί 1.1 Σύνολα

[5] F 16.1% MFCC NMF D-CASE 17 [5] NMF NMF 3. [5] 1 NMF Deep Neural Network(DNN) FUSION 3.1 NMF NMF [12] S W H 1 Fig. 1 Our aoustic event detect

The Algorithm to Extract Characteristic Chord Progression Extended the Sequential Pattern Mining

Development of a Seismic Data Analysis System for a Short-term Training for Researchers from Developing Countries

Technical Research Report, Earthquake Research Institute, the University of Tokyo, No. +-, pp. 0 +3,,**1. No ,**1

Applying Markov Decision Processes to Role-playing Game

Parameter Estimation of Mixture Model of Multiple Instruments and Application to Musical Instrument Identification

Re-Pair n. Re-Pair. Re-Pair. Re-Pair. Re-Pair. (Re-Merge) Re-Merge. Sekine [4, 5, 8] (highly repetitive text) [2] Re-Pair. Blocked-Repair-VF [7]

A Vocabulary-Free Infinity-Gram Model for Chord Progression Analysis

1 n-gram n-gram n-gram [11], [15] n-best [16] n-gram. n-gram. 1,a) Graham Neubig 1,b) Sakriani Sakti 1,c) 1,d) 1,e)

GUI

ΑΡΧΙΜΗ ΗΣ - ΕΝΙΣΧΥΣΗ ΕΡΕΥΝΗΤΙΚΩΝ ΟΜΑ ΩΝ ΣΤΑ ΤΕΙ. Υποέργο: «Ανάκτηση και προστασία πνευµατικών δικαιωµάτων σε δεδοµένα

(Υπογραϕή) (Υπογραϕή) (Υπογραϕή)

Vol.4-DCC-8 No.8 Vol.4-MUS-5 No.8 4// 3 3 Hanning (T ) 3 Hanning 3T (y(t)w(t)) dt =.5 T y (t)dt. () STRAIGHT F 3 TANDEM-STRAIGHT[] 3 F F 3 [] F []. :

Retrieval of Seismic Data Recorded on Open-reel-type Magnetic Tapes (MT) by Using Existing Devices

Ανάλυση, Περιγραφή και Ανάκτηση Μουσικών Δεδομένων: το έργο ΠΟΛΥΜΝΙΑ*

Reading Order Detection for Text Layout Excluded by Image



Ανάκτηση Εικόνας βάσει Υφής με χρήση Eye Tracker

Twitter 6. DEIM Forum 2014 A Twitter,,, Wikipedia, Explicit Semantic Analysis,

Maxima SCORM. Algebraic Manipulations and Visualizing Graphs in SCORM contents by Maxima and Mashup Approach. Jia Yunpeng, 1 Takayuki Nagai, 2, 1

Automatic extraction of bibliography with machine learning

Δράσεις για την ενίσχυση της Δημιουργικότητας μέσω της Μουσικής Πληροφόρησης και της Τηλεκπαίδευσης στη Φιλαρμονική Ένωση Κέρκυρας «Καποδίστριας»

[4] 1.2 [5] Bayesian Approach min-max min-max [6] UCB(Upper Confidence Bound ) UCT [7] [1] ( ) Amazons[8] Lines of Action(LOA)[4] Winands [4] 1

Stabilization of stock price prediction by cross entropy optimization

BandPass (4A) Young Won Lim 1/11/14

Ημερίδα διάχυσης αποτελεσμάτων έργου Ιωάννινα, 14/10/2015

No. 7 Modular Machine Tool & Automatic Manufacturing Technique. Jul TH166 TG659 A

Ανάκτηση Πληροφορίας. Διδάσκων: Φοίβος Μυλωνάς. Διάλεξη #03

Area Location and Recognition of Video Text Based on Depth Learning Method

Identifying Scenes with the Same Person in Video Content on the Basis of Scene Continuity and Face Similarity Measurement

Gemini, FastMap, Applications. Εαρινό Εξάμηνο Τμήμα Μηχανικών Η/Υ και Πληροϕορικής Πολυτεχνική Σχολή, Πανεπιστήμιο Πατρών

ΟΙΚΟΝΟΜΙΚΟ ΠΑΝΕΠΙΣΤΗΜΙΟ ΑΘΗΝΩΝ ΠΑΤΗΣΙΩΝ ΑΘΗΝΑ Ε - ΜΑΙL : mkap@aueb.gr ΤΗΛ: , ΚΑΠΕΤΗΣ ΧΡΥΣΟΣΤΟΜΟΣ. Βιογραφικό Σημείωμα

GPGPU. Grover. On Large Scale Simulation of Grover s Algorithm by Using GPGPU

Acoustic Signal Adjustment by Considering Musical Expressive Intention Using a Performance Intension Function

Συνδυασμένη Οπτική-Ακουστική Ανάλυση Ομιλίας

Anomaly Detection with Neighborhood Preservation Principle

A Method of Trajectory Tracking Control for Nonminimum Phase Continuous Time Systems

Wavelet based matrix compression for boundary integral equations on complex geometries

Probabilistic Approach to Robust Optimization

Estimation, Evaluation and Guarantee of the Reverberant Speech Recognition Performance based on Room Acoustic Parameters

SNR F0 [2], [3], [4] F0 F0 F0 F0 F0 TUSK F0 TUSK F0 6 TUSK 6 F0 2. F0 F0 [5] [6] [7] p[8] Cepstrum [9], [10] [11] [12] [13] F0 [14] F0 [15] DIO[16] [1

: Monte Carlo EM 313, Louis (1982) EM, EM Newton-Raphson, /. EM, 2 Monte Carlo EM Newton-Raphson, Monte Carlo EM, Monte Carlo EM, /. 3, Monte Carlo EM

Καταλογοποίηση ακουστικών μουσικών δεδομένων

«ΕΡΕΥΝΑ ΓΙΑ ΤΗΝ ΗΛΕΚΤΡΟΝΙΚΗ ΠΡΟΩΘΗΣΗ ΜΟΥΣΙΚΟΥ ΥΛΙΚΟΥ ΑΝΕΞΑΡΤΗΤΩΝ ΚΑΛΛΙΤΕΧΝΩΝ»

Solving an Air Conditioning System Problem in an Embodiment Design Context Using Constraint Satisfaction Techniques

Indexing Methods for Encrypted Vector Databases

Η Διαδραστική Τηλεδιάσκεψη στο Σύγχρονο Σχολείο: Πλαίσιο Διδακτικού Σχεδιασμού

1530 ( ) 2014,54(12),, E (, 1, X ) [4],,, α, T α, β,, T β, c, P(T β 1 T α,α, β,c) 1 1,,X X F, X E F X E X F X F E X E 1 [1-2] , 2 : X X 1 X 2 ;

ΒΙΟΓΡΑΦΙΚΟ ΣΗΜΕΙΩΜΑ Δρ. ΣΩΤΗΡΙΟΣ Α. ΔΑΛΙΑΝΗΣ

SocialDict. A reading support tool with prediction capability and its extension to readability measurement

{takasu, Conditional Random Field

Web-based supplementary materials for Bayesian Quantile Regression for Ordinal Longitudinal Data

!! " # $%&'() * & +(&( 2010

: Active Learning 2017/11/12

1 (forward modeling) 2 (data-driven modeling) e- Quest EnergyPlus DeST 1.1. {X t } ARMA. S.Sp. Pappas [4]

An Automatic Modulation Classifier using a Frequency Discriminator for Intelligent Software Defined Radio

MachineDancing: MikuMikuDance (MMD) *1 MMD MMD. Kinect. MachineDancing. 3 MachineDancing. 1 MachineDancing :

Ανάκτηση Πληροφορίας Εισαγωγή

Heterobimetallic Pd-Sn Catalysis: Michael Addition. Reaction with C-, N-, O-, S- Nucleophiles and In-situ. Diagnostics

Sampling Basics (1B) Young Won Lim 9/21/13

2 ~ 8 Hz Hz. Blondet 1 Trombetti 2-4 Symans 5. = - M p. M p. s 2 x p. s 2 x t x t. + C p. sx p. + K p. x p. C p. s 2. x tp x t.

Toward a SPARQL Query Execution Mechanism using Dynamic Mapping Adaptation -A Preliminary Report- Takuya Adachi 1 Naoki Fukuta 2.


Motion analysis and simulation of a stratospheric airship

1181 (real-timespeechdriven) 1 1 ( ) D FAP FAP (voiceactivationdetectionvad) D FaceGen 3- D XfaceEd MPEG-4 1 FAP 66 FAP ( ) FAP 84

Web DEIM Forum 2009 A7-1. Web. Web. Web. Web. 4 Wikipedia. Wikipedia. Web.


ΕΥΡΕΣΗ ΤΟΥ ΔΙΑΝΥΣΜΑΤΟΣ ΘΕΣΗΣ ΚΙΝΟΥΜΕΝΟΥ ΡΟΜΠΟΤ ΜΕ ΜΟΝΟΦΘΑΛΜΟ ΣΥΣΤΗΜΑ ΟΡΑΣΗΣ

Coupling of a Jet-Slot Oscillator With the Flow-Supply Duct: Flow-Acoustic Interaction Modeling

Η ΕΠΙΣΤΗΜΗ ΤΗΣ ΠΛΗΡΟΦΟΡΗΣΗΣ ΣΤΟ ΣΥΓΧΡΟΝΟ ΠΕΡΙΒΑΛΛΟΝ

Detection and Recognition of Traffic Signal Using Machine Learning

ΓΙΑΝΝΟΥΛΑ Σ. ΦΛΩΡΟΥ Ι ΑΚΤΟΡΑΣ ΤΟΥ ΤΜΗΜΑΤΟΣ ΕΦΑΡΜΟΣΜΕΝΗΣ ΠΛΗΡΟΦΟΡΙΚΗΣ ΤΟΥ ΠΑΝΕΠΙΣΤΗΜΙΟΥ ΜΑΚΕ ΟΝΙΑΣ ΒΙΟΓΡΑΦΙΚΟ ΣΗΜΕΙΩΜΑ

Audio Engineering Society. Convention Paper. Presented at the 120th Convention 2006 May Paris, France

CAP A CAP

Company. Patras, Greece

Wiki. Wiki. Analysis of user activity of closed Wiki used by small groups

Introduction to Bioinformatics

ΔΙΠΛΩΜΑΤΙΚΕΣ ΕΡΓΑΣΙΕΣ

Scrub Nurse Robot: SNR. C++ SNR Uppaal TA SNR SNR. Vain SNR. Uppaal TA. TA state Uppaal TA location. Uppaal

Τμήμα Επιστήμης Υπολογιστών ΗΥ-474. Ψηφιακός ήχος. Χαρακτηριστικά σήματος ήχου Ψηφιοποίηση ήχου Συνθετικοί ήχοι MIDI

Yahoo 2. SNS Social Networking Service [3,5,12] Copyright c by ORSJ. Unauthorized reproduction of this article is prohibited.

Εξάλειψη αντήχησης από ηχητικά σήματα με υποκειμενικά / ψυχοακουστικά κριτήρια

Assignment 1 Solutions Complex Sinusoids

Elements of Information Theory

Ερευνητική+Ομάδα+Τεχνολογιών+ Διαδικτύου+

Εισαγωγή στην επιστήμη των υπολογιστών. Υπολογιστές και Δεδομένα Κεφάλαιο 2ο Αναπαράσταση Δεδομένων

40 3 Journal of South China University of Technology Vol. 40 No Natural Science Edition March

ΣΤΟΙΧΕΙΑ ΠΡΟΤΕΙΝΟΜΕΝΟΥ ΕΞΩΤΕΡΙΚΟΥ ΕΜΠΕΙΡΟΓΝΩΜΟΝΟΣ Προσωπικά Στοιχεία:

IPSJ SIG Technical Report Vol.2014-CE-127 No /12/6 CS Activity 1,a) CS Computer Science Activity Activity Actvity Activity Dining Eight-He

Transcript:

Query by Phrase: a 2 2 Query by Phrase QBP QBP GaP-NMF GaP-NMF GaP-NMF QBP. Music Information Retrieval MIR [ 2] [3 4]Query-by-Humming QBH MIDI [5 6] [7] Waseda University 2 National Institute of Advanced Industrial Science and Technology AIST a masutaro@suou.waseda.jp Query by Phrase QBP QBH QBP / QBP [8 0] CD c 204 Information Processing Society of Japan

Query phrase Similarity Location of the query Musical piece Time Query-by-Phrase. QBP GaP-NMF GaP-NMF GaP-NMF 2 NMF 3 2 4 QBP 5 2. NMF 2. X R M Nx Y R M Ny 3 X NMF W x H x Y W x W f NMF W f W x W f H y H f H x H y NMF 2 X Y NMF 2 H y H f W x W f Y W f Y Y W x 2.2 NMF NMF GaP-NMF [] X θ R Kx 2 W x R M Kx H x R Kx Nx X : K x X mn θ W x m Hx. = θ W x m m H x n W x H x c 204 Information Processing Society of Japan 2

2.3 NMF Y NMF W x W Kirchhoff [2] W K x Y W x NMF H y H f H y H f 2.4 NMF H x H y n H x H y rn : rn = K x N x h i = h = N x K x = T h x hx h y hy h x hx h y hy 2 [ H i H i+n x ] T 3 N x j= H +j [ ]T. 4 : rn > µ + 4σ. 5 µ σ rn 2.5 GaP-NMF V R M N NMF Hoffman [] θ R K W R M K H R K N : pw m = Gamma a W b W ph = Gamma a H b H 6 α pθ = Gamma K αc. α K c V c = MN m n V mn GIG : qw m = GIG qh = GIG qθ = GIG γ W m ρw γ H ρh γ θ ρθ m τ W m τ H τ θ. 7 ϕ mn ω mn 2 ϕ mn = E q [ ω mn = θ W m H ] 8 E q [θ W m H ]. 9 ϕ mn ω mn GIG γ W m = aw ρ W m = bw + E q [θ ] n [ ] = E q τ W m θ n E q [H ] ω mn V mn ϕ 2 mne q [ H γ H = ah ρ H = bh + E q [θ ] m [ ] τ H = E q θ m ] 0 E q [W m ] ω mn V mn ϕ 2 mne q [ W m γ θ = α K ρθ = αc + m n τ θ = [ V mn ϕ 2 mne q m n ] E q [W m H ] ω mn W m H ]. 2 8 9 W H θ c 204 Information Processing Society of Japan 3

K + E q [θ ] 0 E q [θ ] E q[θ ] 60 db K + 3. 2 3. IS 3.2 3. MFCC MFCC Auditory Toolbox Version 2 [3] 2 2 m s m s 3.2 DP X Y IS IS KL IS IS DP D R Nx Ny D Di j X i Y j IS i N x j N y Di j : Di j = D IS X i Y j = m log X mi + X mi. Y mj Y mj 3 m E R Nx Ny E j E j = 0 i Ei = Ei j : Ei j 2 + 2Di j Ei j = min 2 Ei j + Di j 3 Ei 2 j + 2Di j +Di j. 4 j EN x j C R Nx Ny 3 3 : Ci j 2 + 3 Ci j = 2 Ci j + 2 3 Ci 2 j + 3 C 0 5 2 ENxj CN xj M S/5 M S S 0 6 ENxj CN xj 03 0 4 4. 2 3. 3.2 Query-by-Phrase QBP 4. 2 3 : c 204 Information Processing Society of Japan 4

Precision % Recall % F-measure % MFCC 24.8 35.0 29.0 DP 0 0 0 43.2 88.0 57.9 2 Precision % Recall % F-measure % MFCC 0 0 0 DP 2. 2.7 3.9 26.9 56.7 36.5 3 20% Precision % Recall % F-measure % MFCC 0 0 0 DP 0 0 0 5.8 45.0 23.4 exact-match 2 3 RWC [4] 4 No. No.9 No.42 No.77 50 : 4 0 2 30 3 20% 0 4 9 6 Hz 0 ms STFT 3.75 ms 60 0 cent 900 0720 cent NMF α = K = 00 a W x = b W x = a Hx = 0. b Hx = c NMF a W x = b W x = a W f = b W f = 0. a Hy = 0 a Hf = 0.0 b Hy = b Hf = c b H 2 rn RWC-MDB- P-200 No.42. a. b. c 20%. a QBP F precision rate recall rate P R F F F = 2P R P +R 4.2 3 3 QBP 2 rn MFCC exact-match DP IS c 204 Information Processing Society of Japan 5

3 * 5. Query-by-Phrase QBP NMF NMF JST CREST OngaCREST [] Li T. and Ogihara M.: Content-based Music Similarity Search and Emotion Detection International Conference on Acoustics Speech and Signal Processing I- CASSP 5:705 708 2004. [2] Logan B. and Salamon A.: A Music Similarity Function Based on Signal Analysis International Conference on Multimedia and Expo ICME pp. 745 748 200. [3] Ramona M. and Peeters G: AudioPrint: An efficient audio fingerprint system based on a novel cost-less synchronization scheme International Conference on Acoustics Speech and Signal Processing ICASSP pp. 88 822 203. [4] Haitsma J. and Kaler T.: A Highly Robust Audio Fingerprinting System International Conference on Music Information Retrieval ISMIR pp. 07 5 2002. [5] McNab R. J. Smith L. A. Witten I. H. Henderson C. L. and Cunningham S. J.: Towards the Digital Music Library: Tune Retrieval from Acoustic Input in Proceedings of the first ACM international conference on Digital libraries pp. 8 996. [6] Ghias A. Logan J. Chamberlin D. and Smith B. C.: Query By Humming: Musical Information Retrieval in an Audio Database Proceedings of the third ACM international conference on Multimedia pp. 23 236 995. [7] Nishimura T. Hashiguchi H. Taita J. Zhang J. X. Goto M. and Oa R.: Music Signal Spotting Retrieval by a Humming Query Using Start Frame Feature Dependent Continuous Dynamic Programming International Conference on Music Information Retrieval ISMIR pp. 2 28 200. [8] Kameoa H. Ochiai K. Naano M. Tsuchiya M. and Sagayama S.: Context-free 2D Tree Structure Model of Musical Notes for Bayesian Modeling of Polyphonic Spectrograms International Conference on Music Information Retrieval ISMIR pp. 307 32 202. [9] Grindlay G. and Ellis D. P. W.: A Probabilistic Subspace Model for Multi-instrument Polyphonic Transcription International Conference on Music Information Retrieval ISMIR pp. 2 26 200. [0] Ryynänen M. and Klapuri A.: Automatic Bass Line Transcription from Streaming Polyphonic Audio International Conference on Acoustics Speech and Signal Processing ICASSP 4:437 440 2007. [] Hoffman M. D. Blei D. M. and Coo P. R.: Bayesian Nonparametric Matrix Factorization for Recorded Music International Conference on Machine Learning ICML pp. 439 446 200. [2] Kirchhoff H. Dixon S. and Klapuri A.: Shift-variant Non-negative Matrix Deconvolution for Music Transcription International Conference on Acoustics Speech and Signal Processing ICASSP pp. 25 28 202. [3] Slaney M.: Auditory Toolbox Version 2 Technical Report #998-00 Interval Research Corporation 998. [4] Goto M. Hashiguchi H. Nishimura T. and Oa R.: R- WC Music Database: Popular Classical and Jazz Music Databases International Conference on Music Information Retrieval ISMIR pp. 287 288 2002. A. Gammax a b = Γx: ba Γa xa e bx GIGx a b c = b/c a 2 x a 2K a 2 bc e bx+ c x K ν x: 2 A. A.2 * c 204 Information Processing Society of Japan 6