Singing Information Processing: Music Information Processing for Singing Voices

Σχετικά έγγραφα
MIDI [8] MIDI. [9] Hsu [1], [2] [10] Salamon [11] [5] Song [6] Sony, Minato, Tokyo , Japan a) b)

Query by Phrase (QBP) (Music Information Retrieval, MIR) QBH QBP / [1, 2] [3, 4] Query-by-Humming (QBH) QBP MIDI [5, 6] [8 10] [7]

Vol.4-DCC-8 No.8 Vol.4-MUS-5 No.8 4// 3 3 Hanning (T ) 3 Hanning 3T (y(t)w(t)) dt =.5 T y (t)dt. () STRAIGHT F 3 TANDEM-STRAIGHT[] 3 F F 3 [] F []. :

1,a) 1,b) 2 3 Sakriani Sakti 1 Graham Neubig 1 1. A Study on HMM-Based Speech Synthesis Using Rich Context Models

Buried Markov Model Pairwise

ER-Tree (Extended R*-Tree)

3: A convolution-pooling layer in PS-CNN 1: Partially Shared Deep Neural Network 2.2 Partially Shared Convolutional Neural Network 2: A hidden layer o

SNR F0 [2], [3], [4] F0 F0 F0 F0 F0 TUSK F0 TUSK F0 6 TUSK 6 F0 2. F0 F0 [5] [6] [7] p[8] Cepstrum [9], [10] [11] [12] [13] F0 [14] F0 [15] DIO[16] [1

Fourier transform, STFT 5. Continuous wavelet transform, CWT STFT STFT STFT STFT [1] CWT CWT CWT STFT [2 5] CWT STFT STFT CWT CWT. Griffin [8] CWT CWT

CSJ. Speaker clustering based on non-negative matrix factorization using i-vector-based speaker similarity

Signal processing for handling singing voice texture

Estimation, Evaluation and Guarantee of the Reverberant Speech Recognition Performance based on Room Acoustic Parameters

Sinsy: HMM. Sinsy An HMM-based singing voice synthesis system which can realize your wish I want this person to sing my song

GUI

2016 IEEE/ACM International Conference on Mobile Software Engineering and Systems

Parameter Estimation of Mixture Model of Multiple Instruments and Application to Musical Instrument Identification

Ανάλυση, Περιγραφή και Ανάκτηση Μουσικών Δεδομένων: το έργο ΠΟΛΥΜΝΙΑ*

: TANDEM-STRAIGHT. Make singing voice tangible: TANDEM-STRAIGHT and temporally variable morphing as substrate. Hideki Kawahara 1 and Masanori Morise 2

[5] F 16.1% MFCC NMF D-CASE 17 [5] NMF NMF 3. [5] 1 NMF Deep Neural Network(DNN) FUSION 3.1 NMF NMF [12] S W H 1 Fig. 1 Our aoustic event detect

An Automatic Modulation Classifier using a Frequency Discriminator for Intelligent Software Defined Radio

Automatic extraction of bibliography with machine learning

VOCODER VOCODER Vocal

ΔΙΠΛΩΜΑΤΙΚΕΣ ΕΡΓΑΣΙΕΣ

Ψηφιακή Επεξεργασία Φωνής

Re-Pair n. Re-Pair. Re-Pair. Re-Pair. Re-Pair. (Re-Merge) Re-Merge. Sekine [4, 5, 8] (highly repetitive text) [2] Re-Pair. Blocked-Repair-VF [7]

The Algorithm to Extract Characteristic Chord Progression Extended the Sequential Pattern Mining

Gemini, FastMap, Applications. Εαρινό Εξάμηνο Τμήμα Μηχανικών Η/Υ και Πληροϕορικής Πολυτεχνική Σχολή, Πανεπιστήμιο Πατρών

Identifying Scenes with the Same Person in Video Content on the Basis of Scene Continuity and Face Similarity Measurement

Vol. 31,No JOURNAL OF CHINA UNIVERSITY OF SCIENCE AND TECHNOLOGY Feb

Area Location and Recognition of Video Text Based on Depth Learning Method

Schedulability Analysis Algorithm for Timing Constraint Workflow Models

Maxima SCORM. Algebraic Manipulations and Visualizing Graphs in SCORM contents by Maxima and Mashup Approach. Jia Yunpeng, 1 Takayuki Nagai, 2, 1

Ανάκτηση Εικόνας βάσει Υφής με χρήση Eye Tracker

Wiki. Wiki. Analysis of user activity of closed Wiki used by small groups

Toward a SPARQL Query Execution Mechanism using Dynamic Mapping Adaptation -A Preliminary Report- Takuya Adachi 1 Naoki Fukuta 2.

Ερευνητική+Ομάδα+Τεχνολογιών+ Διαδικτύου+

Development of a Seismic Data Analysis System for a Short-term Training for Researchers from Developing Countries

Detection and Recognition of Traffic Signal Using Machine Learning

IPSJ SIG Technical Report Vol.2014-CE-127 No /12/6 CS Activity 1,a) CS Computer Science Activity Activity Actvity Activity Dining Eight-He

EM Baum-Welch. Step by Step the Baum-Welch Algorithm and its Application 2. HMM Baum-Welch. Baum-Welch. Baum-Welch Baum-Welch.

Quick algorithm f or computing core attribute

1181 (real-timespeechdriven) 1 1 ( ) D FAP FAP (voiceactivationdetectionvad) D FaceGen 3- D XfaceEd MPEG-4 1 FAP 66 FAP ( ) FAP 84

Χρηματοοικονομική Ανάπτυξη, Θεσμοί και

Πανεπιστήµιο Πειραιώς Τµήµα Πληροφορικής

Συνδυασμένη Οπτική-Ακουστική Ανάλυση Ομιλίας

ΜΕΤΑΠΤΥΧΙΑΚΟ ΠΡΟΓΡΑΜΜΑ ΣΠΟΥΔΩΝ

1 n-gram n-gram n-gram [11], [15] n-best [16] n-gram. n-gram. 1,a) Graham Neubig 1,b) Sakriani Sakti 1,c) 1,d) 1,e)

A Method for Creating Shortcut Links by Considering Popularity of Contents in Structured P2P Networks

No. 7 Modular Machine Tool & Automatic Manufacturing Technique. Jul TH166 TG659 A

VSC STEADY2STATE MOD EL AND ITS NONL INEAR CONTROL OF VSC2HVDC SYSTEM VSC (1. , ; 2. , )

SocialDict. A reading support tool with prediction capability and its extension to readability measurement

Nov Journal of Zhengzhou University Engineering Science Vol. 36 No FCM. A doi /j. issn

ΕΥΡΕΣΗ ΤΟΥ ΔΙΑΝΥΣΜΑΤΟΣ ΘΕΣΗΣ ΚΙΝΟΥΜΕΝΟΥ ΡΟΜΠΟΤ ΜΕ ΜΟΝΟΦΘΑΛΜΟ ΣΥΣΤΗΜΑ ΟΡΑΣΗΣ

HOSVD. Higher Order Data Classification Method with Autocorrelation Matrix Correcting on HOSVD. Junichi MORIGAKI and Kaoru KATAYAMA

{takasu, Conditional Random Field

: Monte Carlo EM 313, Louis (1982) EM, EM Newton-Raphson, /. EM, 2 Monte Carlo EM Newton-Raphson, Monte Carlo EM, Monte Carlo EM, /. 3, Monte Carlo EM

Acoustic Signal Adjustment by Considering Musical Expressive Intention Using a Performance Intension Function

MachineDancing: MikuMikuDance (MMD) *1 MMD MMD. Kinect. MachineDancing. 3 MachineDancing. 1 MachineDancing :

Vocal Dynamics Controller:

Η ΠΡΟΣΩΠΙΚΗ ΟΡΙΟΘΕΤΗΣΗ ΤΟΥ ΧΩΡΟΥ Η ΠΕΡΙΠΤΩΣΗ ΤΩΝ CHAT ROOMS

Retrieval of Seismic Data Recorded on Open-reel-type Magnetic Tapes (MT) by Using Existing Devices

(Υπογραϕή) (Υπογραϕή) (Υπογραϕή)


Reading Order Detection for Text Layout Excluded by Image

ΙΠΛΩΜΑΤΙΚΗ ΕΡΓΑΣΙΑ. ΘΕΜΑ: «ιερεύνηση της σχέσης µεταξύ φωνηµικής επίγνωσης και ορθογραφικής δεξιότητας σε παιδιά προσχολικής ηλικίας»


ΓΙΑΝΝΟΥΛΑ Σ. ΦΛΩΡΟΥ Ι ΑΚΤΟΡΑΣ ΤΟΥ ΤΜΗΜΑΤΟΣ ΕΦΑΡΜΟΣΜΕΝΗΣ ΠΛΗΡΟΦΟΡΙΚΗΣ ΤΟΥ ΠΑΝΕΠΙΣΤΗΜΙΟΥ ΜΑΚΕ ΟΝΙΑΣ ΒΙΟΓΡΑΦΙΚΟ ΣΗΜΕΙΩΜΑ

[4] 1.2 [5] Bayesian Approach min-max min-max [6] UCB(Upper Confidence Bound ) UCT [7] [1] ( ) Amazons[8] Lines of Action(LOA)[4] Winands [4] 1

ΕΘΝΙΚΟ ΜΕΤΣΟΒΙΟ ΠΟΛΥΤΕΧΝΕΙΟ ΣΧΟΛΗ ΗΛΕΚΤΡΟΛΟΓΩΝ ΜΗΧΑΝΙΚΩΝ ΚΑΙ ΜΗΧΑΝΙΚΩΝ ΥΠΟΛΟΓΙΣΤΩΝ

ΕΘΝΙΚΟ ΜΕΤΣΟΒΙΟ ΠΟΛΥΤΕΧΝΕΙΟ

ΒΙΟΓΡΑΦΙΚΟ ΣΗΜΕΙΩΜΑ Δρ. ΣΩΤΗΡΙΟΣ Α. ΔΑΛΙΑΝΗΣ



40 3 Journal of South China University of Technology Vol. 40 No Natural Science Edition March

Research on Economics and Management

Development of the Nursing Program for Rehabilitation of Woman Diagnosed with Breast Cancer

Optimization, PSO) DE [1, 2, 3, 4] PSO [5, 6, 7, 8, 9, 10, 11] (P)

Πανεπιστήµιο Μακεδονίας Τµήµα ιεθνών Ευρωπαϊκών Σπουδών Πρόγραµµα Μεταπτυχιακών Σπουδών στις Ευρωπαϊκές Πολιτικές της Νεολαίας

Applying Markov Decision Processes to Role-playing Game

Secure Cyberspace: New Defense Capabilities

Ανάλυση σχημάτων βασισμένη σε μεθόδους αναζήτησης ομοιότητας υποακολουθιών (C589)

Indexing Methods for Encrypted Vector Databases

F0 Estimation of Melody and Bass Lines in Real-world Musical Audio Signals

ΔΙΠΛΩΜΑΤΙΚΕΣ ΕΡΓΑΣΙΕΣ ΠΜΣ «ΠΛΗΡΟΦΟΡΙΚΗ & ΕΠΙΚΟΙΝΩΝΙΕΣ» OSWINDS RESEARCH GROUP

Scrub Nurse Robot: SNR. C++ SNR Uppaal TA SNR SNR. Vain SNR. Uppaal TA. TA state Uppaal TA location. Uppaal

1 (forward modeling) 2 (data-driven modeling) e- Quest EnergyPlus DeST 1.1. {X t } ARMA. S.Sp. Pappas [4]

2 ~ 8 Hz Hz. Blondet 1 Trombetti 2-4 Symans 5. = - M p. M p. s 2 x p. s 2 x t x t. + C p. sx p. + K p. x p. C p. s 2. x tp x t.

Chapter 1 Introduction to Observational Studies Part 2 Cross-Sectional Selection Bias Adjustment

-,,.. Fosnot. Tobbins Tippins -, -.,, -,., -., -,, -,.

Ανάκτηση Πληροφορίας. Διδάσκων: Φοίβος Μυλωνάς. Διάλεξη #03

GPGPU. Grover. On Large Scale Simulation of Grover s Algorithm by Using GPGPU

Evolution of Novel Studies on Thermofluid Dynamics with Combustion

Web 論 文. Performance Evaluation and Renewal of Department s Official Web Site. Akira TAKAHASHI and Kenji KAMIMURA

ΑΝΩΤΑΤΟ ΤΕΧΝΟΛΟΓΙΚΟ ΕΚΠΑΙΔΕΥΤΙΚΟ ΙΔΡΥΜΑ ΣΧΟΛΗ ΤΕΧΝΟΛΟΓΙΚΩΝ ΕΦΑΡΜΟΓΩΝ ΤΜΗΜΑ ΜΗΧΑΝΟΛΟΓΙΑΣ

ΒΙΟΓΡΑΦΙΚΟ ΣΗΜΕΙΩΜΑ ΗΜΕΡΟΜΗΝΙΑ ΓΕΝΝΗΣΗΣ : 1981 ΟΙΚΟΓΕΝΕΙΑΚΗ ΚΑΤΑΣΤΑΣΗ. : mkrinidi@gmail.com

[1] DNA ATM [2] c 2013 Information Processing Society of Japan. Gait motion descriptors. Osaka University 2. Drexel University a)

ΔΙΠΛΩΜΑΤΙΚΕΣ ΕΡΓΑΣΙΕΣ ΠΜΣ «ΠΛΗΡΟΦΟΡΙΚΗ & ΕΠΙΚΟΙΝΩΝΙΕς» OSWINDS RESEARCH GROUP

HIV HIV HIV HIV AIDS 3 :.1 /-,**1 +332

ΚΑΤΑΣΚΕΥΑΣΤΙΚΟΣ ΤΟΜΕΑΣ

ΚΒΑΝΤΙΚΟΙ ΥΠΟΛΟΓΙΣΤΕΣ

Transcript:

: 1 1 1 1 Singing Information Processing: Music Information Processing for Singing Voices Masataka Goto, 1 Takeshi Saitou, 1 Tomoyasu Nakano 1 and Hiromasa Fujihara 1 This paper introduces our research on a novel area of research referred to as singing information processing, which is music information research for singing voices. The concept of singing information processing systems is broad and still emerging, but this paper describes three important categories, singing understanding systems, music information retrieval systems based on singing voices, and singing synthesis systems. We first introduce singing understanding systems for synchronizing between vocal melody and corresponding lyrics, identifying the singer name, evaluating singing skills, creating hyperlinks between same phrases in the lyrics of songs, and detecting breath sounds. We then introduce music information retrieval systems based on similarity of vocal melody timbre and vocal percussion, as well as singing synthesis systems for speech-to-singing synthesis and singing-to-singing synthesis. We expect such a wide variety of research related to singing information processing to progress rapidly in the years to come. 1. 1) 4) 5) 6) 7),8) Web VOCALOID 9) YouTube 1 National Institute of Advanced Industrial Science and Technology (AIST) 1 c 2010 Information Processing Society of Japan

10),11) CD 2010 5 31 1 2. 2.1 2.1.1 LyricSynchronizer: LyricSynchronizer ( 1) 12),13) PreFEst 14) Viterbi 12),13) 1 LyricSynchronizer: 12),13) 15) 16) 17) 18) MIDI 19) 2.1.2 Singer ID: 20) 22) LyricSynchronizer 20) 22) (GMM) 23) 27) 2 c 2010 Information Processing Society of Japan

2 MiruSinger: 31),32) 28),29) 30) 2.1.3 MiruSinger: MiruSinger ( 2) 31),32) 33),34) PreFEst 14) MiruSinger 35) 36) 38) 39) 2.1.4 Hyperlinking Lyrics: Hyperlinking Lyrics 40),41) PreFEst 14) (HMM) 40),41) 2.1.5 Breath Detection: 42),43) MFCC ΔMFCC Δpower HMM 1.6kHz 1.7kHz 42),43) 44) MFCC 2.2 3 c 2010 Information Processing Society of Japan

3 VocalFinder: 22),52),53) 4 Voice Drummer: 54),55) 2) (QBH: Query-by-Humming) 45) 51) VocalFinder Voice Drummer 2.2.1 VocalFinder: VocalFinder ( 3) 22),52),53) PreFEst 14) 22),52),53) 2) 2.2.2 Voice Drummer: Voice Drummer ( 4) 54),55). HMM 54),55) 56),57) 4 c 2010 Information Processing Society of Japan

5 SingBySpeaking: 65),66) 6 VocaListener: 71),72) (Beatboxing) 58) 60) 2.3 9) (text-to-singing synthesis) 61) 64) (speech-to-singing synthesis) (singing-to-singing synthesis) 2.3.1 SingBySpeaking: SingBySpeaking ( 5) 65),66) http://www.interspeech2007.org/technical/synthesis of singing challenge.php STRAIGHT 67) (F0) F0 F0 F0 65),66) (singing-to-speech synthesis) SpeakBySinging 68) 69) 70) 2.3.2 VocaListener: VocaListener ( 6) 71),72) http://staff.aist.go.jp/t.nakano/vocalistener/ 5 c 2010 Information Processing Society of Japan

71),72) 73) 2008 VocaListener 2010 VocaListener2 74) 3. 2 3.1 PreFEst 14) PreFEst GMM LyricSynchronizer Singer ID MiruSinger Hyperlinking Lyrics VocalFinder 75) 3.2 LyricSynchronizer Hyperlinking Lyrics Voice Drummer SingBySpeaking VocaListener LyricSynchronizer Hyperlinking Lyrics PreFEst 2 (HMM) VAD (HMM) Viterbi VocaListener 12),13),71) 40),41),72) SingBySpeaking 65),66) Voice Drummer HMM 54),55) 3.3 Singer ID VocalFinder PreFEst Singer ID (LPMCC) GMM 20) 22) VocalFinder LPMCC ΔF0 GMM 22),52),53) 3.2 2 (HMM) 12),13),22),52),53) 3.4 F0 SingBySpeaking F0 F0 F0 4 F0 76) F0 F0 VocalFinder F0 ΔF0 GMM F0 22),52),53) 6 c 2010 Information Processing Society of Japan

F0 VocaListener 71),72) MiruSinger 31),32) 4. 77) 78) 79) CrestMuse 1) Vol.60, No.11, pp.675 681 (2004). 2) Casey, M., Veltkamp, R., Goto, M., Leman, M., Rhodes, C. and Slaney, M.: Content-Based Music Information Retrieval: Current Directions and Future Challenges, Proceedings of the IEEE, Vol.96, No.4, pp.668 696 (2008). 3) ( ) Vol.50, No.8, pp. 709 772 (2009). 4) ( ) Vol.51, No.6, pp.661 668 (2010). 5) Vol.64, No.10, pp.616 623 (2008). 6) Goto, M., Saitou, T., Nakano, T. and Fujihara, H.: Singing Information Processing Based on Singing Voice Modeling, Proc. of ICASSP 2010, pp.5506 5509 (2010). 7), 86 (2010). 8) Vol.130, No.6, pp.360 363 (2010). 9) ( ) Vol. 50, No. 8, pp. 723 728 (2010). 10) Cabinet Office, Government of Japan: Virtual Idol, Highlighting JAPAN through images, Vol. 2, No. 11, pp. 24 25 (2009). http://www.govonline.go.jp/pdf/hlj img/vol 0020et/24-25.pdf. 11) : Vol.25, No.1, pp.157 167 (2010). 12) Fujihara, H., Goto, M., Ogata, J., Komatani, K., Ogata, T. and Okuno, H. G.: Automatic Synchronization Between Lyrics and Music CD Recordings Based on Viterbi Alignment of Segregated Vocal Signals, Proc. of ISM 2006, pp. 257 264 (2006). 13) Fujihara, H. and Goto, M.: Three Techniques for Improving Automatic Synchronization Between Music and Lyrics: Fricative Detection, Filler Model, and Novel Feature Vectors for Vocal Activity Detection, Proc. of ICASSP 2008 (2008). 14) Goto, M.: A Real-time Music Scene Description System: Predominant-F0 Estimation for Detecting Melody and Bass Lines in Real-world Audio Signals, Speech Communication, Vol.43, No.4, pp.311 329 (2004). 15) Chen, K., Gao, S., Zhu, Y. and Sun, Q.: Popular Song and Lyrics Synchronization and Its Application to Music Information Retrieval, Proc. of MMCN 06 (2006). 16) Iskandar, D., Wang, Y., Kan, M.-Y. and Li, H.: Syllabic Level Automatic Synchronization of Music Signals and Text Lyrics, Proc. of ACM Multimedia 2006, pp. 659 662 (2006). 17) Wong, C.H., Szeto, W.M. and Wong, K.H.: Automatic Lyrics Alignment for Cantonese Popular Music, Multimedia Systems, Vol.4-5, No.12, pp.307 323 (2007). 18) Kan, M.-Y., Wang, Y., Iskandar, D., Nwe, T.L. and Shenoy, A.: LyricAlly: Automatic Synchronization of Textual Lyrics to Acoustic Music Signals, IEEE Trans. on Audio, Speech, and Language Processing, Vol.16, No.2, pp.338 349 (2008). 19) Müller, M., Kurth, F., Damm, D., Fremerey, C. and Clausen, M.: Lyrics-based Audio Retrieval and Multimodal Navigation in Music Collections, Proc.ofECDL 2007, pp.112 123 (2007). 20) Fujihara, H., Kitahara, T., Goto, M., Komatani, K., Ogata, T. and Okuno, H.G.: Singer Identification Based on Accompaniment Sound Reduction and Reliable Frame Selection, Proc. of ISMIR 2005, pp.329 336 (2005). 7 c 2010 Information Processing Society of Japan

21) Vol.47, No.6, pp.1831 1843 (2006). 22) Fujihara, H., Goto, M., Kitahara, T. and Okuno, H. G.: A Modeling of Singing Voice Robust to Accompaniment Sounds and Its Application to Singer Identification and Vocal-Timbre-Similarity-Based Music Information Retrieval, IEEE Trans. on Audio, Speech, and Language Processing, Vol.18, No.3, pp.638 648 (2010). 23) Berenzweig, A.L., Ellis, D. P.W. and Lawrence, S.: Using Voice Segments to Improve Artist Classification of Music, AES 22nd International Conference on Virtual, Synthetic, and Entertainment Audio (2002). 24) Kim, Y. E. and Whitman, B.: Singer Identificatin in Popular Music Recordings Using Voice Coding Features, Proc. of ISMIR 2002, pp.164 169 (2002). 25) Zhang, T.: Automatic Singer Identification, Proc. of ICME 2003, Vol.I, pp.33 36 (2003). 26) Maddage, N.C., Xu, C. and Wang, Y.: Singer Identification Based on Vocal and Instrumental Models, Proc.ofICPR 04, Vol.2, pp.375 378 (2004). 27) Shen, J., Cui, B., Shepherd, J. and Tan, K.-L.: Towards Efficient Automated Singer Identification in Large Music Databases, Proc. of SIGIR 06, pp.59 66 (2006). 28) Bartsch, M.A.: Automatic Singer Identification in Polyphonic Music, PhD Thesis, The University of Michigan (2004). 29) Tsai, W.-H. and Wang, H.-M.: Automatic Singer Recognition of Popular Music Recordings via Estimation and Modeling of Solo Vocal Signals, IEEE Trans. on Audio, Speech, and Language Processing, Vol.14, No.1, pp.330 341 (2006). 30) Nwe, T.L. and Li, H.: Exploring Vibrato-Motivated Acoustic Features for Singer Identification, IEEE Trans. on Audio, Speech, and Language Processing, Vol. 15, No.2, pp.519 530 (2007). 31) Nakano, T., Goto, M. and Hiraga, Y.: MiruSinger: A Singing Skill Visualization Interface Using Real-Time Feedback and Music CD Recordings as Referential Data, Proc. of ISM 2007 Workshops (Demonstrations), pp.75 76 (2007). 32) MiruSinger: / / 2007 pp.195 196 (2007). 33) Nakano, T., Goto, M. and Hiraga, Y.: An Automatic Singing Skill Evaluation Method for Unknown Melodies Using Pitch Interval Accuracy and Vibrato Features, Proc. of Interspeech 2006, pp.1706 1709 (2006). 34) Vol.48, No.1, pp.227 236 (2007). 35) Prasert, P., Iwano, K. and Furui, S.: An Automatic Singing Voice Evaluation Method for Voice Training Systems, 2-5-12 (2008). 36) Howard, D.M. and Welch, G.F.: Microcomputer-based Singing Ability Assessment and Development, Applied Acoustics, Vol.27, pp.89 102 (1989). 37) Vol.J84-D-II, No.9, pp.1933 1941 (2001). 38) Hoppe, D., Sadakata, M. and Desain, P.: Development of Real-Time Visual Feedback Assistance in Singing Training: a Review, Journal of Computer Assisted Learning, Vol.22, pp.308 316 (2006). 39) 2010-MUS-86 (2010). 40) Hyperlinking Lyrics: 2008-MUS-76-10 pp.51 56 (2008). 41) Fujihara, H., Goto, M. and Ogata, J.: Hyperlinking Lyrics: A Method for Creating Hyperlinks Between Phrases in Song Lyrics, Proc. of ISMIR 2008, pp.281 286 (2008). 42) 2008-MUS-76-15 pp.83 88 (2008). 43) Nakano, T., Ogata, J., Goto, M. and Hiraga, Y.: Analysis and Automatic Detection of Breath Sounds in Unaccompanied Singing Voice, Proc. of ICMPC 2008, pp. 387 390 (2008). 44) Ruinskiy, D. and Lavner, Y.: An Effective Algorithm for Automatic Detection and Exact Demarcation of Breath Sounds in Speech and Song Signals, IEEE Trans. on Audio, Speech, and Language Processing, Vol.15, No.3, pp.838 850 (2007). 45) Kageyama, T., Mochizuki, K. and Takashima, Y.: Melody Retrieval with Humming, Proc. of ICMC 1993, pp.349 351 (1993). 46) Ghias, A., Logan, J., Chamberlin, D. and Smith, B.: Query by Humming: Musical Information Retrieval in an Audio Database, Proc. of ACM Multimedia 1995, Vol.95, pp.231 236 (1995). 47) Sonoda, T., Goto, M. and Muraoka, Y.: A WWW-based Melody Retrieval System, Proc. of ICMC 1998, pp.349 352 (1998). 48) Dannenberg, R., Birmingham, W., Pardo, B., Meek, C., Hu, N. and Tzanetakis, G.: A Comparative Evaluation of Search Techniques for Query-By-Humming Using the MUSART Testbed, Journal of the American Society for Information Science and Technology, Vol.58, No.5, pp.687 701 (2007). 49) Suzuki, M., Hosoya, T., Ito, A. and Makino, S.: Music Information Retrieval from a Singing Voice Using Lyrics and Melody Information, EURASIP Journal on Ad- 8 c 2010 Information Processing Society of Japan

vances in Signal Processing, Vol.2007 (2007). 50) Unal, E., Chew, E., Georgiou, P. and Narayanan, S.: Challenging Uncertainty in Query by Humming Systems: A Fingerprinting Approach, IEEE Trans. on Audio, Speech, and Language Processing, Vol.16, No.2, pp.359 371 (2008). 51) : Vol.49, No.11, pp.3789 3797 (2008). 52) Fujihara, H. and Goto, M.: A Music Information Retrieval System Based on Singing Voice Timbre, Proc. of ISMIR 2007, pp.467 470 (2007). 53) VocalFinder: 2007-MUS-71-5 pp.27 32 (2007). 54) Nakano, T., Goto, M., Ogata, J. and Hiraga, Y.: Voice Drummer: A Music Notation Interface of Drum Sounds Using Voice Percussion Input, Proc. of UIST 2005 (Demos), pp.49 50 (2005). 55) Vol.48, No.1, pp.386 397 (2007). 56) Gillet, O. and Richard, G.: Drum loops retrieval from spoken queries, Journal of Intelligent Information Systems, Vol.24, No.2 3, pp.159 177 (2005). 57) Gillet, O. and Richard, G.: Indexing and Querying Drum Loops Databases, Proc. of CBMI 2005 (2005). 58) Kapur, A., Benning, M. and Tzanetakis, G.: Query-by-Beat-Boxing: Music Retrieval for the DJ, Proc. of ISMIR 2004, pp.170 177 (2004). 59) Hazan, A.: Towards Automatic Transcription of Expressive Oral Percussive Performances, Proc. of IUI 2005, pp.296 298 (2005). 60) Sinyor, E., McKay, C., Fiebrink, R., McEnnis, D. and Fujinaga, I.: Beatbox Classification Using ACE, Proc. of ISMIR 2005, pp.672 675 (2005). 61) Bonada, J. and Serra, X.: Synthesis of the Singing Voice by Performance Sampling and Spectral Models, IEEE Signal Processing Magazine, Vol.24, No.2, pp. 67 79 (2007). 62) Saino, K., Zen, H., Nankaku, Y., Lee, A. and Tokuda, K.: An HMM-based Singing Voice Synthesis System, Proc. of Interspeech 2006, pp.1141 1144 (2006). 63) VOCALOID 2007- MUS-72 pp.25 28 (2007). 64) Sinsy: HMM 2010-MUS-86 (2010). 65) Saitou, T., Goto, M., Unoki, M. and Akagi, M.: Speech-to-Singing Synthesis: Converting Speaking Voices to Singing Voices by Controlling Acoustic Features Unique to Singing Voices, Proc. of WASPAA 2007, pp.215 218 (2007). 66) SingBySpeaking: 2008-MUS-74-5 pp.25 32 (2008). 67) Kawahara, H., Masuda-Kasuse, I. and de Cheveigne, A.: Restructuring Speech Representations Using a Pitch-Adaptive Time-Frequency Smoothing and an Instantaneous-Frequency-Based F0 Extraction: Possible Role of a Repetitive Structure in Sounds, Speech Communication, Vol.27, No.3 4, pp.187 207 (1999). 68) SpeakBySinging: 2010-MUS-86 (2010). 69) Vol.47, No.6, pp.1822 1830 (2006). 70) 2010-MUS-86 (2010). 71) VocaListener: 2008-MUS-75-9 pp.49 56 (2008). 72) Nakano, T. and Goto, M.: VocaListener: A Singing-to-Singing Synthesis System Based on Iterative Parameter Estimation, Proc. of SMC 2009, pp.343 348 (2009). 73) Janer, J., Bonada, J. and Blaauw, M.: Performance-Driven Control for Sample- Based Singing Voice Synthesis, Proc.ofDAFx-06, pp.41 44 (2006). 74) VocaListener2: 2010-MUS-86 (2010). 75) Fujihara, H., Kitahara, T., Goto, M., Komatani, K., Ogata, T. and Okuno, H.G.: F0 Estimation Method for Singing Voice in Polyphonic Audio Signal Based on Statistical Vocal Model and Viterbi Search, Proc. of ICASSP 2006, pp.v 253 256 (2006). 76) Saitou, T., Unoki, M. and Akagi, M.: Development of an F0 Control Model Based on F0 Dynamic Characteristics for Singing-Voice Synthesis, Speech Communication, Vol.46, No.3 4, pp.405 417 (2005). 77) Deutsch, D.(ed.): The Psychology of Music, Academic Press (1982). 78) Titze, I.R.: Principles of Voice Production, The National Center for Voice and Speech (2000). 79) (1987). 9 c 2010 Information Processing Society of Japan