How to design estimators fitting with accuracy measures: Theories and applications in bioinformatics


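Michiaki HAMADA, Toshiyuki SATO

1

Related estimators appear in Miyazawa's alignment method 22), in HMM-based alignment 4), and in RNA secondary structure prediction.

2

Problem 1. Given data $D$ and a predictive space $Y$ (the set of all candidate solutions) with a probability distribution $p(y \mid D)$ on $Y$, predict one point $y \in Y$. This step is called decoding.

Definition 1 (gain function). For a reference $\theta \in Y$ and a prediction $y \in Y$, a gain function is a map $G : Y \times Y \to \mathbb{R}^{+}$, written $G(\theta, y)$.

Definition 2 (MEG estimator). The maximum expected gain (MEG) estimator is

$$\hat{y}^{(\mathrm{MEG})} = \operatorname*{argmax}_{y \in Y} \int_{\Theta} G(\theta, y)\, p(\theta \mid D)\, d\theta .$$

For a discrete space the integral is a sum over $\theta$.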
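The MEG definition can be made concrete by brute force on a tiny discrete space. The sketch below is an illustration only: the toy posterior, the two gain functions, and the names `meg_estimate`, `zero_one_gain`, and `agreement_gain` are assumptions introduced here, not taken from the paper. It shows that the choice of gain changes the prediction even when the posterior is fixed.

```python
from itertools import product


def meg_estimate(candidates, posterior, gain):
    """Return the candidate y that maximizes sum_theta gain(theta, y) * p(theta | D).

    posterior maps each theta (a tuple) to its probability; this replaces the
    integral in the definition by a finite sum (discrete-space assumption).
    """
    def expected_gain(y):
        return sum(prob * gain(theta, y) for theta, prob in posterior.items())

    return max(candidates, key=expected_gain)


def zero_one_gain(theta, y):
    """Gain 1 only for an exact match; its MEG estimate is the most probable theta."""
    return 1.0 if theta == y else 0.0


def agreement_gain(theta, y):
    """Number of positions where prediction and reference agree."""
    return float(sum(1 for t, v in zip(theta, y) if t == v))


# Made-up posterior over Y = {0,1}^2 (hypothetical numbers).
posterior = {(0, 0): 0.4, (1, 0): 0.3, (1, 1): 0.3}
candidates = list(product((0, 1), repeat=2))

print(meg_estimate(candidates, posterior, zero_one_gain))   # (0, 0)
print(meg_estimate(candidates, posterior, agreement_gain))  # (1, 0)
```

With the exact-match gain the estimate is the single most probable point, while the position-wise gain prefers (1, 0), because position 1 is set in 60% of the posterior mass. Enumeration is only feasible for toy spaces; real predictive spaces need the dynamic programs discussed later.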

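When the gain function is an accuracy measure, the MEG estimator is also called a maximum expected accuracy (MEA) estimator. The same estimator can be stated with a loss function in place of the gain.

3

Suppose the predictive space is a set of binary vectors, $Y \subset \{0, 1\}^n$. Pairwise alignment and RNA secondary structure prediction both fit this setting. For two sequences $x$ and $x'$, an alignment is encoded by $y_{ik} = 1$ if $x_i$ is aligned with $x'_k$ and $y_{ik} = 0$ otherwise; a secondary structure is encoded in the same way, with 1 marking a base pair.

For a reference $\theta \in Y$ and a prediction $y \in Y$, let $\mathrm{TP}(\theta, y)$, $\mathrm{TN}(\theta, y)$, $\mathrm{FP}(\theta, y)$ and $\mathrm{FN}(\theta, y)$ be the numbers of true positives, true negatives, false positives and false negatives. A gain that rewards correct predictions (TP, TN) and penalizes incorrect ones (FP, FN) is

$$G(\theta, y) = \alpha_1 \mathrm{TP}(\theta, y) + \alpha_2 \mathrm{TN}(\theta, y) - \alpha_3 \mathrm{FP}(\theta, y) - \alpha_4 \mathrm{FN}(\theta, y), \qquad (1)$$

where $\alpha_k$ ($k = 1, 2, 3, 4$) are weights. Sensitivity (SEN), positive predictive value (PPV), the Matthews correlation coefficient (MCC) and the F-score are all functions of TP, TN, FP and FN; see 1).

Definition 3 (γ-centroid estimator). For $\gamma \ge 0$, the γ-centroid estimator is the MEG estimator with gain

$$G(\theta, y) = \gamma\, \mathrm{TP}(\theta, y) + \mathrm{TN}(\theta, y). \qquad (2)$$

For $\gamma = 1$ this is the centroid estimator 2).

The MEG estimator for gain (1) is equivalent to the γ-centroid estimator with

$$\gamma = \frac{\alpha_1 + \alpha_4}{\alpha_2 + \alpha_3}.$$

This follows because $\mathrm{TP} + \mathrm{FN}$ and $\mathrm{TN} + \mathrm{FP}$ depend only on $\theta$, so (1) equals $(\alpha_1 + \alpha_4)\mathrm{TP} + (\alpha_2 + \alpha_3)\mathrm{TN}$ up to a term that does not depend on $y$. The parameter $\gamma$ adjusts the balance between SEN and PPV.

The γ-centroid estimator depends on the posterior only through the marginal probabilities

$$p_i = \sum_{\theta \in Y} I(\theta_i = 1)\, p(\theta \mid D).$$

For RNA secondary structures the $p_i$ are base-pairing probabilities, which can be computed by McCaskill's algorithm 21). If $Y = \{0, 1\}^n$, the γ-centroid estimator sets $y_i = 1$ exactly for those $i$ with $p_i > 1/(\gamma + 1)$ and $y_i = 0$ otherwise. If $Y$ is a proper subset of $\{0, 1\}^n$, as for alignments and secondary structures, the same thresholding at $1/(\gamma + 1)$ still yields a prediction in $Y$ when $\gamma \in [0, 1]$ 9).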
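The thresholding rule for the γ-centroid estimator is short enough to write out. The following is a minimal sketch assuming the unconstrained case Y = {0,1}^n and a strict inequality at the threshold (the tie convention is an assumption here); the function names and the marginal values are hypothetical.

```python
def gamma_centroid(marginals, gamma):
    """Set y_i = 1 where the marginal p_i exceeds 1 / (gamma + 1), else 0."""
    threshold = 1.0 / (gamma + 1.0)
    return tuple(1 if p > threshold else 0 for p in marginals)


def expected_gain(marginals, y, gamma):
    """Expected value of gamma*TP + TN for prediction y.

    By linearity of expectation, position i contributes gamma * p_i if y_i = 1
    and (1 - p_i) if y_i = 0, so each position can be decided on its own.
    """
    return sum(gamma * p * yi + (1.0 - p) * (1 - yi)
               for p, yi in zip(marginals, y))


marginals = [0.9, 0.4, 0.2]  # made-up marginal probabilities
print(gamma_centroid(marginals, 1.0))  # threshold 1/2 -> (1, 0, 0)
print(gamma_centroid(marginals, 2.0))  # threshold 1/3 -> (1, 1, 0)
```

Raising γ lowers the threshold, so more positions are predicted positive: sensitivity tends to go up and PPV down, which is the trade-off γ is meant to control.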

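For $\gamma > 1$ the γ-centroid alignment is computed by a Needleman-Wunsch-type 24) dynamic program:

$$M_{i,k} = \max \begin{cases} M_{i-1,k-1} + (\gamma + 1)\, p_{ik} - 1 \\ M_{i-1,k} \\ M_{i,k-1} \end{cases}$$

where $M_{i,k}$ is the optimal score for aligning $x_1 \cdots x_i$ with $x'_1 \cdots x'_k$.

The same holds for secondary structures. For $\gamma \in [0, 1]$ the base pairs with $p_{ij} > 1/(\gamma + 1)$ form the γ-centroid structure; for $\gamma > 1$ it is computed by a Nussinov-type 25) dynamic program:

$$M_{i,j} = \max \begin{cases} M_{i+1,j} \\ M_{i,j-1} \\ M_{i+1,j-1} + (\gamma + 1)\, p_{ij} - 1 \\ \max_{k} \left[ M_{i,k} + M_{k+1,j} \right] \end{cases}$$

where $M_{i,j}$ is the optimal score for the subsequence $x_i x_{i+1} \cdots x_j$.

A tree topology on a leaf set $S$ with $n = |S|$ leaves can be encoded as a binary vector over the $2^{n-1} - n - 1$ nontrivial splits of $S$. The Hamming distance between such vectors is the topological distance of Robinson and Foulds 26), which corresponds to the 1-centroid estimator.

4

MEG estimators with gain (1) do not directly optimize SEN, PPV, F-score or MCC. For RNA secondary structure prediction with respect to MCC or F-score, Hamada et al. proposed maximizing a pseudo-expected accuracy 12). Write an accuracy measure as a function of TP, TN, FP and FN (for example MCC or F-score; see 1)):

$$\mathrm{Acc} = f(\mathrm{TP}, \mathrm{TN}, \mathrm{FP}, \mathrm{FN}).$$

The pseudo-expected accuracy of a prediction $y$ is

$$\widehat{\mathrm{Acc}}(y) = f(\overline{\mathrm{TP}}, \overline{\mathrm{TN}}, \overline{\mathrm{FP}}, \overline{\mathrm{FN}}),$$

where $\overline{X}$ ($X = \mathrm{TP}, \mathrm{FP}, \mathrm{TN}, \mathrm{FN}$) is the expected value of $X$, computed from the marginal probabilities $\{p_i\}$ (base-pairing probabilities in the case of RNA secondary structure) 12).

5

The γ-centroid framework extends to predictions that use additional sequences: when aligning $x$ and $x'$, a third sequence $z$ can inform the prediction 13, 14). This is related to the probabilistic consistency transformation (PCT) 4, 28); see 9).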
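Both recursions for γ > 1 translate directly into table-filling code. The sketch below computes only the optimal scores, without traceback, and makes simplifying assumptions that are not in the text: boundary values are zero, `p` is given as a nested list of probabilities (only entries with i < j are read in the folding case), and no minimum hairpin-loop length is enforced. The function names are hypothetical.

```python
def centroid_alignment_score(p, gamma):
    """Needleman-Wunsch-type recursion; p[i][k] is the match probability of x_i and x'_k."""
    n = len(p)
    m = len(p[0]) if n else 0
    # M[i][k]: best score for the prefixes of length i and k (row/column 0 are empty prefixes).
    M = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for k in range(1, m + 1):
            match = M[i - 1][k - 1] + (gamma + 1.0) * p[i - 1][k - 1] - 1.0
            M[i][k] = max(match, M[i - 1][k], M[i][k - 1])
    return M[n][m]


def centroid_fold_score(p, gamma):
    """Nussinov-type recursion; p[i][j] (i < j) is the base-pairing probability."""
    n = len(p)
    if n == 0:
        return 0.0
    # M[i][j]: best score for the subsequence x_i..x_j; single positions score 0.
    M = [[0.0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            inner = M[i + 1][j - 1] if i + 1 <= j - 1 else 0.0
            best = max(M[i + 1][j], M[i][j - 1],
                       inner + (gamma + 1.0) * p[i][j] - 1.0)
            for k in range(i, j):  # bifurcation into x_i..x_k and x_{k+1}..x_j
                best = max(best, M[i][k] + M[k + 1][j])
            M[i][j] = best
    return M[0][n - 1]


# Made-up probabilities: two nested base pairs (0,3) and (1,2).
bp = [[0.0, 0.0, 0.0, 0.9],
      [0.0, 0.0, 0.8, 0.0],
      [0.0, 0.0, 0.0, 0.0],
      [0.0, 0.0, 0.0, 0.0]]
print(centroid_fold_score(bp, 1.0))
```

Each selected pair contributes (γ + 1)p − 1, which is positive exactly when p > 1/(γ + 1), so the dynamic program picks the best mutually compatible set of pairs above the threshold.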

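6

6.1 RNA secondary structure prediction. CONTRAfold 5) predicts RNA secondary structures with an MEG-type estimator. The γ-centroid estimator has been evaluated for this problem with respect to SEN and PPV 10). A related posterior decoder for HMMs is given by Kall et al. 16), and CONTRAST 7) is a further example in gene prediction.

6.2 Alignment. Schwartz et al. introduced the alignment metric accuracy (AMA) as an accuracy measure for alignments 29). LAST 6) uses the γ-centroid estimator, which is tied to SEN and PPV rather than to AMA.

6.3 Other applications. The γ-centroid estimator is also used for the alignment of structured RNAs (CentroidAlign) 13), for RNA secondary structure prediction with homologous sequences (CentroidHomfold) 14), and, by Kato et al., for RNA-RNA interaction prediction (RactIP) 17).

7

Table 1 summarizes the estimators and the accuracy measures (SPS, SEN, PPV, MCC, F-score) used by existing methods. Related work includes 5, 18, 20, 30), parameter training with respect to accuracy measures 8, 31), and further uses of the γ-centroid estimator 6, 32).

Acknowledgments: NEDO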

Table 1 (reference; software; target; estimator; accuracy measure, where given)

Holmes & Durbin 15); ∞-centroid (a); SPS (b)
Miyazawa 22); 1-centroid
Schwartz et al. 29); AMA (c)
Do et al. 4); ProbCons; ∞-centroid; SPS
Roshan et al. 27); ProbAlign; ∞-centroid; SPS
Sahraeian et al. 28); PicXAA; ∞-centroid; SPS
Frith et al. 6); LAST; γ-centroid; SEN, PPV
Hamada et al. 10); CentroidFold; RNA secondary structure; γ-centroid; SEN, PPV
Hamada et al. 12); CentroidFold; RNA secondary structure; MCC/F-score; MCC, F-score
Do et al. 5); CONTRAfold; RNA secondary structure
Lu et al. 20); MaxExpect; RNA secondary structure
Ding et al. 3); Sfold; RNA secondary structure; 1-centroid
Hamada et al. 14); CentroidHomfold; RNA secondary structure; γ-centroid; SEN, PPV
Hamada et al. 11); CentroidAlifold; RNA secondary structure; γ-centroid; SEN, PPV
Seemann et al. 30); PETfold; RNA secondary structure
Knudsen & Hein 19); Pfold; RNA secondary structure
Kiryu et al. 18); McCaskill-MEA; RNA secondary structure
Hamada et al. 13); CentroidAlign; RNA alignment; γ-centroid; SEN, PPV
Tabei et al. 32); SCARNA-LM; RNA alignment; γ-centroid; SEN, PPV
Kato et al. 17); RactIP; RNA-RNA interaction; γ-centroid; SEN, PPV
Kall et al. 16)
Do et al. 7); CONTRAST
Michal et al. 23); HIV

a: the limit γ → ∞ in (2); b: sum-of-pairs score; c: alignment metric accuracy.

CBRC

1) P. Baldi, S. Brunak, Y. Chauvin, C. A. Andersen, and H. Nielsen. Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics, 16:412-424, May 2000.
2) L. Carvalho and C. Lawrence. Centroid estimation in discrete high-dimensional spaces with applications in biology. Proc. Natl. Acad. Sci. U.S.A., 105:3209-3214, 2008.
3) Y. Ding, C. Chan, and C. Lawrence. RNA secondary structure prediction by centroids in a Boltzmann weighted ensemble. RNA, 11:1157-1166, Aug 2005.
4) C. Do, M. Mahabhashyam, M. Brudno, and S. Batzoglou. ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res., 15:330-340, Feb 2005.
5) C. Do, D. Woods, and S. Batzoglou. CONTRAfold: RNA secondary structure prediction without physics-based models. Bioinformatics, 22:e90-98, Jul 2006.
6) M. C. Frith, M. Hamada, and P. Horton. Parameters for accurate genome alignment. BMC Bioinformatics, 11:80, Feb 2010.
7) S. Gross, C. Do, M. Sirota, and S. Batzoglou. CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction. Genome Biol., 8:R269, 2007.
8) S. S. Gross, O. Russakovsky, C. B. Do, and S. Batzoglou. Training conditional random fields for maximum labelwise accuracy. In B. Schölkopf, J. Platt, and T. Hoffman, editors, Advances in Neural Information Processing Systems 19, pages 529-536. MIT Press, Cambridge, MA, 2007.
9) M. Hamada, H. Kiryu, W. Iwasaki, and K. Asai. Generalized Centroid Estimators in Bioinformatics. PLoS ONE 6(2): e16450, 2011.
10) M. Hamada, H. Kiryu, K. Sato, T. Mituyama, and K. Asai. Prediction of RNA secondary structure using generalized centroid estimators. Bioinformatics, 25:465-473, 2009.
11) M. Hamada, K. Sato, and K. Asai. Improving the accuracy of predicting secondary structure for aligned RNA sequences. Nucleic Acids Res., 2010 (in press).
12) M. Hamada, K. Sato, and K. Asai. Prediction of RNA secondary structure by maximizing pseudo-expected accuracy. BMC Bioinformatics, 11:586, 2010.
13) M. Hamada, K. Sato, H. Kiryu, T. Mituyama, and K. Asai. CentroidAlign: fast and accurate aligner for structured RNAs by maximizing expected sum-of-pairs score. Bioinformatics, 25:3236-3243, 2009.
14) M. Hamada, K. Sato, H. Kiryu, T. Mituyama, and K. Asai. Predictions of RNA secondary structure by combining homologous sequence information. Bioinformatics, 25:i330-338, 2009.
15) I. Holmes and R. Durbin. Dynamic programming alignment accuracy. J. Comput. Biol., 5:493-504, 1998.
16) L. Kall, A. Krogh, and E. L. Sonnhammer. An HMM posterior decoder for sequence feature prediction that includes homology information. Bioinformatics, 21 Suppl 1:i251-257, 2005.
17) Y. Kato, K. Sato, M. Hamada, Y. Watanabe, K. Asai, and T. Akutsu. RactIP: fast accurate prediction of RNA-RNA interaction using integer programming. Bioinformatics, 2010 (in press).
18) H. Kiryu, T. Kin, and K. Asai. Robust prediction of consensus secondary structures using averaged base pairing probability matrices. Bioinformatics, 23:434-441, 2007.
19) B. Knudsen and J. Hein. Pfold: RNA secondary structure prediction using stochastic context-free grammars. Nucleic Acids Res., 31(13):3423-3428, Jul 2003.
20) Z. J. Lu, J. W. Gloor, and D. H. Mathews. Improved RNA secondary structure prediction by maximizing expected pair accuracy. RNA, 15:1805-1813, Oct 2009.
21) J. S. McCaskill. The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers, 29(6-7):1105-1119, May 1990.
22) S. Miyazawa. A reliable sequence alignment method based on probabilities of residue correspondences. Protein Eng., 8:999-1009, Oct 1995.
23) M. Nánási, T. Vinar, and B. Brejová. The Highest Expected Reward Decoding for HMMs with Application to Recombination Detection. CoRR, abs/1001.4499, 2010.
24) S. Needleman and C. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol., 48:443-453, Mar 1970.
25) R. Nussinov, G. Pieczenik, J. Griggs, and D. Kleitman. Algorithms for loop matchings. SIAM Journal of Applied Mathematics, 35:68-82, 1978.
26) D. F. Robinson and L. R. Foulds. Comparison of phylogenetic trees. Mathematical Biosciences, 53(1-2):131-147, February 1981.
27) U. Roshan and D. Livesay. Probalign: multiple sequence alignment using partition function posterior probabilities. Bioinformatics, 22:2715-2721, Nov 2006.
28) S. M. Sahraeian and B. J. Yoon. PicXAA: greedy probabilistic construction of maximum expected accuracy alignment of multiple sequences. Nucleic Acids Res., 38:4917-4928, Aug 2010.
29) A. S. Schwartz, E. W. Myers, and L. Pachter. Alignment metric accuracy, 2005. http://arxiv.org:qbio/0510052.
30) S. Seemann, J. Gorodkin, and R. Backofen. Unifying evolutionary and thermodynamic information for RNA folding of multiple alignments. Nucleic Acids Res., 36:6355-6362, 2008.
31) J. Suzuki, E. McDermott, and H. Isozaki. Training conditional random fields with multivariate evaluation measures. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 217-224, Sydney, Australia, July 2006. Association for Computational Linguistics.
32) Y. Tabei and K. Asai. A local multiple alignment method for detection of non-coding RNA sequences. Bioinformatics, 25:1498-1505, Jun 2009.