Sparse Modeling and Model Selection

Σχετικά έγγραφα
MIA MONTE CARLO ΜΕΛΕΤΗ ΤΩΝ ΕΚΤΙΜΗΤΩΝ RIDGE ΚΑΙ ΕΛΑΧΙΣΤΩΝ ΤΕΤΡΑΓΩΝΩΝ

: Monte Carlo EM 313, Louis (1982) EM, EM Newton-Raphson, /. EM, 2 Monte Carlo EM Newton-Raphson, Monte Carlo EM, Monte Carlo EM, /. 3, Monte Carlo EM

Fundamentals of Signal Processing for Communications Systems

Lecture 34: Ridge regression and LASSO

Buried Markov Model Pairwise

Probabilistic Approach to Robust Optimization

J. of Math. (PRC) Banach, , X = N(T ) R(T + ), Y = R(T ) N(T + ). Vol. 37 ( 2017 ) No. 5

HMY 795: Αναγνώριση Προτύπων

Chinese Journal of Applied Probability and Statistics Vol.28 No.3 Jun (,, ) 应用概率统计 版权所用,,, EM,,. :,,, P-. : O (count data)

552 Lee (2006),,, BIC,. : ; ; ;. 2., Poisson (Zero-Inflated Poisson Distribution), ZIP. Y ZIP(φ, λ), φ + (1 φ) exp( λ), y = 0; P {Y = y} = (1 φ) exp(

EM Baum-Welch. Step by Step the Baum-Welch Algorithm and its Application 2. HMM Baum-Welch. Baum-Welch. Baum-Welch Baum-Welch.

Modern Bayesian Statistics Part III: high-dimensional modeling Example 3: Sparse and time-varying covariance modeling

Feasible Regions Defined by Stability Constraints Based on the Argument Principle


CSJ. Speaker clustering based on non-negative matrix factorization using i-vector-based speaker similarity

Nondifferentiable Convex Functions

Chapter 1 Introduction to Observational Studies Part 2 Cross-Sectional Selection Bias Adjustment

476,,. : 4. 7, MML. 4 6,.,. : ; Wishart ; MML Wishart ; CEM 2 ; ;,. 2. EM 2.1 Y = Y 1,, Y d T d, y = y 1,, y d T Y. k : p(y θ) = k α m p(y θ m ), (2.1

46 2. Coula Coula Coula [7], Coula. Coula C(u, v) = φ [ ] {φ(u) + φ(v)}, u, v [, ]. (2.) φ( ) (generator), : [, ], ; φ() = ;, φ ( ). φ [ ] ( ) φ( ) []

Introduction to Risk Parity and Budgeting

Bayesian statistics. DS GA 1002 Probability and Statistics for Data Science.

Research on model of early2warning of enterprise crisis based on entropy

NOB= Dickey=Fuller Engle-Granger., P. ( ). NVAR=Engle-Granger/Dickey-Fuller. 1( ), 6. CONSTANT/NOCONST (C) Dickey-Fuller. NOCONST NVAR=1. TREND/NOTREN

Βιογραφικό Σημείωμα. Διεύθυνση επικοινωνίας: Τμήμα Μαθηματικών, Πανεπιστήμιο Πατρών

3 : 373 R-LSR-TLS TSVD Tikhonov Tikhonov Ax b, A R m n,b R n,m n (1) min Ax-b Lx δ (5),A ;b ;x,δ ;L 1 b [9] A Lagrange min Ax-b = Δb Ax=b+Δb () L ( x,

ΑΡΑΙΑ ΟΙΚΟΝΟΜΕΤΡΙΚΑ ΜΟΝΤΕΛΑ ΥΨΗΛΩΝ ΔΙΑΣΤΑΣΕΩΝ

Research on Economics and Management

Web-based supplementary materials for Bayesian Quantile Regression for Ordinal Longitudinal Data

Applying Markov Decision Processes to Role-playing Game

Optimization, PSO) DE [1, 2, 3, 4] PSO [5, 6, 7, 8, 9, 10, 11] (P)

Fourier transform, STFT 5. Continuous wavelet transform, CWT STFT STFT STFT STFT [1] CWT CWT CWT STFT [2 5] CWT STFT STFT CWT CWT. Griffin [8] CWT CWT

Anomaly Detection with Neighborhood Preservation Principle

ADT


y = f(x)+ffl x 2.2 x 2X f(x) x x p T (x) = 1 Z T exp( f(x)=t ) (2) x 1 exp Z T Z T = X x2x exp( f(x)=t ) (3) Z T T > 0 T 0 x p T (x) x f(x) (MAP = Max

. (1) 2c Bahri- Bahri-Coron u = u 4/(N 2) u

Stabilization of stock price prediction by cross entropy optimization

Bayesian Discriminant Feature Selection

255 (log-normal distribution) 83, 106, 239 (malus) 26 - (Belgian BMS, Markovian presentation) 32 (median premium calculation principle) 186 À / Á (goo

ΟΙΚΟΝΟΜΙΚΟ ΠΑΝΕΠΙΣΤΗΜΙΟ ΑΘΗΝΩΝ

90 [, ] p Panel nested error structure) : Lagrange-multiple LM) Honda [3] LM ; King Wu, Baltagi, Chang Li [4] Moulton Randolph ANOVA) F p Panel,, p Z

Εργαστήριο Επεξεργασίας Σηµάτων και Τηλεπικοινωνιών Κινητά ίκτυα Επικοινωνιών

1 n-gram n-gram n-gram [11], [15] n-best [16] n-gram. n-gram. 1,a) Graham Neubig 1,b) Sakriani Sakti 1,c) 1,d) 1,e)


ΒΙΟΓΡΑΦΙΚΟ ΣΗΜΕΙΩΜΑ. Παναγιώτης Μερκούρης ΕΚΠΑΙΔΕΥΣΗ

1 (forward modeling) 2 (data-driven modeling) e- Quest EnergyPlus DeST 1.1. {X t } ARMA. S.Sp. Pappas [4]

Contents QR Decomposition: An Annotated Bibliography Introduction to Adaptive Filters

Homework 1: Solutions

ER-Tree (Extended R*-Tree)

Mantel & Haenzel (1959) Mantel-Haenszel

Supplementary Appendix

Βιογραφικό Σημείωμα. (τελευταία ενημέρωση 20 Ιουλίου 2015) 14 Ιουλίου 1973 Αθήνα Έγγαμος

FORMULAS FOR STATISTICS 1

Διαχείριση ενεργειακών πόρων & συστημάτων Πρακτικά συνεδρίου(isbn: )

SVM. Research on ERPs feature extraction and classification

Vol. 38 No Journal of Jiangxi Normal University Natural Science Nov. 2014

{takasu, Conditional Random Field

Βασίλειος Μαχαιράς Πολιτικός Μηχανικός Ph.D.

Introduction to the ML Estimation of ARMA processes

Biostatistics for Health Sciences Review Sheet

5 Haar, R. Haar,. Antonads 994, Dogaru & Carn Kerkyacharan & Pcard 996. : Haar. Haar, y r x f rt xβ r + ε r x β r + mr k β r k ψ kx + ε r x, r,.. x [,

ΒΙΟΓΡΑΦΙΚΟ ΣΗΜΕΙΩΜΑ ΣΤΥΛΙΑΝΗΣ Κ. ΣΟΦΙΑΝΟΠΟΥΛΟΥ Αναπληρώτρια Καθηγήτρια. Τµήµα Τεχνολογίας & Συστηµάτων Παραγωγής.

Table 1: Military Service: Models. Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7 Model 8 Model 9 num unemployed mili mili num unemployed

ΕΘΝΙΚΟ ΜΕΤΣΟΒΙΟ ΠΟΛΥΤΕΧΝΕΙΟ

Statistics 104: Quantitative Methods for Economics Formula and Theorem Review

JOURNAL OF APPLIED SCIENCES Electronics and Information Engineering. Cyclic MUSIC DOA TN (2012)

SECTION II: PROBABILITY MODELS

Summary of the model specified

Bundle Adjustment for 3-D Reconstruction: Implementation and Evaluation


Yahoo 2. SNS Social Networking Service [3,5,12] Copyright c by ORSJ. Unauthorized reproduction of this article is prohibited.

ΠΑΡΑΓΟΝΤΙΚΗ ΑΝΑΛΥΣΗ FACTOR ANALYSIS

Quick algorithm f or computing core attribute

Queensland University of Technology Transport Data Analysis and Modeling Methodologies

1530 ( ) 2014,54(12),, E (, 1, X ) [4],,, α, T α, β,, T β, c, P(T β 1 T α,α, β,c) 1 1,,X X F, X E F X E X F X F E X E 1 [1-2] , 2 : X X 1 X 2 ;

Βιογραφικό Σημείωμα. Διδακτορικό Δίπλωμα, Τμήμα Στατιστικής και Ασφαλιστικής Επιστήμης, Πανεπιστήμιο Πειραιά, 3/2009

Vol.4-DCC-8 No.8 Vol.4-MUS-5 No.8 4// 3 3 Hanning (T ) 3 Hanning 3T (y(t)w(t)) dt =.5 T y (t)dt. () STRAIGHT F 3 TANDEM-STRAIGHT[] 3 F F 3 [] F []. :

Statistical analysis of extreme events in a nonstationary context via a Bayesian framework. Case study with peak-over-threshold data

172,,,,. P,. Box (1980)P, Guttman (1967)Rubin (1984)P, Meng (1994), Gelman(1996)De la HorraRodriguez-Bernal (2003). BayarriBerger (2000)P P.. : Casell

ΓΡΑΜΜΙΚΟΣ & ΔΙΚΤΥΑΚΟΣ ΠΡΟΓΡΑΜΜΑΤΙΣΜΟΣ

On the linear convergence of the general alternating proximal gradient method for convex minimization


APPENDICES APPENDIX A. STATISTICAL TABLES AND CHARTS 651 APPENDIX B. BIBLIOGRAPHY 677 APPENDIX C. ANSWERS TO SELECTED EXERCISES 679

ΒΙΟΓΡΑΦΙΚΟ ΣΗΜΕΙΩΜΑ ΠΡΟΣΩΠΙΚΑ ΣΤΟΙΧΕΙΑ ΣΠΟΥΔΕΣ

Efficient Semiparametric Estimators via Preliminary Estimators for Kullback-Leibler Minimizers

Multilevel models for analyzing people s daily moving behaviour

Yoshifumi Moriyama 1,a) Ichiro Iimura 2,b) Tomotsugu Ohno 1,c) Shigeru Nakayama 3,d)

Supplementary Materials for Evolutionary Multiobjective Optimization Based Multimodal Optimization: Fitness Landscape Approximation and Peak Detection

ACTA MATHEMATICAE APPLICATAE SINICA Nov., ( µ ) ( (

Gradient Descent for Optimization Problems With Sparse Solutions

Gap Safe Screening Rules for Sparse-Group Lasso

Στοιχεία εισηγητή Ημερομηνία: 10/10/2017

Εισαγωγή σε μεθόδους Monte Carlo Ενότητα 3: Δειγματοληπτικές μέθοδοι

«ΑΝΑΠΣΤΞΖ ΓΠ ΚΑΗ ΥΩΡΗΚΖ ΑΝΑΛΤΖ ΜΔΣΔΩΡΟΛΟΓΗΚΩΝ ΓΔΓΟΜΔΝΩΝ ΣΟΝ ΔΛΛΑΓΗΚΟ ΥΩΡΟ»

Εφαρμογή Υπολογιστικών Τεχνικών στην Γεωργία

High order interpolation function for surface contact problem

ΠΡΟΤΕΙΝΟΜΕΝΑ ΘΕΜΑΤΑ ΔΙΠΛΩΜΑΤΙΚΩΝ ΕΡΓΑΣΙΩΝ ΓΙΑ ΤΟ ΕΑΡΙΝΟ ΕΞΑΜΗΝΟ Εισηγητής: Νίκος Πλόσκας Επίκουρος Καθηγητής ΤΜΠΤ

ΠΑΡΑΡΤΗΜΑ 1 (4 ου ΜΑΘΗΜΑΤΟΣ): ΠΑΡΑ ΕΙΓΜΑΤΑ ΕΛΕΓΧΩΝ ΥΠΟΘΕΣΕΩΝ, ΕΠΙΛΟΓΗΣ ΜΟΝΤΕΛΩΝ ΚΑΙ ΜΕΤΑΒΛΗΤΩΝ

Schedulability Analysis Algorithm for Timing Constraint Workflow Models

, Snowdon. . Frahm.

Transcript:

15 Sparse Modeling and Model Selection L L L β β δ>0 limp(β β>δ)=0 n (β β)n (0, ) n

p =(,, ) n {(, )i=1,, n} =(,, ) X = (,, ) =(,, ) X X n X =0, j=1,, p. =0, 1 n =1, X =Xβ+ε. β=(β, β ) ε ε N (0, σ I ) σ β =arg min ( Xβ +λβ ) β λ 0 m a=(a,,a ) a = a λ=0 β λ β 0 λ=0 β =(X X ) X E[β ]=β (X X ) λ>0 β =(X X +λi ) X λ (X X +λi ) λ (X X λ+i ) β λ (X X )

β =arg min ( Xβ +λβ ). β λβ =λ β λβ = λ β β λβ β =0 X X X =ni X X ni n>p X X X =ni X X =ni {n(β 2β β )+λβ }+const. β =(β,,β ) β β β j=1,, p β =sign(β ) β λ A A>0 A = 0 2n. β =X n X n λ(2n) λ(2n) t t t β >s j s λ s t β=0 p k β k+1 RSS = Xβ X X =ni RSS= X A n A k+1 X A X A RSS X λ t λ(2n) t n p

X X ni β X ( Xβ )= λ 2 s s s β {sign(β s )} [ 1, 1] β 0, β =0. λ A={jβ 0} β A βa C β A=(X A X A) X A λ 2 sa, β A C =0. β A, s A β, s A λ β =0 β =0 λ λ λ =max* n * X j n j =arg max* n λ λ A={j } λ λ=λ s =1 j j j j A={j, j } λ λ A={j, j } λ λ* λ=λ* λ λ* s 1 1 λ=λ* s { 1, 1} β =0 s=±1 β =0 λ λ* λ=λ* s=±1 X λ=2n β β =0 s=±1 AA {j} λ λ* s { 1, 1} λ=λ* s 1 1 A AA{j} λ* β A A (XA X A) XA (XA X A) λ sa 2 0 β A (XA X A) p β β=(β 0, β 0 C ) S ={jβ 0} S ={1,, p}{s } S s s β 0 C 0 β 0

β C 0 =0 n (β 0 β 0 ) N 0, σ 0 0 =limx 0 X 0 n β 0 λ n n p p n p p n s n p R= 1 n Xβ Xβ R E[R]= 1 n EX (X X ) X ε = p n σ p E[R] S 0 n<p n>p p n E[R]= s n σ s p E[R] S S L L Xβ +λβ. β = 1{β 0} L λ=2σ C E[R] p λ=2σ (log p+log log p) E[R] 4s σ log p n log log p + n +O(n ) log p n log p E[R] 0 s σ 0 s E[R] L p L X ϕ β C 0 3β 0 β L

β 0 (β X Xβ)s ϕ β 0 s β 0 X X X X β C 0 3β 0 0<δ<1 λ=4σ 2n log(2pδ) 1 δ R Cs σ log p n 1 +O(n ) ϕ L n log p R 0 L X L X L sign(β )=sign(β) X β γc (X 0 C X 0 )(X 0 X 0 ) 1 γ min β >(λ)(2n) 0 C (λ)=λ[(x 0 X 0 n) + 4σ C ] m a a = maxa m m A=(A ) A =max λ>4σγ 2n log p A λ Λ (X 0 X 0 n)> C Λ ( ) C n P[sign(β )=sign(β)] 1 4 exp( c λ n) c p>n sign(β )=sign(β) X β X X

n<p n p s X X X L R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society : Series B Statistical Methodology, vol. 58, no. 1, pp. 267-288, 1996. B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, Least angle regression with discussion, The Annals of Statistics, vol. 32, pp. 407-499, 2004. J. Friedman, T. Hastie, and R. Tibshirani, Sparse inverse covariance estimation with the graphical lasso, Biostatistics, vol. 9, no. 3, pp. 432-441, 2008. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1-122, 2011. K. Knight and W. Fu, Asymptotics for lasso-type estimators, The Annals of Statistics, vol. 28, no. 5, pp. 1356-1378, 2000. P.J. Bickel, Y. Ritov, and A.B. Tsybakov, Simultaneous analysis of lasso and dantzig selector, The Annals of Statistics, vol. 37, no. 4, pp. 1705-1732, 2009. M.J. Wainwright, Sharp thresholds for high-dimensional and noisy sparsity recovery using-constrained quadratic programming lasso, IEEE Trans. Inf. Theory, vol. 55, no. 5, pp. 2183-2202, 2009. P. Zhao and B. Yu, On model selection consistency of lasso, J. Mach. Learn. Res., vol. 7, no. 2, p. 2541, 2007. J. Friedman, T. Hastie, and R. Tibshirani, Regularization paths for generalized linear models via coordinate descent, Journal of Statistical Software, vol. 33, no. 1, pp. 1-22, 2010. M.Y. Park and T. Hastie, L1-regularization path algorithm for generalized linear models, Journal of the Royal Statistical Society : Series BStatistical Methodology, vol. 69, no. 4, pp. 659-677, 2007. N. Meinshausen and P. Bühlmann, High-dimensional graphs and variable selection with the lasso, The Annals of Statistics, vol. 34, no. 3, pp. 1436-1462, 2006. M. Yuan and Y. Lin, Model selection and estimation in the gaussian graphical model, Biometrika, vol. 94, no. 1, pp. 19-35, 2007. D.M. Witten, R. Tibshirani, and T. Hastie, A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis, Biostatistics, vol. 10, no. 3, p. kxp008, 2009. H. Zou, T. Hastie, and R. Tibshirani, Sparse principal component analysis, Journal of Computational and Graphical Statistics, vol. 15, no. 2, pp. 265-286, 2006. M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society : Series BStatistical Methodology, vol. 68, no. 1, pp. 49-67, 2006. R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight, Sparsity and smoothness via the fused lasso, Journal of the Royal Statistical Society : Series BStatistical Methodology, vol. 67, no. 1, pp. 91-108, 2005. 2010 A.E. Hoerl and R.W. Kennard, Ridge regression : Biased estimation for nonorthogonal problems, Technometrics, vol. 12, no. 1, pp. 55-67, 1970. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2nd ed., Springer, New York, 2008. R. Tibshirani, Sparsity and the lasso, 2015, http:www.stat.cmu. edu ~ larry=smlsparsity. pdf R.J. Tibshirani, The lasso problem and uniqueness, Electronic Journal of Statistics, vol. 7, pp. 1456-1490, 2013. J. Fan and R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, J. Am. Stat. Assoc., vol. 96, no. 456, pp. 1348-1360, 2001. H. Zou, The adaptive lasso and its oracle properties, J. Am. Stat. Assoc., vol. 101, no. 476, pp. 1418-1429, 2006. D.P. Foster and E.I. George, The risk inflation criterion for multiple regression, The Annals of Statistics, vol. 22, no. 4, pp. 1947-1975, 1994. S. Xiong, Better subset regression, Biometrika, vol. 101, no. 1, pp.

71-84, 2014. P. Bhlmann and S. van de Geer, Statistics for High-Dimensional Data : Methods, Theory and Applications, 1st ed., Springer Publishing Company, Incorporated, 2011. S. Foucart and H. Rauhut, A mathematical introduction to compressive sensing, Springer, 2013. vol. 1908, pp. 39-48, 2014. L. Chen and J.Z. Huang, Sparse reduced-rank regression for simultaneous dimension reduction and variable selection, J. Am. Stat. Assoc., vol. 107, no. 500, pp. 1533-1545, 2012. M. Vounou, T.E. Nichols, G. Montana, and A.D.N. Initiative, Discovering genetic associations with high-dimensional neuroimaging phenotypes : A sparse reduced-rank regression approach, Neuroimage, vol. 53, no. 3, pp. 1147-1159, 2010. R. Lockhart, J. Taylor, R.J. Tibshirani, and R. Tibshirani, A significance test for the lasso, The Annals of Statistics, vol. 42, no. 2, pp. 413-468, 2014. E.J. Candès, J. Romberg, and T. Tao, Robust uncertainty principles : Exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489-509, 2006. マンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンションマンション