Structural and Multidisciplinary Optimization P. Duysinx, P. Tossings* 2017-2018
* CONTACT: Patricia TOSSINGS, Institut de Mathématique (B37), 0/57. Telephone: 04/366.93.73. Email: Patricia.Tossings@ulg.ac.be
TABLE OF CONTENTS
Ch.1 - Introduction to Mathematical Programming Theory
Ch.2 - Algorithms for Unconstrained Optimization: Gradient Methods
Ch.3 - Line Search Techniques
Ch.4 - Algorithms for Unconstrained Optimization: Newton, Newton-like and Quasi-Newton Methods
Ch.5 - Quasi-Unconstrained Optimization
Ch.6 - Linearly Constrained Optimization
Ch.7 - General Constrained Optimization: Dual Methods
Ch.8 - General Constrained Optimization: Transformation Methods
Chapter 3: LINE SEARCH TECHNIQUES
REMINDER: BASIC DESCENT ALGORITHM for UNCONSTRAINED OPTIMIZATION

STEP 1. Initialization
   Choose x^0 ∈ R^n. Set k = 0.
STEP 2. Direction finding
   Compute s^k such that [s^k]^T ∇f(x^k) < 0.
STEP 3. Line search (exact minimization OPTIONAL)
   Find α^k such that f(x^k + α^k s^k) = min_{α ≥ 0} f(x^k + α s^k).
STEP 4. Update
   Set x^{k+1} = x^k + α^k s^k.
STEP 5. Convergence check
   - Satisfied: STOP.
   - Unsatisfied: set k := k + 1 and go to STEP 2.
The present section of the course deals with STEP 3.

Line search
   Find α^k such that φ(α^k) = min_{α ≥ 0} φ(α), where φ(α) = f(x^k + α s^k).

NOTE. In the context of descent methods:
- line search techniques have to be efficient (repeated function and gradient evaluations);
- line search techniques do not have to be exact;
- line search techniques have to work with the positivity constraint α ≥ 0.
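The descent algorithm and the role of the line search can be sketched in code. This is a minimal illustration, not part of the course notes: the steepest-descent direction s^k = −∇f(x^k), the test function, and the crude sampled line search `coarse` are assumptions chosen only to make the sketch self-contained.

```python
import numpy as np

def descent(f, grad_f, x0, line_search, tol=1e-8, max_iter=200):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:       # STEP 5: convergence check
            break
        s = -g                            # STEP 2: s^T grad f(x) < 0 (steepest descent)
        phi = lambda a: f(x + a * s)      # phi(alpha) = f(x^k + alpha s^k)
        alpha = line_search(phi)          # STEP 3: line search over alpha >= 0
        x = x + alpha * s                 # STEP 4: update
    return x

# Illustrative run on f(x) = x1^2 + 4 x2^2 with a crude sampled line search:
f = lambda x: x[0]**2 + 4.0 * x[1]**2
grad_f = lambda x: np.array([2.0 * x[0], 8.0 * x[1]])
coarse = lambda phi: min(np.linspace(0.0, 1.0, 101), key=phi)
x_star = descent(f, grad_f, [3.0, 1.0], coarse)   # converges toward (0, 0)
```

Even this very inexact line search is enough for the descent iteration to converge on a well-conditioned quadratic, which is why exactness of STEP 3 is optional.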
Line search techniques using derivatives

HYPOTHESIS. φ is unimodal (see next slide) and at least C².
OPTIMALITY CONDITION. φ'(α) = 0   (1)
CONVENTION. The solution of (1) is denoted by α*.
- Newton-Raphson method
- Secant method
- Dichotomy methods (with derivatives)

Line search techniques without derivatives
Search a minimizer of φ without using optimality conditions.
- Quadratic interpolation
DEFINITION: Unimodal function.
A function φ is unimodal over [A,B] ⊂ R if it admits a minimizer α* ∈ [A,B] and, for any α_1, α_2 ∈ [A,B] such that α_1 < α_2:
- α_2 ≤ α*  ⇒  φ(α_1) > φ(α_2)
- α_1 ≥ α*  ⇒  φ(α_1) < φ(α_2)
Newton-Raphson method
Newton method applied to the (generally nonlinear) equation φ'(α) = 0 (see Numerical Analysis).

PRINCIPLE
Iterative method. At iteration k, φ'(α) is approximated by its tangent
   y = φ'(α^k) + (α − α^k) φ''(α^k)
and α^{k+1} is chosen as the intersection of this straight line with the horizontal axis, i.e.
   α^{k+1} = α^k − φ'(α^k) / φ''(α^k)
[Figure omitted: extract from M. Minoux, Programmation mathématique]
Newton-Raphson method - Strength

Applied to a quadratic function φ(α) = aα² + bα + c (a ≠ 0), the method gives the solution of φ'(α) = 0 (1) in one step. When a > 0, this solution corresponds to the minimum of φ.

Proof. For any α^0, we have φ'(α^0) = 2aα^0 + b and φ''(α^0) = 2a, and, as a consequence,
   α^1 = α^0 − (2aα^0 + b)/(2a) = −b/(2a)

Consequence. The method has a good asymptotic behaviour. More precisely, the method is quadratically convergent for initial points chosen sufficiently close to the solution of (1).
Newton-Raphson method - Weaknesses
- First and second derivatives are required at each iteration.
- Global convergence is not guaranteed.

[Figure omitted: extract from M. Minoux, Programmation mathématique]
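The Newton-Raphson iteration above can be sketched as follows; this is a minimal illustration assuming φ' and φ'' are available as callables, and the quadratic example reproduces the one-step property proved on the previous slide.

```python
# Newton-Raphson line search: alpha_{k+1} = alpha_k - phi'(alpha_k)/phi''(alpha_k)
def newton_raphson(dphi, d2phi, alpha0, tol=1e-10, max_iter=50):
    alpha = alpha0
    for _ in range(max_iter):
        d = dphi(alpha)
        if abs(d) < tol:            # phi'(alpha) ~ 0: stationary point reached
            break
        alpha = alpha - d / d2phi(alpha)
    return alpha

# One-step exactness on phi(alpha) = 2 alpha^2 - 3 alpha + 1 (a = 2, b = -3):
dphi = lambda a: 4.0 * a - 3.0      # phi'(alpha) = 2 a alpha + b
d2phi = lambda a: 4.0               # phi''(alpha) = 2 a
alpha_star = newton_raphson(dphi, d2phi, alpha0=10.0)
# alpha_star = -b/(2a) = 0.75, reached in a single step from any alpha0
```

Note that the sketch has no safeguard: as the slide warns, far from α* (or when φ''(α^k) ≤ 0) the iteration may diverge.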
Secant method

PRINCIPLE
In the Newton method, replace φ''(α^k) by the following approximation:
   φ''(α^k) ≈ [φ'(α^k) − φ'(α^{k−1})] / (α^k − α^{k−1})
This leads to
   α^{k+1} = α^k − φ'(α^k) (α^k − α^{k−1}) / [φ'(α^k) − φ'(α^{k−1})]

INTERPRETATION
At iteration k, φ'(α) is approximated no longer by its tangent but by the straight line passing through the points (α^{k−1}, φ'(α^{k−1})) and (α^k, φ'(α^k)) (linear interpolation).
[Figure omitted: extract from M. Minoux, Programmation mathématique]
Secant method - Strength and Weaknesses
- The method is not globally convergent BUT...
- Asymptotic behaviour: assuming that φ is C³ at α* and φ''(α*) > 0, for well chosen initial points, the method is p-superlinearly convergent with p = (1 + √5)/2 ≈ 1.618 (the golden ratio).
Secant method - Other possible interpretations

As
   α^{k+1} = α^k − φ'(α^k) (α^k − α^{k−1}) / [φ'(α^k) − φ'(α^{k−1})]
is equivalent to
   α^{k+1} = α^k − ρ^k (α^k − α^{k−1})   with   ρ^k = φ'(α^k) / [φ'(α^k) − φ'(α^{k−1})]
the secant method can be seen as a dichotomy method (see below) with this particular choice of ρ^k (false position, or regula falsi).

The secant method can also be seen as a quadratic interpolation method in which φ is approximated by a quadratic passing through the point (α^k, φ(α^k)) and having the same derivatives as φ at the points α^k and α^{k−1}.
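The secant iteration can be sketched as follows; a minimal illustration assuming only the first derivative φ' is available, with an illustrative test function whose local minimizer is α* = 1.

```python
# Secant line search: phi''(alpha_k) is replaced by the finite difference
# [phi'(alpha_k) - phi'(alpha_{k-1})] / (alpha_k - alpha_{k-1}).
def secant(dphi, alpha0, alpha1, tol=1e-10, max_iter=100):
    a_prev, a = alpha0, alpha1
    d_prev, d = dphi(alpha0), dphi(alpha1)
    for _ in range(max_iter):
        if abs(d) < tol:            # phi'(alpha) ~ 0: stationary point reached
            break
        denom = d - d_prev
        if denom == 0.0:            # flat secant line: cannot proceed
            break
        a_next = a - d * (a - a_prev) / denom
        a_prev, d_prev, a, d = a, d, a_next, dphi(a_next)
    return a

# phi(alpha) = alpha^4 - 2 alpha^2, with a local minimizer at alpha* = 1:
dphi = lambda a: 4.0 * a**3 - 4.0 * a
alpha_star = secant(dphi, 0.5, 1.5)   # converges toward 1.0
```

Only one derivative evaluation is needed per iteration, which is the practical advantage over Newton-Raphson; as stated above, convergence still requires well chosen initial points.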
Dichotomy methods (with derivatives)

For the sake of simplicity, we assume that φ'(0) < 0 (line search used in the context of descent methods) and that there exists ᾱ such that
   α ≥ ᾱ  ⇒  φ'(α) > 0
In that case, there is at least one point α* such that φ'(α*) = 0. Most of the time, α* is a local minimizer of φ over [0, +∞[.
If φ is unimodal, then α* is the unique global minimizer of φ over [0, +∞[.
Dichotomy methods

PRINCIPLE
- Determine a first interval [α_min, α_max] such that φ'(α_min) < 0 and φ'(α_max) > 0.
- Reduce the size of this interval until the desired precision level is reached.

MORE PRECISELY
At iteration k, compute φ'(α) at the point α = α^k defined by
   α^k = ρ^k α_min + (1 − ρ^k) α_max = α_max − ρ^k (α_max − α_min),   ρ^k ∈ ]0,1[
- If φ'(α^k) < 0, set α_min := α^k
- If φ'(α^k) > 0, set α_max := α^k

AN INTERESTING PARTICULAR CASE
For ρ^k = 0.5 (bisection), the method converges linearly with a convergence ratio equal to 0.5.
Dichotomy methods

TO DETERMINE THE FIRST INTERVAL [α_min, α_max]
1) Choose a step size h and set α_min = 0.
2) Compute φ'(h).
   - If φ'(h) < 0, set α_min := h, h := 2h and go back to 2).
   - If φ'(h) > 0, set α_max = h. STOP.

Note
- h too large directly gives [α_min, α_max] = [0, h] BUT leads to many iterations of the dichotomy process.
- h too small implies many steps to find [α_min, α_max] BUT leads to fewer iterations of the dichotomy process.
Find the appropriate compromise.
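The two phases above (bracketing by step doubling, then interval reduction) can be sketched as follows, for the bisection case ρ^k = 0.5; the default step h, the tolerance, and the test function are illustrative assumptions.

```python
# Dichotomy with derivatives, assuming phi'(0) < 0 (descent context):
# double h until phi'(h) > 0, then halve the bracket on the sign of phi'.
def bracket_and_bisect(dphi, h=1.0, tol=1e-8, max_iter=100):
    a_min, a_max = 0.0, h
    while dphi(a_max) < 0.0:            # bracketing: keep phi'(a_min) < 0 < phi'(a_max)
        a_min, a_max = a_max, 2.0 * a_max
    for _ in range(max_iter):
        if a_max - a_min < tol:         # desired precision reached
            break
        mid = 0.5 * (a_min + a_max)     # rho^k = 0.5: bisection
        if dphi(mid) < 0.0:
            a_min = mid
        else:
            a_max = mid
    return 0.5 * (a_min + a_max)

# phi(alpha) = (alpha - 3)^2, so phi'(alpha) = 2 (alpha - 3) vanishes at alpha* = 3:
alpha_star = bracket_and_bisect(lambda a: 2.0 * (a - 3.0))
```

With the default h = 1, the bracketing phase produces [2, 4] in two doublings, and each bisection then halves the interval, the linear convergence with ratio 0.5 stated above.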
A method without derivatives: Quadratic interpolation

PRINCIPLE
Let α_1, α_2 and α_3 be three values of α such that
   α_1 < α_2 < α_3,   φ(α_1) ≥ φ(α_2),   φ(α_3) ≥ φ(α_2)
Approximate φ on the interval [α_1, α_3] by the quadratic passing through the points (α_1, φ(α_1)), (α_2, φ(α_2)) and (α_3, φ(α_3)).
The equation of this quadratic is given by (Lagrange interpolation)
   q(α) = Σ_{i=1}^{3} φ(α_i) Π_{j≠i} (α − α_j) / (α_i − α_j)
The minimum of q is attained at
   α_4 = (1/2) [r_23 φ(α_1) + r_31 φ(α_2) + r_12 φ(α_3)] / [s_23 φ(α_1) + s_31 φ(α_2) + s_12 φ(α_3)]
where
   r_ij = α_i² − α_j²   and   s_ij = α_i − α_j

α_4 is taken as an approximation of the optimum of φ on the interval [α_1, α_3].
The procedure is repeated with the three new points
   (α_1, α_2, α_3) := (α_2, α_4, α_3)  if α_2 < α_4 < α_3 and φ(α_4) ≤ φ(α_2)
                   := (α_1, α_2, α_4)  if α_2 < α_4 < α_3 and φ(α_4) > φ(α_2)
                   := (α_1, α_4, α_2)  if α_1 < α_4 < α_2 and φ(α_4) ≤ φ(α_2)
                   := (α_4, α_2, α_3)  if α_1 < α_4 < α_2 and φ(α_4) > φ(α_2)
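The closed-form expression for α_4 can be checked numerically; this is a minimal sketch of a single interpolation step, with an illustrative quadratic test function on which the step is exact.

```python
# One quadratic-interpolation step: alpha_4 from the closed-form formula,
# with r_ij = alpha_i^2 - alpha_j^2 and s_ij = alpha_i - alpha_j.
def quad_interp_step(a1, a2, a3, f1, f2, f3):
    r23, r31, r12 = a2**2 - a3**2, a3**2 - a1**2, a1**2 - a2**2
    s23, s31, s12 = a2 - a3, a3 - a1, a1 - a2
    return 0.5 * (r23 * f1 + r31 * f2 + r12 * f3) / (s23 * f1 + s31 * f2 + s12 * f3)

# On a quadratic phi(alpha) = (alpha - 2)^2 + 1 the step recovers the minimizer exactly:
phi = lambda a: (a - 2.0)**2 + 1.0
a4 = quad_interp_step(0.0, 1.0, 3.0, phi(0.0), phi(1.0), phi(3.0))
# a4 = 2.0, the exact minimizer of phi
```

Only function values are used: this is what makes the method derivative-free.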
Quadratic interpolation - INITIALIZATION OF THE PROCESS

Aim. For a given Δ, determine three values a < b < c of α satisfying
   b − a = c − b = Δ,   φ(b) ≤ φ(a) and φ(b) ≤ φ(c)

Method. Choose an initial value α^0 and an arbitrary step size δ. Set α^1 = α^0 + δ and compute φ(α^0) and φ(α^1). Two situations can occur.

First situation: φ(α^1) ≤ φ(α^0)
Set α^k = α^{k−1} + 2^{k−1} δ and compute φ(α^k) as long as φ decreases. Stop the process at the first iteration (say p) for which φ begins to increase.
Then
   φ(α^0) ≥ φ(α^1) ≥ ... ≥ φ(α^{p−1}) BUT φ(α^{p−1}) < φ(α^p)
Compute finally φ(α^{p+1}) for
   α^{p+1} = α^p − 2^{p−2} δ
The procedure leads to four points
   α^{p−2} < α^{p−1} < α^{p+1} < α^p
separated by the same distance Δ = 2^{p−2} δ.
The points a, b, c are obtained by eliminating
- α^p if φ(α^{p−1}) < φ(α^{p+1})
- α^{p−2} if φ(α^{p+1}) < φ(α^{p−1})

Second situation: φ(α^1) > φ(α^0)
Replace α^0 by α^1 and δ by −δ, and proceed as above.
Quadratic interpolation - CONVERGENCE RESULTS

- The global convergence of the method is ensured for any continuous and unimodal function φ.
- Asymptotic behaviour: for a C³ function, the convergence is p-superlinear with p ≈ 1.3.

VARIANTS
Use other polynomial interpolations (cubic, for example) or take the derivatives into account (see the secant method presented above).
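The full iteration, with the four-case point replacement given two slides back, can be sketched as follows. This is an illustrative implementation, assuming a valid starting triple a1 < a2 < a3 with φ(a2) ≤ φ(a1) and φ(a2) ≤ φ(a3); the tolerance, iteration cap, and the stopping rule when α_4 leaves the triple are implementation choices, not from the slides.

```python
# Repeated quadratic interpolation with the four-case point replacement.
def quad_interp_search(phi, a1, a2, a3, tol=1e-8, max_iter=100):
    for _ in range(max_iter):
        if a3 - a1 < tol:
            break
        f1, f2, f3 = phi(a1), phi(a2), phi(a3)
        num = (a2**2 - a3**2) * f1 + (a3**2 - a1**2) * f2 + (a1**2 - a2**2) * f3
        den = (a2 - a3) * f1 + (a3 - a1) * f2 + (a1 - a2) * f3
        if den == 0.0:                 # degenerate parabola: stop
            break
        a4 = 0.5 * num / den
        if a2 < a4 < a3:
            if phi(a4) <= f2:
                a1, a2 = a2, a4        # keep (a2, a4, a3)
            else:
                a3 = a4                # keep (a1, a2, a4)
        elif a1 < a4 < a2:
            if phi(a4) <= f2:
                a3, a2 = a2, a4        # keep (a1, a4, a2)
            else:
                a1 = a4                # keep (a4, a2, a3)
        else:
            break                      # a4 coincides with a point or leaves (a1, a3)
    return a2

# On a quadratic the very first interpolation is exact:
alpha_star = quad_interp_search(lambda a: (a - 2.0)**2 + 1.0, 0.0, 1.0, 3.0)
```

Note that each iteration preserves the bracketing property φ(a2) ≤ φ(a1), φ(a2) ≤ φ(a3), which is what the unimodality-based global convergence result above relies on.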