Syntax Analysis Part V


Transcript:

Syntax Analysis Part V
Chapter 4: Bottom-Up Parsing
Slides adapted from: Robert van Engelen, Florida State University

LR Parsers
LR parsers are table-driven algorithms, much like LL parsers.
The parse table is used to detect handles.
A CFG for which we can construct a deterministic parse table is called an LR grammar.

LR Parsers
The parser maintains states to keep track of where we stand in the parsing process.
The parser uses a parse table to update the parser states and to uniquely detect handles when they appear at the top of the stack.

LR(0) Items
An LR(0) item is any production with a dot at some position in the right-hand side.
Thus, a production A → X Y Z has four items:
[A → • X Y Z]
[A → X • Y Z]
[A → X Y • Z]
[A → X Y Z •]
The production A → ε has a single item [A → •].
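
The enumeration of items can be sketched in a few lines of Python. The tuple encoding (head, body, dot position) and the helper names lr0_items and show are assumptions made for this illustration; they are not part of the slides.

```python
def lr0_items(head, body):
    """Return every LR(0) item of the production head -> body.

    An item is encoded as (head, body, dot), where dot is the number of
    right-hand-side symbols that lie to the left of the bullet.
    """
    # An epsilon production has an empty body, so it yields exactly one item.
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]


def show(item):
    """Format an item in the bracket notation used on the slides."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["•"] + list(body[dot:])
    return f"[{head} → {' '.join(symbols)}]"


for item in lr0_items("A", ["X", "Y", "Z"]):
    print(show(item))
```

A production with n right-hand-side symbols always yields n + 1 items, one per bullet position.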

LR(0) Items
The bullet in an item [A → α • β] marks the portion α of the right-hand side that has already been processed.
Convention: we add to the grammar a new start symbol S' and a production of the form S' → S.

LR States
States are sets of LR(0) items.
State I records that:
we have seen A of C → A B;
we have seen ε of B → a.
State I: [C → A • B], [B → • a]
(Figure: partial parse tree with nodes C, A, B and a.)

LR States
States can represent several alternative trees at a time.
State I: [E → T •], [T → T • * F]
(Figure: the alternative trees for E → T and T → T * F.)

LR States
Trees might share a common left part.
State I: [T → T * • F], [F → • ( F )], [F → • id]
(Figure: alternative trees for T → T * F that share the left part T *.)

closure()
closure(I) takes as input the current state I.
closure(I) returns the updated state obtained by applying all possible rule expansions.
We always have I ⊆ closure(I).

closure()
1. Start with closure(I) = I
2. If [A → α • B β] ∈ closure(I), then for each production B → γ in the grammar, add the item [B → • γ] to closure(I) if it is not already there
3. Repeat 2 until no new items can be added

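Example
Grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
Start: closure({ [E' → • E] }) = { [E' → • E] }
Add [E → • γ]: { [E' → • E], [E → • E + T], [E → • T] }
Add [T → • γ]: { [E' → • E], [E → • E + T], [E → • T], [T → • T * F], [T → • F] }
Add [F → • γ]: { [E' → • E], [E → • E + T], [E → • T], [T → • T * F], [T → • F], [F → • ( E )], [F → • id] }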
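
The closure computation can be sketched as a worklist loop. The dictionary encoding of the grammar and the item encoding (head, body, dot) are assumptions made for this illustration; only the grammar itself comes from the slides.

```python
# Expression grammar from the example, augmented with E' -> E.
# Keys are nonterminals; every other symbol is treated as a terminal.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """Return the LR(0) closure of a set of (head, body, dot) items."""
    result = set(items)
    worklist = list(items)
    while worklist:
        _, body, dot = worklist.pop()
        # Expand only when the bullet stands directly before a nonterminal.
        if dot < len(body) and body[dot] in grammar:
            for production in grammar[body[dot]]:
                new_item = (body[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


start = closure({("E'", ("E",), 0)}, GRAMMAR)
print(len(start))  # 7 items, as in the example
```

The worklist replaces the "repeat until no new items can be added" step: each item is expanded once, when it is first added.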

Kernel Items
Kernel items: the initial item [S' → • S] and all items [B → α • γ] whose dot is not at the left end of the right-hand side.
Nonkernel items: items [B → • γ] with the dot at the left end of the right-hand side.
The function closure() adds only nonkernel items.

goto()
goto(I, X) takes as input the current state I and the processed grammar symbol X.
goto(I, X) returns the updated state by moving the bullet over the processed symbol X, discarding items with symbols other than X after the bullet, and applying the closure() function.

goto()
1. For each [A → α • X β] ∈ I, add [A → α X • β] to goto(I, X), if not already there
2. Compute closure() of the resulting set
Intuitively, goto(I, X) is the set of items that are valid for the viable prefix γX when I is the set of items that are valid for γ.

Example
Grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
Suppose I = { [E' → • E], [E → • E + T], [E → • T], [T → • T * F], [T → • F], [F → • ( E )], [F → • id] }
Then goto(I, E) = closure({ [E' → E •], [E → E • + T] }) = { [E' → E •], [E → E • + T] }

Example
Grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
Suppose I = { [E' → E •], [E → E • + T] }
Then goto(I, +) = closure({ [E → E + • T] }) = { [E → E + • T], [T → • T * F], [T → • F], [F → • ( E )], [F → • id] }
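
The two goto examples can be reproduced with a short sketch. The encodings (a grammar dictionary and (head, body, dot) items) and the variable names after_E and after_plus are assumptions for this illustration.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """LR(0) closure of a set of (head, body, dot) items."""
    result = set(items)
    worklist = list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            for production in grammar[body[dot]]:
                new_item = (body[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    """Move the bullet over `symbol`, drop all other items, then close."""
    moved = {
        (head, body, dot + 1)
        for head, body, dot in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved, grammar)


I0 = closure({("E'", ("E",), 0)}, GRAMMAR)
after_E = goto(I0, "E", GRAMMAR)          # first example: goto(I, E)
after_plus = goto(after_E, "+", GRAMMAR)  # second example: goto(I, +)
print(len(after_E), len(after_plus))
```

An empty result means that the state has no transition on that symbol.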

DFA for Shift/Reduce Decisions
The functions closure() and goto() are used to construct the set of all possible parsing states, along with their connections.
This results in a deterministic finite automaton, which is used for shift/reduce decisions.

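DFA for Shift/Reduce Decisions
Grammar:
S → C
C → A B
A → a
B → a
State I_0 (start): [S → • C], [C → • A B], [A → • a]
State I_1 = goto(I_0, C): [S → C •]
State I_2 = goto(I_0, A): [C → A • B], [B → • a]
State I_3 = goto(I_0, a): [A → a •]
State I_4 = goto(I_2, B): [C → A B •]
State I_5 = goto(I_2, a): [B → a •]
In state I_3 we can only reduce A → a (not B → a).
(Figure: DFA with start state 0 and transitions 0 -C-> 1, 0 -A-> 2, 0 -a-> 3, 2 -B-> 4, 2 -a-> 5.)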

DFA for Shift/Reduce Decisions
Grammar:
S → C
C → A B
A → a
B → a
The states of the DFA are used to determine if a handle is on top of the stack.
Stack            Input   Next action
$ 0              aa$     shift (and goto state 3)
$ 0 a 3          a$      reduce A → a (goto 2)
$ 0 A 2          a$      shift (goto 5)
$ 0 A 2 a 5      $       reduce B → a (goto 4)
$ 0 A 2 B 4      $       reduce C → A B (goto 1)
$ 0 C 1          $       accept (S → C)
DFA transitions used in this trace, in order:
goto(I_0, a): from State I_0: [S → • C], [C → • A B], [A → • a] to State I_3: [A → a •]
goto(I_0, A): from State I_0 to State I_2: [C → A • B], [B → • a]
goto(I_2, a): from State I_2 to State I_5: [B → a •]
goto(I_2, B): from State I_2 to State I_4: [C → A B •]
goto(I_0, C): from State I_0 to State I_1: [S → C •]


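Model of an LR Parser
(Figure: the LR parsing program (driver) reads the input a_1 a_2 … a_i … a_n $, maintains the stack s_0 … X_{m-1} s_{m-1} X_m s_m with s_m on top, consults the action table (shift, reduce, accept, error) and the goto table (DFA), and produces the output.)
The tables are constructed with the LR(0) method, the SLR method, the LR(1) method, or the LALR(1) method.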

LR Parsing (Driver)
Configuration (= LR parser state), written as (stack, input):
($ s_0 X_1 s_1 X_2 s_2 … X_m s_m, a_i a_{i+1} … a_n $)
If action[s_m, a_i] = shift s, then push a_i, push s, and advance the input:
($ s_0 X_1 s_1 X_2 s_2 … X_m s_m a_i s, a_{i+1} … a_n $)
If action[s_m, a_i] = reduce A → β and goto[s_{m-r}, A] = s with r = |β|, then pop 2r symbols, push A, and push s:
($ s_0 X_1 s_1 X_2 s_2 … X_{m-r} s_{m-r} A s, a_i a_{i+1} … a_n $)
If action[s_m, a_i] = accept, then stop.
If action[s_m, a_i] = error, then attempt recovery.

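Example LR Parse Table
Grammar:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id
Legend: sN = shift and goto state N (for example, s5 = shift and goto 5); rN = reduce by production N (for example, r1 = reduce by production 1); acc = accept; - = error.
Columns id to $ form the action table; columns E, T, F form the goto table.
state  id   +    *    (    )    $     E   T   F
0      s5   -    -    s4   -    -     1   2   3
1      -    s6   -    -    -    acc   -   -   -
2      -    r2   s7   -    r2   r2    -   -   -
3      -    r4   r4   -    r4   r4    -   -   -
4      s5   -    -    s4   -    -     8   2   3
5      -    r6   r6   -    r6   r6    -   -   -
6      s5   -    -    s4   -    -     -   9   3
7      s5   -    -    s4   -    -     -   -   10
8      -    s6   -    -    s11  -     -   -   -
9      -    r1   s7   -    r1   r1    -   -   -
10     -    r3   r3   -    r3   r3    -   -   -
11     -    r5   r5   -    r5   r5    -   -   -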

Example LR Parsing
Grammar:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id
Stack                 Input        Next action
$ 0                   id*id+id$    shift 5
$ 0 id 5              *id+id$      reduce 6 goto 3
$ 0 F 3               *id+id$      reduce 4 goto 2
$ 0 T 2               *id+id$      shift 7
$ 0 T 2 * 7           id+id$       shift 5
$ 0 T 2 * 7 id 5      +id$         reduce 6 goto 10
$ 0 T 2 * 7 F 10      +id$         reduce 3 goto 2
$ 0 T 2               +id$         reduce 2 goto 1
$ 0 E 1               +id$         shift 6
$ 0 E 1 + 6           id$          shift 5
$ 0 E 1 + 6 id 5      $            reduce 6 goto 3
$ 0 E 1 + 6 F 3       $            reduce 4 goto 9
$ 0 E 1 + 6 T 9       $            reduce 1 goto 1
$ 0 E 1               $            accept
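
The driver loop can be sketched in Python using the example parse table for the expression grammar. Two choices are assumptions of this sketch rather than part of the slides: the stack holds only states (grammar symbols are implied by the states, so a reduction pops r entries instead of 2r), and error handling simply raises SyntaxError instead of attempting recovery.

```python
# Production number -> (left-hand side, length of right-hand side).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

# Action table: missing entries are errors.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}


def parse(tokens):
    """Parse a token list; return the production numbers in reduction order."""
    stack = [0]                      # state stack, start state at the bottom
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        action = ACTION[stack[-1]].get(tokens[pos])
        if action is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if action == "acc":
            return reductions
        number = int(action[1:])
        if action[0] == "s":         # shift: push the state, advance the input
            stack.append(number)
            pos += 1
        else:                        # reduce: pop |rhs| states, then goto
            head, length = PRODUCTIONS[number]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][head])
            reductions.append(number)


print(parse(["id", "*", "id", "+", "id"]))
```

For the input id*id+id, the returned reduction sequence follows the "reduce" rows of the trace above, read top to bottom.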

Constructing the Set of LR(0) States of a Grammar
1. The grammar is augmented with a new start symbol S' and the unique production S' → S
2. Initially, set C = { closure({ [S' → • S] }) } (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that goto(I, X) ∉ C and goto(I, X) ≠ ∅, add the set of items goto(I, X) to C
4. Repeat 3 until no more sets can be added to C
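
The construction can be sketched for the small grammar S → C, C → A B, A → a, B → a used in the DFA slides. The encodings, the function name canonical_collection, and the fixed symbol order (chosen so that the states come out numbered as on the slides) are assumptions for this illustration.

```python
GRAMMAR = {"S": [("C",)], "C": [("A", "B")], "A": [("a",)], "B": [("a",)]}
SYMBOLS = ["C", "A", "B", "a"]   # N ∪ T, in a fixed order


def closure(items, grammar):
    """LR(0) closure of a set of (head, body, dot) items."""
    result = set(items)
    worklist = list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            for production in grammar[body[dot]]:
                new_item = (body[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    """Move the bullet over `symbol` and close the resulting kernel."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


def canonical_collection(grammar, start_item, symbols):
    """Return the list of LR(0) states and the DFA transition map."""
    states = [closure({start_item}, grammar)]
    transitions = {}
    # The list grows while we iterate, which implements "repeat until
    # no more sets can be added".
    for state in states:
        for symbol in symbols:
            target = goto(state, symbol, grammar)
            if not target:
                continue             # empty set: no transition
            if target not in states:
                states.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions


states, transitions = canonical_collection(GRAMMAR, ("S", ("C",), 0), SYMBOLS)
print(len(states))
print(transitions)
```

With this symbol order the sketch yields six states and the five transitions of the DFA for this grammar.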

LR(0) Grammars
Construct a finite automaton based on the LR(0) states and the goto() function.
If there is no conflict in the action table, then the grammar is LR(0).
A shift-reduce parser based on the LR(0) table alone is too restricted for practical use; it needs to be extended.

SLR Grammars
SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing.
SLR eliminates some LR(0) conflicts by populating the parsing table with reductions A → α only on symbols in FOLLOW(A).
Grammar:
S → E
E → id + E
E → id
State I_0: [S → • E], [E → • id + E], [E → • id]
State I_2 = goto(I_0, id): [E → id • + E], [E → id •]
In state I_2: shift on + (goto(I_2, +)); FOLLOW(E) = {$}, thus reduce E → id on $.

SLR Parsing Table
Reductions do not fill entire rows; otherwise the table is the same as for LR(0).
Grammar:
1. S → E
2. E → id + E
3. E → id
state  id   +    $     E
0      s2   -    -     1
1      -    -    acc   -
2      -    s3   r3    -
3      s2   -    -     4
4      -    -    r2    -
State 2: shift on +; FOLLOW(E) = {$}, thus reduce on $.

SLR Parsing
Build the LR(0) DFA: construct the LR(0) items (states) and determine the transitions.
Construct the SLR parsing table from the DFA.
The LR parser program uses the SLR parsing table to determine shift/reduce operations.

Constructing SLR Parsing Tables
1. Augment the grammar with S' → S
2. Construct the set C = { I_0, I_1, …, I_n } of LR(0) states
3. If [A → α • a β] ∈ I_i and goto(I_i, a) = I_j, then set action[i, a] = shift j
4. If [A → α •] ∈ I_i, then set action[i, a] = reduce A → α for all a ∈ FOLLOW(A) (apply only if A ≠ S')
5. If [S' → S •] is in I_i, then set action[i, $] = accept
6. If goto(I_i, A) = I_j, then set goto[i, A] = j
7. Repeat 3-6 until no more entries are added
8. The initial state i is the I_i holding the item [S' → • S]
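
The table construction can be sketched for the grammar 1. S → E, 2. E → id + E, 3. E → id from the SLR slides. The sketch hard-codes FOLLOW(E) = {$} as stated on the slides instead of computing it; the item encoding (production index, dot) and all function names are assumptions for this illustration.

```python
# Productions are numbered 1..3 on the slides; index 0 here is production 1.
PRODUCTIONS = [("S", ("E",)), ("E", ("id", "+", "E")), ("E", ("id",))]
NONTERMINALS = {"S", "E"}
SYMBOLS = ["E", "id", "+"]
FOLLOW = {"E": {"$"}}   # taken from the slides, not computed here


def closure(items):
    """LR(0) closure over (production index, dot) items."""
    result, worklist = set(items), list(items)
    while worklist:
        prod, dot = worklist.pop()
        body = PRODUCTIONS[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for index, (head, _) in enumerate(PRODUCTIONS):
                if head == body[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    worklist.append((index, 0))
    return frozenset(result)


def goto(items, symbol):
    return closure({(p, d + 1) for p, d in items
                    if d < len(PRODUCTIONS[p][1]) and PRODUCTIONS[p][1][d] == symbol})


def set_action(action, key, value):
    """Store a table entry; two different entries in one cell are a conflict."""
    if action.get(key, value) != value:
        raise ValueError(f"conflict at {key}: {action[key]} vs {value}")
    action[key] = value


def build_slr_table():
    states = [closure({(0, 0)})]
    action, goto_table = {}, {}
    for i, state in enumerate(states):       # the list grows while iterating
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if not target:
                continue
            if target not in states:
                states.append(target)
            j = states.index(target)
            if symbol in NONTERMINALS:
                goto_table[(i, symbol)] = j              # step 6
            else:
                set_action(action, (i, symbol), f"s{j}")  # step 3
        for prod, dot in state:
            head, body = PRODUCTIONS[prod]
            if dot == len(body):
                if prod == 0:
                    set_action(action, (i, "$"), "acc")   # step 5
                else:
                    for terminal in FOLLOW[head]:         # step 4
                        set_action(action, (i, terminal), f"r{prod + 1}")
    return action, goto_table


action, goto_table = build_slr_table()
print(action)
print(goto_table)
```

If the reductions were entered for every terminal instead of only FOLLOW(E), state 2 would get both s3 and r3 on +, which is the LR(0) conflict that SLR avoids here.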

Example SLR Grammar
Augmented grammar:
1. C' → C
2. C → A B
3. A → a
4. B → a
LR(0) states:
I_0 = closure({ [C' → • C] }) (start): [C' → • C], [C → • A B], [A → • a]
I_1 = goto(I_0, C) = closure({ [C' → C •] }) (final): [C' → C •]
I_2 = goto(I_0, A): [C → A • B], [B → • a]
I_3 = goto(I_0, a): [A → a •]
I_4 = goto(I_2, B): [C → A B •]
I_5 = goto(I_2, a): [B → a •]

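Example SLR Parsing Table
Grammar:
1. C' → C
2. C → A B
3. A → a
4. B → a
State I_0: [C' → • C], [C → • A B], [A → • a]
State I_1: [C' → C •]
State I_2: [C → A • B], [B → • a]
State I_3: [A → a •]
State I_4: [C → A B •]
State I_5: [B → a •]
(Figure: DFA with start state 0 and transitions 0 -C-> 1, 0 -A-> 2, 0 -a-> 3, 2 -B-> 4, 2 -a-> 5.)
state  a    $     C   A   B
0      s3   -     1   2   -
1      -    acc   -   -   -
2      s5   -     -   -   4
3      r3   -     -   -   -
4      -    r2    -   -   -
5      -    r4    -   -   -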

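SLR and Ambiguity
Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR.
Consider for example the unambiguous grammar:
S → L = R | R
L → * R | id
R → L
I_0: [S' → • S], [S → • L = R], [S → • R], [L → • * R], [L → • id], [R → • L]
I_1: [S' → S •]
I_2: [S → L • = R], [R → L •]
I_3: [S → R •]
I_4: [L → * • R], [R → • L], [L → • * R], [L → • id]
I_5: [L → id •]
I_6: [S → L = • R], [R → • L], [L → • * R], [L → • id]
I_7: [L → * R •]
I_8: [R → L •]
I_9: [S → L = R •]
State I_2 has a shift/reduce conflict, because = is in FOLLOW(R): action[2, =] = s6 and action[2, =] = r5.
The grammar therefore has no SLR parsing table.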

Viable Prefixes
The LR(0) automaton recognizes the strings of grammar symbols that can appear on the stack of a shift-reduce parser that works on rightmost derivations.
If the stack holds α and the remaining input buffer is x, then S ⇒* α x by a rightmost derivation.

Viable Prefixes
A viable prefix is a prefix of a rightmost sentential form that does not continue past the end of the rightmost handle of that form.
LR(0) automata recognize viable prefixes.
The LR(0) automaton is the deterministic version of a nondeterministic automaton with ε-transitions whose states are LR(0) items.

LR(1) Grammars
SLR is too simple.
LR(1) parsing uses lookahead to avoid unnecessary conflicts in the parsing table.
LR(1) item = LR(0) item + lookahead
LR(0) item: [A → α • β]
LR(1) item: [A → α • β, a]

SLR Versus LR(1)
Split the SLR states by adding LR(1) lookahead.
Unambiguous grammar:
1. S → L = R
2. S → R
3. L → * R
4. L → id
5. R → L
SLR state I_2: [S → L • = R], [R → L •]
Split by lookahead:
lookahead =: [S → L • = R], action[2, =] = s6
lookahead $: [R → L •], action[2, $] = r5
The parser should not reduce on =, because no right-sentential form begins with R =.

LR(1) Items
An LR(1) item [A → α • β, a] contains a lookahead terminal a, meaning: with α already on top of the stack, expect to see βa.
For items of the form [A → α •, a], the lookahead a is used to reduce A → α only if the next input is a.
For items of the form [A → α • β, a] with β ≠ ε, the lookahead has no effect.

The Closure Operation for LR(1) Items
1. Start with closure(I) = I
2. If [A → α • B β, a] ∈ closure(I), then for each production B → γ in the grammar and each terminal b ∈ FIRST(βa), add the item [B → • γ, b] to closure(I) if it is not already there
3. Repeat 2 until no new items can be added
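
The LR(1) closure can be sketched for the grammar S → L = R | R, L → * R | id, R → L. The sketch assumes a grammar without ε-productions, so FIRST(βa) is the FIRST set of the first symbol of β, or {a} when β is empty; the encodings and function names are likewise assumptions for this illustration.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}


def first(symbol, grammar, seen=None):
    """FIRST of a single symbol, assuming no epsilon productions."""
    if symbol not in grammar:
        return {symbol}              # a terminal is its own FIRST set
    if seen is None:
        seen = set()
    if symbol in seen:
        return set()                 # already being expanded (left recursion)
    seen.add(symbol)
    result = set()
    for body in grammar[symbol]:
        result |= first(body[0], grammar, seen)
    return result


def closure1(items, grammar):
    """LR(1) closure of a set of (head, body, dot, lookahead) items."""
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot, lookahead = worklist.pop()
        if dot == len(body) or body[dot] not in grammar:
            continue
        rest = body[dot + 1:]
        # FIRST(beta a) under the no-epsilon assumption.
        lookaheads = first(rest[0], grammar) if rest else {lookahead}
        for production in grammar[body[dot]]:
            for b in lookaheads:
                item = (body[dot], production, 0, b)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


I0 = closure1({("S'", ("S",), 0, "$")}, GRAMMAR)
for item in sorted(I0):
    print(item)
```

The resulting start state has eight items: the L-items appear with both lookaheads = and $, while [R → • L] appears only with $.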

The Goto Operation for LR(1) Items
1. For each item [A → α • X β, a] ∈ I, add [A → α X • β, a] to goto(I, X), if not already there
2. Repeat step 1 until no more items can be added
3. Compute closure() of the resulting set

Constructing the Set of LR(1) Items of a Grammar
1. Augment the grammar with a new start symbol S' and the production S' → S
2. Initially, set C = { closure({ [S' → • S, $] }) } (start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that goto(I, X) ∉ C and goto(I, X) ≠ ∅, add the set of items goto(I, X) to C
4. Repeat 3 until no more sets can be added to C

Example Grammar and LR(1) Items
Unambiguous LR(1) grammar:
S → L = R | R
L → * R | id
R → L
Augment with S' → S.
LR(1) items (next slide).

I_0:  [S' → • S, $]          goto(I_0, S) = I_1
      [S → • L = R, $]       goto(I_0, L) = I_2
      [S → • R, $]           goto(I_0, R) = I_3
      [L → • * R, =/$]       goto(I_0, *) = I_4
      [L → • id, =/$]        goto(I_0, id) = I_5
      [R → • L, $]           goto(I_0, L) = I_2
I_1:  [S' → S •, $]
I_2:  [S → L • = R, $]       goto(I_2, =) = I_6
      [R → L •, $]
I_3:  [S → R •, $]
I_4:  [L → * • R, =/$]       goto(I_4, R) = I_7
      [R → • L, =/$]         goto(I_4, L) = I_8
      [L → • * R, =/$]       goto(I_4, *) = I_4
      [L → • id, =/$]        goto(I_4, id) = I_5
I_5:  [L → id •, =/$]
I_6:  [S → L = • R, $]       goto(I_6, R) = I_9
      [R → • L, $]           goto(I_6, L) = I_10
      [L → • * R, $]         goto(I_6, *) = I_11
      [L → • id, $]          goto(I_6, id) = I_12
I_7:  [L → * R •, =/$]
I_8:  [R → L •, =/$]
I_9:  [S → L = R •, $]
I_10: [R → L •, $]
I_11: [L → * • R, $]         goto(I_11, R) = I_13
      [R → • L, $]           goto(I_11, L) = I_10
      [L → • * R, $]         goto(I_11, *) = I_11
      [L → • id, $]          goto(I_11, id) = I_12
I_12: [L → id •, $]
I_13: [L → * R •, $]
[R → L •, =/$] is shorthand for the two items [R → L •, =] and [R → L •, $].

Constructing Canonical LR(1) Parsing Tables
1. Augment the grammar with S' → S
2. Construct the set C = { I_0, I_1, …, I_n } of LR(1) items
3. If [A → α • a β, b] ∈ I_i and goto(I_i, a) = I_j, then set action[i, a] = shift j
4. If [A → α •, a] ∈ I_i, then set action[i, a] = reduce A → α (apply only if A ≠ S')
5. If [S' → S •, $] is in I_i, then set action[i, $] = accept
6. If goto(I_i, A) = I_j, then set goto[i, A] = j
7. Repeat 3-6 until no more entries are added
8. The initial state i is the I_i holding the item [S' → • S, $]

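Example LR(1) Parsing Table
Grammar:
1. S' → S
2. S → L = R
3. S → R
4. L → * R
5. L → id
6. R → L
state  id    *     =    $     S   L    R
0      s5    s4    -    -     1   2    3
1      -     -     -    acc   -   -    -
2      -     -     s6   r6    -   -    -
3      -     -     -    r3    -   -    -
4      s5    s4    -    -     -   8    7
5      -     -     r5   r5    -   -    -
6      s12   s11   -    -     -   10   9
7      -     -     r4   r4    -   -    -
8      -     -     r6   r6    -   -    -
9      -     -     -    r2    -   -    -
10     -     -     -    r6    -   -    -
11     s12   s11   -    -     -   10   13
12     -     -     -    r5    -   -    -
13     -     -     -    r4    -   -    -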

LALR(1) Grammars
LR(1) parsing tables have many states.
LALR(1) parsing (Look-Ahead LR) merges LR(1) states to reduce the table size.
LALR(1) is less powerful than LR(1).
Merging will not introduce shift-reduce conflicts, because shifts do not use lookaheads.
Merging may introduce reduce-reduce conflicts, but seldom does so for grammars of programming languages.

Constructing LALR(1) Parsing Tables
1. Construct the sets of LR(1) items
2. Combine the LR(1) sets whose items share the same first part (the same LR(0) items, ignoring lookaheads)
Example:
I_4:  [L → * • R, =], [R → • L, =], [L → • * R, =], [L → • id, =]
I_11: [L → * • R, $], [R → • L, $], [L → • * R, $], [L → • id, $]
Combined: [L → * • R, =/$], [R → • L, =/$], [L → • * R, =/$], [L → • id, =/$]
The notation =/$ is shorthand for two items in the same set.
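
The merge step can be sketched by building the canonical LR(1) collection for the grammar S → L = R | R, L → * R | id, R → L and grouping the states by their first parts. The sketch assumes a grammar without ε-productions (which simplifies FIRST), and all encodings and function names are assumptions for this illustration. It only merges the item sets; rebuilding the transitions and the table is left out.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}
SYMBOLS = ["S", "L", "R", "=", "*", "id"]


def first(symbol, grammar, seen=None):
    """FIRST of a single symbol, assuming no epsilon productions."""
    if symbol not in grammar:
        return {symbol}
    if seen is None:
        seen = set()
    if symbol in seen:
        return set()
    seen.add(symbol)
    result = set()
    for body in grammar[symbol]:
        result |= first(body[0], grammar, seen)
    return result


def closure1(items, grammar):
    """LR(1) closure of a set of (head, body, dot, lookahead) items."""
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot, lookahead = worklist.pop()
        if dot == len(body) or body[dot] not in grammar:
            continue
        rest = body[dot + 1:]
        lookaheads = first(rest[0], grammar) if rest else {lookahead}
        for production in grammar[body[dot]]:
            for b in lookaheads:
                item = (body[dot], production, 0, b)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto1(items, symbol, grammar):
    moved = {(h, b, d + 1, a) for h, b, d, a in items
             if d < len(b) and b[d] == symbol}
    return closure1(moved, grammar)


def lr1_collection(grammar, symbols):
    states = [closure1({("S'", ("S",), 0, "$")}, grammar)]
    for state in states:             # the list grows while iterating
        for symbol in symbols:
            target = goto1(state, symbol, grammar)
            if target and target not in states:
                states.append(target)
    return states


def core(state):
    """The first part of a state: its items without lookaheads."""
    return frozenset((h, b, d) for h, b, d, _ in state)


def merge_by_core(states):
    merged = {}
    for state in states:
        merged.setdefault(core(state), set()).update(state)
    return [frozenset(items) for items in merged.values()]


lr1_states = lr1_collection(GRAMMAR, SYMBOLS)
lalr_states = merge_by_core(lr1_states)
print(len(lr1_states), len(lalr_states))
```

For this grammar the 14 canonical LR(1) states collapse to 10 LALR(1) states, because four pairs of states differ only in their lookaheads.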

Example LALR(1) Grammar
Unambiguous LR(1) grammar:
S → L = R | R
L → * R | id
R → L
Augment with S' → S.
LR(1) and LALR(1) items (next slides).

LR(1) items (states I_10 to I_13 are labeled I_A to I_D):
I_0:  [S' → • S, $]          goto(I_0, S) = I_1
      [S → • L = R, $]       goto(I_0, L) = I_2
      [S → • R, $]           goto(I_0, R) = I_3
      [L → • * R, =/$]       goto(I_0, *) = I_4
      [L → • id, =/$]        goto(I_0, id) = I_5
      [R → • L, $]           goto(I_0, L) = I_2
I_1:  [S' → S •, $]
I_2:  [S → L • = R, $]       goto(I_2, =) = I_6
      [R → L •, $]
I_3:  [S → R •, $]
I_4:  [L → * • R, =/$]       goto(I_4, R) = I_7
      [R → • L, =/$]         goto(I_4, L) = I_8
      [L → • * R, =/$]       goto(I_4, *) = I_4
      [L → • id, =/$]        goto(I_4, id) = I_5
I_5:  [L → id •, =/$]
I_6:  [S → L = • R, $]       goto(I_6, R) = I_9
      [R → • L, $]           goto(I_6, L) = I_A
      [L → • * R, $]         goto(I_6, *) = I_B
      [L → • id, $]          goto(I_6, id) = I_C
I_7:  [L → * R •, =/$]
I_8:  [R → L •, =/$]
I_9:  [S → L = R •, $]
I_A:  [R → L •, $]
I_B:  [L → * • R, $]         goto(I_B, R) = I_D
      [R → • L, $]           goto(I_B, L) = I_A
      [L → • * R, $]         goto(I_B, *) = I_B
      [L → • id, $]          goto(I_B, id) = I_C
I_C:  [L → id •, $]
I_D:  [L → * R •, $]

LALR(1) items after merging (I_4 with I_B, I_5 with I_C, I_7 with I_D, I_8 with I_A):
I_0:  [S' → • S, $]          goto(I_0, S) = I_1
      [S → • L = R, $]       goto(I_0, L) = I_2
      [S → • R, $]           goto(I_0, R) = I_3
      [L → • * R, =/$]       goto(I_0, *) = I_4B
      [L → • id, =/$]        goto(I_0, id) = I_5C
      [R → • L, $]           goto(I_0, L) = I_2
I_1:  [S' → S •, $]
I_2:  [S → L • = R, $]       goto(I_2, =) = I_6
      [R → L •, $]
I_3:  [S → R •, $]
I_4B: [L → * • R, =/$]       goto(I_4B, R) = I_7D
      [R → • L, =/$]         goto(I_4B, L) = I_8A
      [L → • * R, =/$]       goto(I_4B, *) = I_4B
      [L → • id, =/$]        goto(I_4B, id) = I_5C
I_5C: [L → id •, =/$]
I_6:  [S → L = • R, $]       goto(I_6, R) = I_9
      [R → • L, $]           goto(I_6, L) = I_8A
      [L → • * R, $]         goto(I_6, *) = I_4B
      [L → • id, $]          goto(I_6, id) = I_5C
I_7D: [L → * R •, =/$]
I_8A: [R → L •, =/$]
I_9:  [S → L = R •, $]

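Example LALR(1) Parsing Table
Grammar:
1. S' → S
2. S → L = R
3. S → R
4. L → * R
5. L → id
6. R → L
state  id   *    =    $     S   L   R
0      s5   s4   -    -     1   2   3
1      -    -    -    acc   -   -   -
2      -    -    s6   r6    -   -   -
3      -    -    -    r3    -   -   -
4      s5   s4   -    -     -   9   7
5      -    -    r5   r5    -   -   -
6      s5   s4   -    -     -   9   8
7      -    -    r4   r4    -   -   -
8      -    -    -    r2    -   -   -
9      -    -    r6   r6    -   -   -
The merged states are renumbered in this table: state 4 is I_4B, state 5 is I_5C, state 7 is I_7D, state 8 is I_9 (reduce S → L = R on $), and state 9 is I_8A (reduce R → L on = and $).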

LL, SLR, LR, LALR Summary
LL parse tables are computed using FIRST/FOLLOW:
nonterminals × terminals → productions
LR parsing tables are computed using closure/goto:
LR states × terminals → shift/reduce actions
LR states × nonterminals → goto state transitions
A grammar is:
LL(1) if its LL(1) parse table has no conflicts;
SLR if its SLR parse table has no conflicts;
LALR(1) if its LALR(1) parse table has no conflicts;
LR(1) if its LR(1) parse table has no conflicts.

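LL, SLR, LR, LALR Grammars
(Figure: diagram of the grammar classes LR(1), LALR(1), LL(1), SLR and LR(0). The LR classes are nested, LR(0) ⊂ SLR ⊂ LALR(1) ⊂ LR(1), and LL(1) lies inside LR(1).)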

Compaction of LR Parsing Tables
For programming languages, the tables have hundreds of states and tens of thousands of entries.
Several table rows might be equal: share rows using pointers.
Tables might be sparse: use an adjacency representation.
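
Row sharing can be sketched on the action table of the expression grammar from the earlier example. Each row is stored sparsely as a dictionary of its non-error entries (one possible adjacency representation), and equal rows are stored once and referenced by index. The function name share_rows and this particular encoding are assumptions for this illustration.

```python
# Sparse action rows for states 0..11 of the expression grammar;
# error entries are simply absent.
ACTION_ROWS = [
    {"id": "s5", "(": "s4"},
    {"+": "s6", "$": "acc"},
    {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    {"id": "s5", "(": "s4"},
    {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    {"id": "s5", "(": "s4"},
    {"id": "s5", "(": "s4"},
    {"+": "s6", ")": "s11"},
    {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
]


def share_rows(rows):
    """Return (distinct rows, per-state pointer into the distinct rows)."""
    distinct = []
    index_of = {}
    pointers = []
    for row in rows:
        key = tuple(sorted(row.items()))   # hashable form of the row
        if key not in index_of:
            index_of[key] = len(distinct)
            distinct.append(dict(row))
        pointers.append(index_of[key])
    return distinct, pointers


distinct, pointers = share_rows(ACTION_ROWS)
print(len(ACTION_ROWS), len(distinct))
print(pointers)


def lookup(state, terminal):
    """Table access through the pointer array; None means error."""
    return distinct[pointers[state]].get(terminal)
```

In this table the rows of states 0, 4, 6 and 7 are identical, so 12 rows shrink to 9 stored rows.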

Dealing with Ambiguous Grammars
Grammar:
1. S → E
2. E → E + E
3. E → id
state  id   +       $     E
0      s2   -       -     1
1      -    s3      acc   -
2      -    r3      r3    -
3      s2   -       -     4
4      -    s3/r2   r2    -
Shift/reduce conflict: action[4, +] = shift 3 and action[4, +] = reduce E → E + E.
Stack              Input
$ 0                id+id+id$
…
$ 0 E 1 + 3 E 4    +id$
When shifting on +: yields right associativity, id+(id+id).
When reducing on +: yields left associativity, (id+id)+id.

Using Associativity and Precedence to Resolve Conflicts
Left-associative operators: reduce.
Right-associative operators: shift.
Operator of higher precedence on stack: reduce.
Operator of lower precedence on stack: shift.
Grammar:
S → E
E → E + E
E → E * E
E → id
Stack              Input        Action
$ 0                id*id+id$
…
$ 0 E 1 * 3 E 5    +id$         reduce E → E * E
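
The four rules can be sketched as a single decision function. The precedence levels, the associativity table, and the right-associative operator ^ are hypothetical values chosen for this illustration; the slides only use + and *.

```python
# Hypothetical operator declarations: a larger number binds tighter.
PRECEDENCE = {"+": 1, "*": 2, "^": 3}
ASSOCIATIVITY = {"+": "left", "*": "left", "^": "right"}


def resolve(stack_operator, lookahead_operator):
    """Resolve a shift/reduce conflict between the operator of the
    handle on the stack and the operator in the lookahead."""
    on_stack = PRECEDENCE[stack_operator]
    incoming = PRECEDENCE[lookahead_operator]
    if on_stack > incoming:
        return "reduce"              # higher precedence on the stack
    if on_stack < incoming:
        return "shift"               # lower precedence on the stack
    # Equal precedence: associativity decides.
    return "reduce" if ASSOCIATIVITY[stack_operator] == "left" else "shift"


print(resolve("*", "+"))  # the conflict in the trace above: E * E with + next
print(resolve("+", "+"))  # left-associative +
```

Parser generators typically apply this decision once per conflicting table cell while the table is built, so the driver itself stays unchanged.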

Error Detection in LR Parsing
A canonical LR parser uses full LR(1) parse tables and will never make a single reduction before recognizing the error when a syntax error occurs on the input.
SLR and LALR parsers may still reduce when a syntax error occurs on the input, but will never shift the erroneous input symbol.

Error Recovery in LR Parsing
Panic mode:
Pop until a state with a goto on a nonterminal A is found (where A represents a major programming construct).
Push A and push the goto state.
Discard input symbols until one is found in the FOLLOW set of A.
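
Panic-mode recovery can be sketched on the goto table of the expression grammar from the earlier example. Several things are assumptions of this sketch: the stack holds only states, the caller chooses the recovery nonterminal, the FOLLOW sets were computed by hand for that grammar, and the token list ends with $.

```python
# Goto table of the expression grammar (E -> E + T | T, T -> T * F | F,
# F -> ( E ) | id) and hand-computed FOLLOW sets.
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}
FOLLOW = {"E": {"+", ")", "$"}, "T": {"+", "*", ")", "$"}}


def panic_mode(stack, tokens, pos, construct):
    """Recover on the nonterminal `construct`.

    Modifies the state stack in place and returns the input position
    at which parsing can resume.
    """
    # Pop until a state with a goto on the construct is found.
    while stack and construct not in GOTO.get(stack[-1], {}):
        stack.pop()
    if not stack:
        raise SyntaxError(f"no state with a goto on {construct}")
    # Push the goto state (the symbol itself is implied by the state).
    stack.append(GOTO[stack[-1]][construct])
    # Discard input until a symbol in FOLLOW(construct) is found;
    # the final $ stops the scan at the latest.
    while pos < len(tokens) - 1 and tokens[pos] not in FOLLOW[construct]:
        pos += 1
    return pos


# Error after "id *": the parser is in state 7 and sees the stray "=".
stack = [0, 2, 7]
tokens = ["id", "*", "=", "id", "+", "id", "$"]
resume_at = panic_mode(stack, tokens, 2, "E")
print(stack, resume_at)
```

In this run the parser pretends that an E has been recognized, skips "= id", and resumes at the "+" in state 1.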

Error Recovery in LR Parsing
Phrase-level recovery: implement error routines for every error entry in the table.
Error productions: pop until a state has an error production, then push the left-hand side; discard input until a symbol is encountered that allows parsing to continue.