Compiler Design

LR Parsing Part 1: Viable Prefixes and Handle in LR Parsing

Well lets start from the basic of Bottom up Parsing

Consider the grammar

  • $S \to XX$
  • $X \to aX \mid b$

Now, consider a string in $L(S)$ say aabb. We can parse it as follows by left most derivation – replacing the left most non-terminal in each step, or right most derivation – replacing the rightmost non-terminal in each step.

$S \to XX$
$\quad \to aX X$
$\quad \to aaXX$
$\quad \to aabX$
$\quad \to aabb$
$S \to XX$
$\quad \to Xb$
$\quad \to aXb$
$\quad \to aaXb$
$\quad \to aabb$

Here, the rightmost-derivation is used in bottom-up parsing. We will soon come back to it.

  • Any string derivable from the start symbol is a sentential form — it becomes a sentence if it contains only terminals
  • A sentential form that occurs in the leftmost derivation of some sentence is called left-sentential form
  • A sentential form that occurs in the rightmost derivation of some sentence is called right-sentential form

Consider the string aabb again. We will see a method to parse this:

  1. Scan the input from left to right and see if any substring matches the RHS of any production. If so, replace that substring by the LHS of the production.

So, for “aabb” we do as follows

  1. ‘a’ “abb” – No RHS matches ‘a’
  2. ‘aa’ “bb” – No RHS matches ‘a’ or ‘aa’
  3. ‘aab’ “b” – RHS of $X \to b$ matches ‘b’ (first b from left) and so we write as
  4. aaXb
  5. ‘aaX’ “b” – RHS of $X \to aX$ matches “aX” and so we write as
  6. aXb – Again RHS of $X \to aX$ matches “aX” and we get
  7. Xb – RHS of $X \to b$ matches “b” and we get
  8. XX – RHS of $S \to XX$ matches XX and we get
  9. S – the start symbol.

Now what we did here is nothing but a bottom-up parsing. Bottom-up because we started from the string and not from the grammar. Here, we applied a sequence of reductions, which are as follows:

  1. $aabb \to aaXb \to aXb \to Xb \to XX \to S$

If we go up and see the Rightmost derivation of the string “aabb”, what we got is the same but in REVERSE order. i.e., our bottom up parsing is doing reverse of the RIGHTMOST derivation. So, we can call it an $LR$ parser – $L$ for scanning the input from Left side and $R$ for doing a Rightmost derivation.

In our parsing we substituted the RHS of a production at each step. This substituted “substring” is called a HANDLE and are shown in BOLD below.

  1. aabb $\to$ aaXb $\to$ aXb $\to$ Xb $\to$ XX $\to$ S

Formally a handle is defined as (Greek letters used to denote a string of terminals and non-terminals)

A handle of a right sentential form ‘$\gamma$’  $(\gamma = \alpha \delta \beta$) is a production $E \to \delta$ and a position in $\gamma$ where $\delta$ can be found and substituted by $E$ to get the previous step in the right most derivation of $\gamma$ — previous and not “next” because we are doing rightmost derivation in REVERSE. Handle can be given as a production or just the RHS of a production.

The handle is not necessarily starting from the left most position as clear from the above example. There is importance to the input string which occurs to the left of the handle. For example for the handles of “aabb”, we can have the following set of prefixes

aabb $\{a, aa, aab\}$
aaXb $\{a, aa, aaX\}$
aXb $\{a, aX\}$
Xb $\{X, Xb\}$
XX $\{X, XX\}$

These set of prefixes are called Viable Prefixes. Formally

Viable prefixes are the prefixes of right sentential forms that do not extend beyond the end of its handle.

i.e., a viable prefix either has no handle or just one possible handle on the extreme RIGHT which can be reduced.

We will see later that viable prefixes can also be defined as the set of prefixes of the right-sentential form that can appear on the stack of a shift-reduce parser. Also, the set of all viable prefixes of the right sentential forms of a grammar is a REGULAR LANGUAGE. i.e., viable prefixes can be recognized by using a FINITE AUTOMATA. Using this FINITE AUTOMATA and a stack we get the power of a Push Down Automata and that is how we can parse context-free languages.

In the next part we will see how to construct the above FINITE AUTOMATA and how our stack works for different LR parsers.

Reference: Bottom Up Parsing

 

Compiler Design Preparation Resources for GATE CSE

Arjun Suresh

View Comments

Share
Published by
Arjun Suresh

Recent Posts

GATE CSE 2022 Admissions, Results and Placement Responses

This page shows all details regarding GATE CSE 2022 Admissions including results, admission offers and…

2 years ago

GATE CSE 2021 Admission Responses

Source Add your Response Rank Predictor for GATE CSE 2021 Result Responses: GATE CSE 2021…

3 years ago

GATE CSE Books – More Options

Best Books for GATE CSE with Relevant Chapters to Read  Heads Up! These GATE CSE…

3 years ago

ISI PCB Previous Year Papers with Solution

Indian Statistical Institute(ISI) offers M Tech in Computer Science with the Admission Test Codes MMA/PCA…

3 years ago

ISI JRF Previous Year Papers with Solution

Indian Statistical Institute(ISI) offers Junior Research Fellowships (JRF) in Computer Science, Statistics, Mathematics, Quantitative Economics,…

3 years ago

ISI DCG Previous Year Papers with Solution

 Indian Statistical Institute(ISI) conduct the admission test for Postgraduate Diploma in Computer Applications(PGDCA) with the…

3 years ago