Bottom-up parsing

From Freepedia


Bottom-up parsing is a strategy for analyzing unknown data relationships that attempts to identify the most fundamental units first, and then to infer higher-order structures from them. It occurs in the anaylsis of both natural languages and computer languages.
See also: top-down parsing

In linguistics, an example of bottom-up parsing would be analyzing a sentence by identifying words first, and then using properties of the words to infer grammatical relations and phrase structures to build a parse tree of the complete sentence.


In programming language compilers, bottom-up parsing is a parsing method that works by identifying terminal symbols first, and combines them successively to produce nonterminals. The productions of the parser can be used to build a parse tree of a program written in human-readable source code that can be compiled to assembly language or pseudocode.

Different computer languages require different parsing techniques, although it is not uncommon to use a parsing technique that is more powerful than that actually required.

It is common for bottom-up parsers to take the form of general parsing engines, that can either parse or generate a parser for a specific programming language given a specification of its grammar. Perhaps the most well known generalized parser generator is YACC.

The common classes of bottom-up parsing are:

  • LR parser
    • LR(0) - No lookahead symbol
    • SLR(1) - Simple with one lookahead symbol
    • LALR(1) - Lookahead bottom up, not as powerful as full LR(1) but simpler to implement. YACC uses this language.
    • LR(1) - Most general language, but most complex to implement.
    • LR(n) - where n is an integer>=0 indicates an LR parser with n lookahead symbols; while languages can be designed that require more than 1 lookahead practical languages try to avoid this because increasing n can threoretically require exponentially more code and data space (the in practice this may not be as bad).
  • Precedence parser

The Parser performs one of two actions (beside accept). These are "Shift" and "Reduce".

  • Shift means moving a symbol from the input to the stack
  • Reduce means matching a set of symbols in the stack for a more general symbol

For example see figure 1

Figure 1.
Take the language:
S --> AB
A --> a
B --> b

And the input:
ab

Then the bottom up parsing is:

   Stack           Input
----------+     +-+-+------
          |     |a|b|
----------+     +-+-+------
          Shift a
--------+-+     +-+--------
        |a|     |b|
--------+-+     +-+--------
     Reduce a (A --> a)
--------+-+     +-+--------
        |A|     |b|
--------+-+     +-+--------
          Shift b
------+-+-+     +----------
      |A|b|     |
------+-+-+     +----------
     Reduce b (B --> b)
------+-+-+     +----------
      |A|B|     |
------+-+-+     +----------
    Reduce AB (S --> AB)
--------+-+     +----------
        |S|     |
--------+-+     +----------
          Accept


Views
Personal tools
Similar Links