Bottom-up parsing
From Freepedia
- See also: top-down parsing
In linguistics, an example of bottom-up parsing would be analyzing a sentence by identifying words first, and then using properties of the words to infer grammatical relations and phrase structures to build a parse tree of the complete sentence.
In programming language compilers, bottom-up parsing is a parsing method that works by identifying terminal symbols first, and combines them successively to produce nonterminals. The productions of the parser can be used to build a parse tree of a program written in human-readable source code that can be compiled to assembly language or pseudocode.
Different computer languages require different parsing techniques, although it is not uncommon to use a parsing technique that is more powerful than that actually required.
It is common for bottom-up parsers to take the form of general parsing engines, that can either parse or generate a parser for a specific programming language given a specification of its grammar. Perhaps the most well known generalized parser generator is YACC.
The common classes of bottom-up parsing are:
- LR parser
- LR(0) - No lookahead symbol
- SLR(1) - Simple with one lookahead symbol
- LALR(1) - Lookahead bottom up, not as powerful as full LR(1) but simpler to implement. YACC uses this language.
- LR(1) - Most general language, but most complex to implement.
- LR(n) - where n is an integer>=0 indicates an LR parser with n lookahead symbols; while languages can be designed that require more than 1 lookahead practical languages try to avoid this because increasing n can threoretically require exponentially more code and data space (the in practice this may not be as bad).
- Precedence parser
The Parser performs one of two actions (beside accept). These are "Shift" and "Reduce".
- Shift means moving a symbol from the input to the stack
- Reduce means matching a set of symbols in the stack for a more general symbol
For example see figure 1
Take the language:
S --> AB
A --> a
B --> b
And the input:
ab
Then the bottom up parsing is:
Stack Input
----------+ +-+-+------
| |a|b|
----------+ +-+-+------
Shift a
--------+-+ +-+--------
|a| |b|
--------+-+ +-+--------
Reduce a (A --> a)
--------+-+ +-+--------
|A| |b|
--------+-+ +-+--------
Shift b
------+-+-+ +----------
|A|b| |
------+-+-+ +----------
Reduce b (B --> b)
------+-+-+ +----------
|A|B| |
------+-+-+ +----------
Reduce AB (S --> AB)
--------+-+ +----------
|S| |
--------+-+ +----------
Accept



