Parser - Lexer combination

Parsing can be separated into two subprocesses: lexical analysis and syntax analysis.

  • Lexical Analysis: The lexer breaks the input into tokens and removes irrelevant characters such as white space and line breaks.

  • Syntax Analysis: The parser constructs the parse tree by applying the language's syntax rules to the tokens generated by the lexer.

Lexer and Parser: The two components work together. The lexer is responsible for tokenizing the input, and the parser is responsible for constructing the parse tree from those tokens.
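To make the lexer's role concrete, here is a minimal sketch of a tokenizer for a tiny arithmetic language. The token kinds (`NUMBER`, `OP`, `SKIP`) and the `tokenize` function are hypothetical names chosen for illustration; a real lexer would recognize a much larger vocabulary.

```python
import re

# Hypothetical token kinds for a tiny arithmetic language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("OP", r"[+\-*/]"),
    ("SKIP", r"\s+"),  # white space and line breaks carry no meaning here
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Yield (kind, value) pairs, dropping white space and line breaks."""
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {text[pos]!r} at position {pos}"
            )
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield match.lastgroup, match.group()


tokens = list(tokenize("2 + 3\n- 1"))
```

Because `tokenize` is a generator, a parser can pull tokens from it one at a time instead of waiting for the whole input to be tokenized first.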


The parsing process is iterative. The parser will usually ask the lexer for a new token and try to match the token with one of the syntax rules. If a rule is matched, a node corresponding to the token will be added to the parse tree and the parser will ask for another token.

If no rule matches, the parser will store the token internally and keep asking for tokens until it finds a rule that matches all the internally stored tokens. If no such rule is found, the parser will raise an exception. This means the document is not valid and contains syntax errors.
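The loop described above can be sketched as follows. This is a simplified illustration, not a production parsing algorithm: it assumes a hypothetical one-rule grammar, `expression := NUMBER (OP NUMBER)*`, and assumes tokens arrive as `(kind, value)` pairs from any iterator standing in for the lexer. The `parse` function and the tuple-based tree shape are likewise made up for this example.

```python
def parse(tokens):
    """Build a parse tree for: expression := NUMBER (OP NUMBER)*"""
    tree = None
    pending = []  # tokens stored internally until a rule matches them

    # Each loop iteration asks the lexer (here, any iterator) for one token.
    for token in tokens:
        pending.append(token)
        kinds = [kind for kind, _ in pending]

        if tree is None and kinds == ["NUMBER"]:
            # Rule matched: a lone term starts the expression.
            tree = ("term", pending[0][1])
            pending = []
        elif tree is not None and kinds == ["OP", "NUMBER"]:
            # Rule matched: expression OP term becomes a new expression node.
            operator, term = pending[0][1], ("term", pending[1][1])
            tree = ("expression", tree, operator, term)
            pending = []
        elif tree is not None and kinds == ["OP"]:
            # No rule matches yet: keep the token and ask for another one.
            continue
        else:
            raise SyntaxError(f"unexpected token {token!r}")

    if tree is None or pending:
        raise SyntaxError("unexpected end of input")
    return tree


example = [("NUMBER", "2"), ("OP", "+"), ("NUMBER", "3"),
           ("OP", "-"), ("NUMBER", "1")]
tree = parse(example)
```

For the input `2 + 3 - 1`, the operator tokens are the ones that get stored: an `OP` on its own matches no rule, so the parser holds it until the following `NUMBER` completes the `expression OP term` pattern. An input such as `2 +` leaves a stored token with no matching rule and raises `SyntaxError`.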