Henrik Lievonen edited this page May 31, 2014 · 2 revisions

This document is incomplete! Help us to expand it.

This document describes the compiler's functionality and gives some guidelines for future development.

Compilation steps

Code compilation is done in ... steps.

1. Tokenizing

lexer.js tokenizes the source code. Tokenization uses regular expressions to detect keywords and more complicated patterns. The lexer returns Token objects, which have the fields type, line and val.
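As a rough illustration of the token shape described above, here is a minimal sketch. The name Token and the fields type, line and val come from the description; the constructor's argument order and the sample values are assumptions, and the actual lexer.js may differ.

```javascript
// Hypothetical sketch of a token object; not the actual lexer.js code.
function Token(type, line, val) {
  this.type = type; // token type name, e.g. 'number' or 'string'
  this.line = line; // source line on which the token was found
  this.val = val;   // token text (for strings, without the quotes)
}

// Example: the string literal "Hello" found on line 3.
var tok = new Token('string', 3, 'Hello');
```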

Token types

The following token types are supported and are checked in this order:

  1. End Of Source (eos) Emitted when the end of the source code is reached.

  2. Comment (comment) A comment begins with a single quote (') and runs to the end of the line.

  3. Number (number) A number optionally starts with a minus sign (-), followed by zero or more decimal digits, an optional dot (.) and then one or more decimal digits.

  4. String (string) A string is the shortest possible text literal that starts and ends with a double quote ("). For string literals, val does not contain the quotes.

  5. Operator An operator is one of the following:

    • < (lt)
    • <= (lte)
    • > (gt)
    • >= (gte)
    • = (eq)
    • <> (neq)
    • and (and)
    • or (or)
    • xor (xor)
    • + (plus)
    • - (minus)
    • * (mul)
    • / (div)
    • mod (mod)
    • & (concat)
  6. Comma (comma) Just a regular comma (,).

  7. Parenthesis Either ( (lparen) or ) (rparen).

  8. To (to) A string literal equal to TO.

...TODO! Complete the list
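
The ordered matching described above could be sketched roughly as follows. This is not the real lexer.js: the names RULES, OPERATORS and tokenize are hypothetical, tokens are plain objects rather than Token instances, and the whitespace skipping, case-insensitive keywords, comment val (which keeps the leading quote) and error handling are all assumptions. Only the token types listed so far are covered.

```javascript
// Hypothetical rule table; the order mirrors the list above.
var OPERATORS = {
  '<': 'lt', '<=': 'lte', '>': 'gt', '>=': 'gte', '=': 'eq', '<>': 'neq',
  'and': 'and', 'or': 'or', 'xor': 'xor', '+': 'plus', '-': 'minus',
  '*': 'mul', '/': 'div', 'mod': 'mod', '&': 'concat'
};

var RULES = [
  { type: 'comment', re: /^'[^\n]*/ },
  { type: 'number', re: /^-?\d*\.?\d+/ },
  { type: 'string', re: /^"(.*?)"/ }, // lazy match: shortest possible string
  { type: 'operator', re: /^(<=|<>|>=|<|>|=|\+|-|\*|\/|&|(?:and|or|xor|mod)\b)/i },
  { type: 'comma', re: /^,/ },
  { type: 'paren', re: /^[()]/ },
  { type: 'to', re: /^to\b/i }
];

function tokenize(source) {
  var tokens = [];
  var pos = 0;
  var line = 1;
  while (true) {
    // Skip whitespace and count newlines (assumed behaviour).
    var ws = /^\s*/.exec(source.slice(pos))[0];
    line += (ws.match(/\n/g) || []).length;
    pos += ws.length;
    // End of source is checked before any other rule.
    if (pos >= source.length) {
      tokens.push({ type: 'eos', line: line, val: '' });
      return tokens;
    }
    var rest = source.slice(pos);
    var matched = false;
    for (var i = 0; i < RULES.length && !matched; i++) {
      var m = RULES[i].re.exec(rest);
      if (!m) continue;
      var type = RULES[i].type;
      var val = m[0];
      if (type === 'string') val = m[1]; // drop the surrounding quotes
      if (type === 'operator') type = OPERATORS[m[0].toLowerCase()];
      if (type === 'paren') type = m[0] === '(' ? 'lparen' : 'rparen';
      tokens.push({ type: type, line: line, val: val });
      pos += m[0].length;
      matched = true;
    }
    if (!matched) throw new Error('Unexpected character on line ' + line);
  }
}

// Example: tokenize('1 + 2.5 & "a,b"') yields the types
// number, plus, number, concat, string, eos.
```

Because numbers are checked before operators, this sketch reads a minus sign directly in front of digits as part of the number; whether the real lexer behaves the same way is not confirmed here.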

2. Parsing

The parser reads tokens from the lexer and builds an Abstract Syntax Tree (AST) from them.
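
To give an idea of how a token stream becomes a tree, here is a minimal recursive-descent sketch for arithmetic expressions only. The function parse, the node shapes (Number, Binary) and the conventional operator precedence are assumptions for illustration; the project's actual parser and AST format may look quite different.

```javascript
// Hypothetical parser over an array of tokens that ends with an eos token.
function parse(tokens) {
  var pos = 0;
  function peek() { return tokens[pos]; }
  function next() { return tokens[pos++]; }

  // primary := number | '(' expr ')'
  function primary() {
    var tok = next();
    if (tok.type === 'number') return { type: 'Number', value: parseFloat(tok.val) };
    if (tok.type === 'lparen') {
      var inner = expr();
      if (next().type !== 'rparen') throw new Error('Expected ) on line ' + tok.line);
      return inner;
    }
    throw new Error('Unexpected token ' + tok.type + ' on line ' + tok.line);
  }

  // Builds a left-associative chain of binary nodes for the given operator types.
  function binary(operand, ops) {
    var left = operand();
    while (ops.indexOf(peek().type) !== -1) {
      var op = next().type;
      left = { type: 'Binary', op: op, left: left, right: operand() };
    }
    return left;
  }

  function term() { return binary(primary, ['mul', 'div']); }
  function expr() { return binary(term, ['plus', 'minus']); }

  var ast = expr();
  if (peek().type !== 'eos') throw new Error('Unexpected token ' + peek().type);
  return ast;
}
```

Given the tokens for 1 + 2 * 3, this sketch produces a plus node whose right child is the mul node, so multiplication binds tighter than addition.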

3. Typechecker

Checks the type of every expression.

4. Atomicchecker

Checks whether each expression is atomic, i.e. whether it doesn't call DRAWSCREEN in any of its branches.
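
The check described above amounts to a recursive walk over an expression's subtree. A minimal sketch follows; the function isAtomic and the node shape { type: 'Call', name: ... } are hypothetical, the case-insensitive name comparison is an assumption, and the sketch only looks for direct DRAWSCREEN calls rather than following calls into user-defined functions.

```javascript
// Hypothetical atomicity check over a generic AST made of plain objects and arrays.
function isAtomic(node) {
  // Primitive values (numbers, strings, null) cannot contain a call.
  if (node === null || typeof node !== 'object') return true;
  if (node.type === 'Call' && String(node.name).toUpperCase() === 'DRAWSCREEN') {
    return false;
  }
  // Recurse into every child node or array of child nodes.
  return Object.keys(node).every(function (key) {
    var child = node[key];
    return Array.isArray(child) ? child.every(isAtomic) : isAtomic(child);
  });
}
```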

5. Compiler

Compiles the AST into asm.js-compatible code.
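
For readers unfamiliar with the target format, the following hand-written snippet shows what asm.js-style code looks like. It is not output of this compiler; the names ExampleModule, add and exampleInstance are hypothetical and only illustrate the "use asm" directive and the type-coercion annotations that asm.js relies on.

```javascript
// Hand-written illustration of an asm.js-style module.
function ExampleModule(stdlib, foreign, heap) {
  'use asm';

  // The unary plus marks parameters and the return value as doubles.
  function add(a, b) {
    a = +a;
    b = +b;
    return +(a + b);
  }

  return { add: add };
}

// The module is also ordinary JavaScript, so it runs even without asm.js support.
var exampleInstance = ExampleModule({}, {}, new ArrayBuffer(0x10000));
var sum = exampleInstance.add(1.5, 2);
```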