Skip to content

Research ~ Compiler Construction

Calum J. Eadie edited this page Jan 8, 2013 · 11 revisions

Overview

  • Lexical analysis: character stream -> token stream
  • Syntax analysis: token stream -> abstract syntax tree
  • Translation / Linearisation: abstract syntax tree -> linear intermediate code
  • Target code generation: intermediate code -> text form target machine code

Simple language syntax

<expr> ::= <number> | <var> | <expr> <binop> <expr> | <monop> <expr>
    | <fnname>(<expr>*) | <expr> ? <expr> : <expr>
<cmd> ::= <var> = <expr>; | if (<expr>) <cmd> else <cmd> | while (<expr>) <cmd> | return <expr>;
    | { <decl>* <cmd>* }
<decl> ::= int <var> = <expr>; | int <fnname>(int <var> ... int <var>) <cmd>
<program> ::= <decl>*

Interpretation

syntax-tree form - this is a natural and simple form to interpret (the next section gives an example for our language). It is noteworthy (linking ideas between courses) to note that op- erational semantics are merely a form of interpreter. Syntax tree interpreters are commonly used for PHP or Python.

Data structures for parse trees

E −→ E + E | E - E | E * E | E / E | ( E ) | n

datatype E = Add of E * E | Sub of E * E | Mult of E * E | Div of E * E | Paren of E | Num of int;

Translation to intermediate code

Abstract syntax tree -> linear sequence of statements or internal representation of a flowchart

Concerns:

  • scope and allocation of variables
  • type of all expressions
  • selection of overloaded expressions
  • generating intermediate code
datatype Expr = Num of int | Var of string | Neg of Expr
| Not of Expr
| Add of Expr * Expr | Sub of Expr * Expr | Mul of Expr * Expr | Div of Expr * Expr | Eq of Expr * Expr | Ne of Expr * Expr (* etc *) | And of Expr * Expr | Or of Expr * Expr (* for &&/|| *) | Apply of string * (Expr list)
| Cond of Expr * Expr * Expr;
datatype Cmd = Assign of string * Expr | If3 of Expr * Cmd * Cmd | While of Expr * Cmd | Block of Decl list * Cmd list | Seq of Cmd * Cmd (* redundant special case of Block *) (* (useful for exercises) *)
| Return of Expr;
datatype Decl = Vardef of string * Expr | Fndef of string * (string list) * Cmd;
type Prog = Decl list; (* shorthand for ’program’ *)

Translate using routines

  • void trexp(Expr e)
  • void trcmd(Cmd c)
  • void trdecl(Decl d)

which have side effect of emitting JVM code

Translating expressions - trexp

fun trexp(Num(k)) = gen2(OP_iconst, k);
...
  | trexp(Mul(x,y)) = ( trexp(x); trexp(y); gen1(OP_imul) )
...

Translation of declarations and commands

fun trcmd(Assign(s,e)) = (trexp(e); trname(OP_istore,s))
  | trcmd(Return e) = (trexp(e); gen1(OP_ireturn))
  | trcmd(Seq(c,c)) = (trcmd(c); trcmd(c)) | trcmd(If3(e,c,c’’)) = ...
Clone this wiki locally