-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathreport.tex
305 lines (249 loc) · 16.1 KB
/
report.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
% !TEX encoding = UTF-8 Unicode
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Short Sectioned Assignment
% LaTeX Template
% Version 1.0 (5/5/12)
%
% This template has been downloaded from:
% http://www.LaTeXTemplates.com
%
% Original author:
% Frits Wenneker (http://www.howtotex.com)
%
% License:
% CC BY-NC-SA 3.0 (http://creativecommons.org/licenses/by-nc-sa/3.0/)
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%----------------------------------------------------------------------------------------
% PACKAGES AND OTHER DOCUMENT CONFIGURATIONS
%----------------------------------------------------------------------------------------
\documentclass[paper=a4, fontsize=11pt]{scrartcl} % A4 paper and 11pt font size
\usepackage{dirtree}
\usepackage{listings}
\usepackage[T1]{fontenc} % Use 8-bit encoding that has 256 glyphs
\usepackage{fourier} % Use the Adobe Utopia font for the document - comment this line to return to the LaTeX default
\usepackage[english, dutch]{babel} % English language/hyphenation
\usepackage{amsmath,amsfonts,amsthm} % Math packages
\usepackage{lipsum} % Used for inserting dummy 'Lorem ipsum' text into the template
\usepackage{graphicx}
\usepackage{sectsty} % Allows customizing section commands
\allsectionsfont{\centering \normalfont\scshape} % Make all sections centered, the default font and small caps
\usepackage{fancyhdr} % Custom headers and footers
\pagestyle{fancyplain} % Makes all pages in the document conform to the custom headers and footers
\fancyhead{} % No page header - if you want one, create it in the same way as the footers below
\fancyfoot[L]{} % Empty left footer
\fancyfoot[C]{} % Empty center footer
\fancyfoot[R]{\thepage} % Page numbering for right footer
\renewcommand{\headrulewidth}{0pt} % Remove header underlines
\renewcommand{\footrulewidth}{0pt} % Remove footer underlines
\setlength{\headheight}{13.6pt} % Customize the height of the header
\usepackage{fancyvrb}
\DefineVerbatimEnvironment{code}{Verbatim}{fontsize=\small}
\DefineVerbatimEnvironment{example}{Verbatim}{fontsize=\small}
\textheight = 720px
\numberwithin{equation}{section} % Number equations within sections (i.e. 1.1, 1.2, 2.1, 2.2 instead of 1, 2, 3, 4)
\numberwithin{figure}{section} % Number figures within sections (i.e. 1.1, 1.2, 2.1, 2.2 instead of 1, 2, 3, 4)
\numberwithin{table}{section} % Number tables within sections (i.e. 1.1, 1.2, 2.1, 2.2 instead of 1, 2, 3, 4)
\setlength\parindent{0pt} % Removes all indentation from paragraphs - comment this line for an assignment with lots of text
%----------------------------------------------------------------------------------------
% TITLE SECTION
%----------------------------------------------------------------------------------------
\newcommand{\horrule}[1]{\rule{\linewidth}{#1}} % Create horizontal rule command with 1 argument of height
\title{
\normalfont \normalsize
\textsc{Universiteit Utrecht, Department of Information and Computing Sciences} \\ [25pt] % Your university, school and/or department name(s)
\horrule{0.5pt} \\[0.4cm] % Thin top horizontal rule
\huge Theory of Programming and Types\\ % The assignment title
\horrule{2pt} \\[0.5cm] % Thick bottom horizontal rule
}
\author{Mathijs Baaijens, 3542068 \\ Nico Naus, 3472353} % Your name
\begin{document}
\maketitle % Print the title
%----------------------------------------------------------------------------------------
\section{abstract}
We describe an extension to the type-correct, stack-safe, provably correct expression compiler described in the paper \textit{"A type-correct, stack-safe, provably correct expression compiler in Epigram"} by McKinna and Wright. Our extension adds non-mutual recursive \ttfamily let \normalfont bindings to this compiler. We describe the following components of our extension:
\begin{itemize}
\item{Evaluation semantics using a primitive-recursive interpreter}
\item{A compiler to tree-structure code}
\item{A simple stack based interpreter for the code}
\item{A compiler correctness proof}
\end{itemize}
In this paper we assume familiarity with the work presented in the paper by McKinna and Wright.
\section{Introduction}
\textit{"A type-correct, stack-safe, provably correct expression compiler in Epigram"}\cite{Compiler} by McKinna and Wright is a practical paper about utilizing dependently typed programming to prove the correctness of a simple expreession compiler. In the paper the authors describe two semantics: a primitive-recursive interpreter, \textbf{eval}, and a compiler \textbf{compile} to tree-structured code for a simple stack-based abstract machine with interpreter \textbf{exec}. The language consists only of boolean and integer values and plus and if-then-else-statements. We will extend the evaluation semantics and correctness proof described in this paper with non-mutual recursive \ttfamily let \normalfont bindings. For our work we will be using Agda\cite{Agda}, a dependantly type programming language similar to the Epigram language used in the paper by McKinna and Wright.
\section{Typed stacks and references}
Both the \textbf{eval} and the \textbf{exec \& compile} evaluation mechanism use typed stacks. Simple type expressions are encoded in the \ttfamily TyExp \normalfont datatype. The value datatype encodes values on the stack and is indexed by the type of the value. \\
\\
\ttfamily
data TyExp : Set where\\
\hspace*{5mm}TyNat : TyExp\\
\hspace*{5mm}TyBool : TyExp\\
data Val : TyExp $\rightarrow$ Set where\\
\hspace*{5mm}nat : $\mathbb{N}$ $\rightarrow$ Val TyNat\\
\hspace*{5mm}bool : Bool $\rightarrow$ Val TyBool\\
\\
\normalfont
The definition of tuples is straightforward and given here for clarity \\
\ttfamily
data \_x\_ (A B : Set) : Set where\\
\hspace*{5mm}<\_,\_> : A $\rightarrow$ B $\rightarrow$ A x B\\
fst : {A B : Set} $\rightarrow$ A x B $\rightarrow$ A\\
fst < x , y > = x\\
snd : {A B : Set} $\rightarrow$ A x B $\rightarrow$ B\\
snd < x , y > = y\\
\normalfont
A context is a list of type expressions paired with a boolean indicating whether the stack value is bound to a \ttfamily let \normalfont statement. This boolean will become important in section 5 when converting between contexts used in the \textbf{eval} semantics and the \textbf{exec \& compile} semantics. \\
\ttfamily
\\
$\Gamma$ = List (Bool x TyExp)\\
\normalfont
A stack is a list of values. Elements can only appended at the top of a stack list. The stack datatype is indexed by a context.\\
\ttfamily
\\
data Stack : $\Gamma$ $\rightarrow$ Set where\\
\hspace*{5mm}empty : Stack []\\
\hspace*{5mm}\_$\rhd$\_ : $\forall$ {b t s} $\rightarrow$ (v:Val t) $\rightarrow$ (xs : Stack s) $\rightarrow$ Stack (<b,t> :: s)\\
\normalfont
The \ttfamily Ref \normalfont datatype defines references to values on a stack. The datatype is indexed by the context of the stack the reference refers to and the type of the value the reference refers to.\\
\ttfamily
\\
data Ref : $\Gamma$ $\rightarrow$ TyExp $\rightarrow$ Set where\\
\hspace*{5mm}Top : $\forall$ {G u} $\rightarrow$ Ref (u :: G) (snd u)\\
\hspace*{5mm}Pop : $\forall$ {G u v} $\rightarrow$ Ref G u $\rightarrow$ Ref (v :: G) u\\
\normalfont
Now we define a function to lookup values in a stack. The reference context is equal to the context of the stack, making it impossible to pass references pointing to invalid locations (or locations storing the wrong type of value). The lookup function therefore always returns a value.\\
\ttfamily
slookup : $\forall$ {S t} $\rightarrow$ Stack S $\rightarrow$ Ref S t $\rightarrow$ Val t\\
slookup (v $\rhd$ xs) Top = v\\
slookup (v $\rhd$ xs) (Pop b$_1$) = slookup xs b$_1$\\
\normalfont
\section{The First Semantics : eval}
We start with defining expressions. The expression datatype is indexed by the following arguments:
\begin{itemize}
\item The type of the result of the expression.
\item A context that defines the layout of the stack the required by \ttfamily var \normalfont expressions to look up values.
\item A boolean value indicating whether the result of this expression will be pushed on the stack as a value bound by a let expression.
\end{itemize}
\ttfamily
data Exp : TyExp $\rightarrow$ $\Gamma$ $\rightarrow$ Bool $\rightarrow$ Set where\\
\hspace*{5mm}var : $\forall$ {ctx t b} $\rightarrow$ Ref ctx t $\rightarrow$ Exp t ctx b\\
\hspace*{5mm}let$_1$ : $\forall$ {ctx t$_1$ t$_2$ b} $\rightarrow$ Exp t$_1$ ctx true $\rightarrow$ Exp t$_2$ (<true,t$_1$> ::ctx) b $\rightarrow$ Exp t$_2$ ctx b\\
\normalfont
The \ttfamily var \normalfont constructor takes a reference to a value on the stack. The \ttfamily let \normalfont constructor takes two expressions. The second expression can refer to the result of the first expression using the \ttfamily var \normalfont constructor. The context of the second expression is extended with a value of the return type of the first expression.\\
Now we can write the evaluation function. The evaluation functions takes a expression with result type t and produces a value of type t. The evalation function also takes as argument a stack of all values bound in preceding \ttfamily let \normalfont statements, we call this the evaluation stack. \\
\ttfamily
\\eval : $\forall$ {t$_1$ ctx b} $\rightarrow$ (e : Exp t$_1$ ctx b) $\rightarrow$ Stack ctx $\rightarrow$ Val t$_1$\\
\normalfont
\\Evaluation a \ttfamily var \normalfont expression is straightforward, we simply look up the reference in the stack and return it.\\
\ttfamily
\\eval (var x) env = slookup env x\\
\normalfont
\\When we evaluate a let-binding, we first evaluate \ttfamily $e_1$\normalfont, and push it into the environment. We use this updated environment to evaluate \ttfamily$e_2$\normalfont.\\
\normalfont
\ttfamily
\\eval (let$_1$ e$_1$ e$_2$) env = eval e$_2$ ((eval e$_1$ env) $\rhd$ env)\\
\normalfont
\section{The Second Semantics : compile \& exec}
The semantics we are about to define describe the actual compiler we want to proof to be correct. The execution semantics operate on a stack we call the execution stack. It is important to note that the execution stack does not contain the same values as the evaluation stack described earlier. The values on the evaluation stack are a subset of the values on the execution stack.
\subsection{Specifying intermediate code}
The intermediate code datatype is indexed by two contexts. The first context describes the execution stack before executing the code. The second context describes the execution stack after executing the code. \\
\\
\ttfamily
data Code : $\Gamma$ $\rightarrow$ $\Gamma$ $\rightarrow$ Set where\\
\hspace*{5mm}LDS : $\forall$ {S t b} $\rightarrow$ (f : Ref S t) $\rightarrow$ Code S (< b , t > :: S)\\
\hspace*{5mm}POP : $\forall$ {b S t$_1$ t$_2$} $\rightarrow$ Code (<b,t$_1$> :: (<true,t$_2$> :: S)) (<b,t$_1$> :: S)\\
\normalfont
\\The \ttfamily LDS \normalfont instruction pushes a value on the stack referred to by a reference into the execution stack. The \ttfamily POP \normalfont instruction pops the value below the top value of the stack. This might sound odd, but this is exactly the behaviour we need to clear the value introduced by a let statement from the stack after execution.
\subsection{Implementing an Interpreter for Intermediate Code}
The execution function is straightforward\\
\\
\ttfamily
exec : {S S' : $\Gamma$} $\rightarrow$ Code S S' $\rightarrow$ Stack S $\rightarrow$ Stack S'\\
exec (LDS f) s = (slookup s f) $\rhd$ s\\
exec POP (v $\rhd$ (v$_1$ $\rhd$ s)) = v $\rhd$ s\\
\normalfont
\subsection{Converting between contexts}
We define a function to convert an execution context to an evaluation context.\\
\ttfamily
\\
trimContext : $\Gamma$ $\rightarrow$ $\Gamma$\\
trimContext [] = []\\
trimContext (< true , x$_1$ > :: s) = < true , x$_1$ > :: trimContext s\\
trimContext (< false , x$_1$ > :: s) = trimContext s\\
\normalfont
Now we can define a function that converts evaluation stack references to execution stack references. The \ttfamily trimContext \normalfont function is used to establish a relation between the context of the stacks. Because an evaluation stack is always a subset of an execution stack the conversion is always possible.\\
\ttfamily
\\
convertRef : $\forall$ {S t} $\rightarrow$ Ref (trimContext S) t $\rightarrow$ Ref S t\\
convertRef {[]} ()\\
convertRef {< true , x$_1$ > :: S} Top = Top\\
convertRef {< true , x$_1$ > :: S} (Pop s) = Pop (convertRef s)\\
convertRef {< false , x$_1$ > :: S} s = Pop (convertRef s)\\
\normalfont
\subsection{Implementing the compiler to intermediate code}
Implementing compilation is straightforward. Just as for \ttfamily convertRef \normalfont the relation between the evaluation stack and the execution stack is expressed using the \ttfamily trimContext \normalfont function. \\
\\
\ttfamily
compile : $\forall$ {b S t} $\rightarrow$ (e : Exp t (trimContext S) b) $\rightarrow$ Code S (<b,t> :: S)\\
compile (var x) = LDS (convertRef x)\\
compile (let$_1$ e e$_1$) = compile e ++$_1$ (compile e$_1$ ++$_1$ POP)\\
\normalfont
\section{Compiler Correctness}
To define the correctness proof we need the equivalent of \ttfamily trimContext \normalfont for stacks. The \ttfamily trimStack \normalfont function removes all values from a stack that are not bound by `let` expressions.\\
\ttfamily
trimStack : $\forall$ {S} $\rightarrow$ Stack S $\rightarrow$ Stack (trimContext S)\\
trimStack {[]} x = empty\\
trimStack {< true , x$_1$ > :: S} (v $\rhd$ x$_2$) = v $\rhd$ (trimStack x$_2$)\\
trimStack {< false , x$_1$ > :: S} (v $\rhd$ x$_2$) = trimStack x$_2$\\
\normalfont
We start by proving the following equivalence.\\
\ttfamily
lemma : $\forall$ {S t} $\rightarrow$ (x : Ref (trimContext S) t) $\rightarrow$ (s : Stack S)\\
\hspace*{30mm}$\rightarrow$ (slookup (trimStack s) x) $\equiv$ (slookup s (convertRef x))\\
lemma {[]} () s\\
lemma {< true , t > :: S} Top (v $\rhd$ s) = refl\\
lemma {< true , x$_1$ > :: S} (Pop e) (v $\rhd$ s) = lemma e s\\
lemma {< false , x$_1$ > :: S} e (v $\rhd$ s) = lemma e s\\
\normalfont
Now we define the type of the correctness proof of our compiler. Once again the \ttfamily trimContext \normalfont method is used to express the relation between the execution stack context and the evaluation stack context. The trimStack function is used to convert the execution stack to an evaluation stack.\\
\ttfamily
\\
correct : $\forall$ {b S t} $\rightarrow$ (e : Exp t (trimContext S) b) $\rightarrow$ (s : Stack S)\\\hspace*{36mm} $\rightarrow$ ((eval e (trimStack s)) $\rhd$ s) $\equiv$ (exec (compile e) s)\\
\normalfont
\\Correctness for var expression is trivial to proof using the lemma defined earlier\\
\ttfamily
\\
correct (var x) s with lemma x s\\
... | p with slookup (trimStack s) x | slookup s (convertRef x) \\
correct (var x) s | refl | .l | l = refl\\
\normalfont
\\Correctness for let expression is proved by induction on e and e$_1$ sequentially. In the inductive step on e$_1$ the resulting value of evaluating e is appended to the stack\\
\ttfamily
\\
correct (let$_1$ e e$_1$) s with correct e s\\
... | p1 with exec (compile e) s | eval e (trimStack s)\\
correct (let$_1$ e e$_1$) s | refl | .(p3 $\rhd$ s) | p3 with correct e$_1$ (\_$\rhd$\_ {true} p3 s)\\
... | p4 with exec (compile e$_1$) (\_$\rhd$\_ {true} p3 s) | eval e$_1$ (p3 $\rhd$ trimStack s)\\
correct (let$_1$ e e$_1$) s | refl | .(p3 $\rhd$ s) | p3 | refl | .(p6 $\rhd$ (p3 $\rhd$ s)) | p6 = refl\\
\normalfont
\section{Conclusion}
We have demonstrated how the dependantly typed programming language Agda can be used to prove the correctness of a compiler supporting \ttfamily let \normalfont statements.
\section{Related work} %required by assignment description
A Certified Type-Preserving Compiler from Lambda Calculus to
Assembly Language \cite{Chlipala}. Here the author presents a certified compiler for a language similar to ours, with a machine-checked correctness proof written in Coq.
\begin{thebibliography}{9}
\bibitem{Compiler}
James McKinna and Joel Wright,
\emph{A type-correct, stack-safe, provably correct expression compiler in Epigram}.
J. Functional Programming.
\bibitem{Agda}
Ulf Norell and James Chapman ,
\emph{Dependently Typed Programming in Agda}.
http://www.cse.chalmers.se/~ulfn/papers/afp08/tutorial.pdf.
\bibitem{Chlipala}
Adam Chlipala,
\emph{A Certified Type-Preserving Compiler from Lambda Calculus to
Assembly Language}.
Proceedings PLDI '07, p54-65, New York, 2007.
\end{thebibliography}
\end{document}