\chapter{Functional programming with Types}
\section{Lambda Calculus}
Understanding lambda calculus is important in both the context of type inference and in the context of (functional) programming.
In this thesis, we will use lambda calculus as described in \emph{Lambda Calculi with Types} \cite{barendregt1992lambda}.
\begin{defn}[Lambda Calculus Grammar]
$$variable ::= v\ |\ variable'$$
$$\lambda\mbox{-}term ::= variable\ |\ (\lambda\mbox{-}term\ \lambda\mbox{-}term)\ |\ (\lambda\ variable\ . \lambda\mbox{-}term )$$
\end{defn}
In the $\lambda\mbox{-}term$ grammar rule, the second case is called \emph{application} and the third is called \emph{abstraction}. These two constructs correspond to passing and declaring an argument, respectively.
\begin{defn}[Bound Variable and Free Variable]
\label{defn:boundFree}
In an abstraction $\lambda x . e$, we call the variable $x$ (its occurrences in $e$) \textbf{bound} by the abstraction. A variable occurrence that is not bound is called \textbf{free}.
\end{defn}
Lambda calculus is a simple language in which we can formally describe any computable problem.
Evaluation of a lambda calculus program consists of repeatedly applying a single rewriting rule, $\beta$ reduction:
\begin{defn}[$\beta$ reduction]
$$(\lambda x . e) e' = e [x := e']$$ \cite{barendregt1992lambda}
\end{defn}
There is one other rewriting rule, \emph{$\alpha$ conversion}, which is used to resolve name collisions, but we will not consider it, as we assume each bound variable is uniquely named.
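To make the rewriting concrete, the following is a minimal Haskell sketch (with our own constructor and function names, not taken from any cited source) of lambda terms and one $\beta$-reduction step at the root of a term; as stated above, we assume each bound variable is uniquely named, so the substitution does not have to handle name capture.
\begin{lstlisting}[language=haskell]
-- a minimal sketch of lambda terms and one beta-reduction step
-- (assumes every bound variable is uniquely named, so no alpha conversion is needed)
data Term
  = Var String      -- variable
  | App Term Term   -- application
  | Lam String Term -- abstraction
  deriving Show

-- e [x := e']: replace the free occurrences of x in e by e'
subst :: String -> Term -> Term -> Term
subst x e' (Var y)
  | x == y    = e'
  | otherwise = Var y
subst x e' (App f a) = App (subst x e' f) (subst x e' a)
subst x e' (Lam y b)
  | x == y    = Lam y b               -- x is bound here, do not substitute
  | otherwise = Lam y (subst x e' b)

-- one beta-reduction step at the root: (\x . e) e'  rewrites to  e [x := e']
betaStep :: Term -> Maybe Term
betaStep (App (Lam x e) e') = Just (subst x e' e)
betaStep _                  = Nothing

-- example: (\x . x x) (\y . y)  reduces to  (\y . y) (\y . y)
example :: Maybe Term
example = betaStep (App (Lam "x" (App (Var "x") (Var "x"))) (Lam "y" (Var "y")))
\end{lstlisting}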
\subsection{Subexpression}
\begin{defn}[Subexpression]
Given a well-formed expression $e$, a subexpression of $e$ is an expression $e'$ such that $e = \lambda x . e'$, $e = e' e''$, $e = e'' e'$, or $e = e'$, and, iteratively, any subexpression of a subexpression of $e$.
\end{defn}
This definition will extend intuitively to all other syntactic systems.
\paragraph{Writing Conventions}
Lambda calculi usually use the writing conventions of applications being left-associative, abstractions being right-associative, and the $\rightarrow$ operator (which will be used in typed lambda calculi to denote functions) being right-associative as well.
\section{Simply-Typed Lambda Calculus}
Simply-typed lambda calculus (also $\lambda\rightarrow$), described by Church, is one of the first lambda calculi with a type system. Typing of lambda terms helps in checking the validity of the program, protecting the user from writing programs that make little sense, for example, adding amperes to volts \cite{barendregt1992lambda}. It also makes it easier to reason about the program in the bigger picture, giving us information about the usage of the typed entities in the code. The typing $f : \alpha \rightarrow \alpha \rightarrow \alpha$, for example, tells us that $f$ is a function taking two arguments of some type and returning a value of the same type. From that alone, we can assume that applying it to two parameters of the same type should be well-defined.
Syntactically, the \emph{\`a la Curry} variant does not differ from the untyped lambda calculus \cite{barendregt1992lambda}.
\subsection{Types and Typing}
We will now formally define what we consider a type. First, we define a \textbf{type variable}, similarly to the variables of the lambda calculus:
\begin{defn}[Type Variable]
By a type variable, we will mean a symbol defined by the following grammar rule:
$$type\_variable ::= tv\ |\ type\_variable'$$
but often, we will use $\alpha, \beta$ as aliases for arbitrary type variables.
\end{defn}
\begin{defn}[Type in $\lambda\rightarrow$]
\label{defn:typeSTLC}
For types $\tau', \tau''$, a type variable $\alpha$, we define a type $\tau$ in $\lambda\rightarrow$ by the following grammar rule:
$$\tau ::= \alpha\ |\ \tau' \rightarrow \tau''$$
where $\tau' \rightarrow \tau''$ is the type of a function taking an argument of the type $\tau'$ and returning a value of the type $\tau''$.
\end{defn}
We can also freely add type constants (like \emph{integers} or \emph{booleans}) to the definition.
\begin{defn}[Typing and Assumption in $\lambda\rightarrow$]
\label{defn:typingSTLC}
Typing is an assignment of a type to an expression written as follows:
$$e : \tau$$
where $e$ is an expression and $\tau$ is a type. If $e$ is a variable $x$ and we infer other types from its typing, we refer to such a typing as an \textbf{assumption} or a \textbf{judgement}.
\end{defn}
Sometimes we will write $\Gamma \vdash e : \tau$, meaning that in the context of $\Gamma$ (a set of assumptions) $e$ can be given the type $\tau$; more on that in definition \ref{defn:STLCInference}.
\subsection{Inference Rules}
Simply-typed lambda calculus uses the following rules; we present them to formally define how to determine whether a certain typing is valid:
\begin{defn}[Inference Rules of $\lambda\rightarrow$]
\label{defn:STLCInference}
For types $\tau, \sigma$, expressions $e, e'$, a variable $x$ and a set of assumptions $\Gamma$, we define the type system $\lambda\rightarrow$ by the following inference rules:
$$\boxed{\renewcommand{\arraystretch}{2.2}\begin{array}{cl}
\infer[\text{(Variable)}]{\Gamma \vdash x : \tau}{x : \tau \in \Gamma} \\
\infer[(\text{Application})]{\Gamma \vdash e e' : \tau}{\Gamma \vdash e : \sigma \rightarrow \tau \quad \Gamma \vdash e' : \sigma} \\
\infer[(\text{Abstraction})]{\Gamma \vdash \lambda x . e : \sigma \rightarrow \tau}{\Gamma, x : \sigma \vdash e : \tau}
\end{array}}$$
\end{defn}
We call $\Gamma$ either a basis, a context, or a set of assumptions. These names describe its meaning well. In the simply-typed lambda calculus, it is a set of typings of variables from which we derive the rest of the typings.
The inference rules describe the step-by-step derivation of types, where the judgement below the line follows from the ones above it.
\subsubsection{Example}
\label{sssec:idExample}
A step-by-step example of inferring the type of $id = \lambda x . x$:
\begin{enumerate}
\item $\{\} \vdash \lambda x . x : ?$
\item $\{\} \vdash \lambda x . x : \sigma \rightarrow \tau$ (abstraction) \label{enum:abs}
\item $\{\}, x : \sigma \vdash x : \tau$ (still abstraction; we want to infer $\tau$)
\item $\{x : \sigma\} \vdash x : \sigma$ (var)
\item $\{\} \vdash \lambda x . x : \sigma \rightarrow \sigma$ (result of abstraction in the step \ref{enum:abs})
\end{enumerate}
Notice that the context changes during the derivation process, that we can give typings to $\lambda\rightarrow$ terms without any context, and that a term can have different typings in different contexts. Also, one term can have multiple possible typings.
% TODO: maybe say something about Church vs curry style
\subsection{Type Substitution, Instantiation and Unification}
\begin{defn}[Type Substitution]
\label{defn:substitution}
Type substitution is a process that rewrites (simultaneously) certain type variables into different types. Substitutions will be very important for various definitions and, even more importantly, for type unification, which will be the basis for type inference algorithms in the following type systems.
For types $\pi_1, \dots \pi_n$ and type variables $\alpha_1, \dots \alpha_n$ such that each $\alpha_i$ is distinct, we define substitution $S$ by the following grammar rule:
$$S ::= [\alpha_1 := \pi_1, \dots \alpha_n := \pi_n]$$
where $\alpha_i := \pi_i$ denotes that $\pi_i$ replaces $\alpha_i$.
\end{defn}
We will write the application of $S$ to a type $\tau$ as $\tau S$ (left-associative); we will also allow a simplified notation $[\overline{\alpha} := \overline{\pi}]$ meaning $[\alpha_1 := \pi_1, \dots \alpha_n := \pi_n]$.
\begin{defn}[Type Instance]
\label{defn:typeInstance}
We will call a type $\tau$ a type instance of a type $\sigma$ if there is a type substitution $S$ such that $\tau = \sigma S$.
\end{defn}
\begin{defn}[Type Unifier]
\label{defn:unifier}
If $\tau_1, \tau_2$ are types and $S$ is a substitution, we say that $S$ unifies $\tau_1$ and $\tau_2$ if $\tau_1 S = \tau_2 S$; in that case we call $S$ their unifier and we call $\tau_1$ and $\tau_2$ unifiable \cite{robinson1965machine} (originally defined for sets of types).
\end{defn}
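For example, the types $\alpha \rightarrow \beta$ and $\gamma \rightarrow (\gamma \rightarrow \gamma)$ are unified by the substitution $S = [\alpha := \gamma, \beta := \gamma \rightarrow \gamma]$, since applying $S$ to either of them yields $\gamma \rightarrow (\gamma \rightarrow \gamma)$. On the other hand, $\alpha$ and $\alpha \rightarrow \alpha$ have no unifier, as any substitution applied to both keeps the right-hand type strictly larger than the left-hand one.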
\subsection{Principal Typing}
Every typeable term $e$ in the simply-typed lambda calculus has a principal type $\tau$ \cite{barendregt1992lambda}, which is computable from $e$, and for every other valid typing $e : \sigma$ in the same context there is a substitution $S$ such that $\sigma = \tau S$.
The intuitive idea of principal typing of $e$ is that it is a typing general enough to be substitutable into all other possible typings of $e$ but not general enough to be substitutable into any impossible typings. This is usually simplified into `the most general type of $e$'.
The example in section \ref{sssec:idExample} is a great fit for showing this. $\lambda x . x : \sigma$ would be a typing general enough to be instantiated into any valid typing, but it is not a principal one because the substitution $[\sigma := \alpha \rightarrow \alpha \rightarrow \alpha]$ would give us $\alpha \sim \alpha \rightarrow \alpha$, which would yield an infinite type.
Principal typings will be very important later when we introduce more advanced type systems as the principal type of $e$ is the only information about $e$ we have to consider when we see $e$ again, given it has no free variables.
\subsection{Type Checking, Typeability, and Type Inference in $\lambda\rightarrow$}
Type checking is the process of validating whether a term can have a given type, whereas typeability is a decision problem of whether the term can be given some type.
Type inference is closely related to typeability but has an additional requirement of giving an actual type for the term, and what we usually are interested in is its principal type.
Both type checking and typeability of the simply-typed lambda calculus are decidable, and its type inference is computable \cite{barendregt1992lambda}.
One problem with the simply-typed lambda calculus is that it is not Turing-complete: it is strongly normalizing \cite{barendregt1992lambda}, and thus, were it Turing-complete, type inference would have to solve the halting problem. This problem will be addressed by the following type systems.
\section{System F and Parametric Polymorphism}
System F, polymorphic lambda calculus, or second-order lambda calculus introduces parametric polymorphism into the lambda calculus.
Here we can reuse the example of $id = \lambda x . x$, which, in the simply-typed lambda calculus, can have many typings ($id : \tau \rightarrow \tau$ for any $\tau$) and which in this type system we can state with one typing:
$$id : \forall \alpha . \alpha \rightarrow \alpha$$
We call such typings (with the $\forall$ quantifier) \textbf{polymorphic} (as opposed to \textbf{monomorphic}).
The definition of types extends the definition \ref{defn:typeSTLC} for the simply-typed lambda calculus.
\subsection{Types, Type Substitution, and Type Instantiation in the System F }
\begin{defn}[Type in the system F] Types in this type system are defined by the following grammar rule extending the original definition \ref{defn:typeSTLC}:
$$\tau ::= \alpha\ |\ \tau \rightarrow \tau'\ |\ \forall \alpha . \tau$$
$\tau, \tau'$ are types and $\alpha$ is a type variable; we can also add a case $\tau ::= \iota$, where $\iota$ is a type constant (for example, \lstinline{int}).
\end{defn}
In the last case ($\forall \alpha . \tau$), we call $\alpha$ (its occurrences in $\tau$) a \textbf{bound} type variable. If the variable is not bound, we call it \textbf{free}.
\begin{defn}[Type Substitution and Type Instantiation in the System F]
\label{ssec:substitutionF}
Type substitution and type instantiation are defined as in the simply-typed lambda calculus (definitions \ref{defn:substitution} and \ref{defn:typeInstance}), with the exception that the rewriting does not apply to bound type variables.
\end{defn}
\subsection{Inference Rules of the System F}
\begin{defn}[Inference rules of the System F]
System F uses the rules of the simply-typed lambda calculus (definition \ref{defn:STLCInference}), plus the following two rules:
$$\boxed{\renewcommand{\arraystretch}{2.2}\begin{array}{c}
\infer[\text{($\forall$-elimination)}]{\Gamma \vdash e : \sigma [\alpha := \tau]}{\Gamma \vdash e : \forall \alpha . \sigma} \\
\infer[\text{($\forall$-introduction)}]{\Gamma \vdash e : \forall \alpha . \sigma}{\Gamma \vdash e : \sigma}
\end{array}}$$
In $(\text{$\forall$-introduction})$, the type variable $\alpha$ is chosen so that it is not free in any assumptions on which the premise $e : \sigma$ depends.
\end{defn}
In the system F, type checking is undecidable, and thus any general type inference algorithm is impossible. The Hindley-Milner type system is a restriction of it that allows for a simple type inference algorithm, on which we will base ours.
\section{Hindley-Milner Type System}
The Hindley-Milner type system (we will refer to it simply as HM) is, at least syntactically, a fairly minimalistic extension of the lambda calculus. Semantically, it can rather be seen as a restriction of the system F that addresses its undecidability; for that, see the subsection \emph{Type Schemes}.
\begin{defn}[Expression in HM]
\label{defn:syntaxHM}
For a variable $x$ and expressions $e', e''$, we define an expression $e$ in HM to have the form given by the following grammar rule:
$$e ::= x\ |\ e' e''\ |\ \lambda x . e'\ |\ \text{let } x = e' \text{ in } e''$$
\end{defn}
We skipped two more cases that are not necessary for the type inference. We will describe them in the definition \ref{defn:ifElseHM}, and later we will define them as regular functions.
It differs from the syntax of the lambda calculus (and that of the system F) only in the addition of the last clause, called the \emph{let statement}. The meaning of $\text{let } x = e' \text{ in } e''$ is the same as that of $e'' [x := e']$, but it allows $x$ (and thus $e'$) to be polymorphic, which is otherwise forbidden for subexpressions in HM. A very similar construct $(\lambda x . e'') e'$ (using abstraction instead of \emph{let}) would result in all occurrences of $x$ in $e''$ having the same monomorphic type in this system. We will later describe definitions for typings and inference rules that will result in this behavior.
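The difference can be illustrated in Haskell, whose core is an extension of HM; the first definition below is accepted because the \emph{let}-bound \lstinline{i} is generalized to a polymorphic type scheme, while the abstraction-based variant (shown commented out) is rejected by plain HM inference because the lambda-bound \lstinline{i} must stay monomorphic.
\begin{lstlisting}[language=haskell]
-- accepted: i is let-bound, so it gets the polymorphic scheme forall a . a -> a
okay :: (Bool, Char)
okay = let i = \x -> x in (i True, i 'a')

-- rejected by HM inference: the lambda-bound i is monomorphic,
-- so it cannot be applied to both a Bool and a Char
-- bad = (\i -> (i True, i 'a')) (\x -> x)
\end{lstlisting}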
There are two additional expressions completing the language that have no significance to the type system, and thus we will not consider them further in any of the typing rules or in the inference algorithm. They are defined as follows:
\begin{defn}[If Then Else and Fix in HM]
\label{defn:ifElseHM}
Here we give the two cases omitted in definition \ref{defn:syntaxHM}:
$$e ::= \cdots\ |\ \text{fix } x . e'\ |\ \text{if } e' \text{ then } e'' \text{ else } e'''$$
where `$\cdots$' denotes that this extends the definition \ref{defn:syntaxHM}.
The latter clause has the expected meaning, and the former is defined as the least fixed point of $\lambda x . e'$ \cite{milner1978theory}. In the context of type inference, the two clauses can be seen as replaced by functions (operators) with the following typings (type schemes) \cite{damas1982principal}:
$$\text{fix}: \forall \alpha . (\alpha \rightarrow \alpha) \rightarrow \alpha$$
$$\text{if}: \forall \alpha . \text{ bool } \rightarrow \alpha \rightarrow \alpha \rightarrow \alpha$$
\end{defn}
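As an illustration, \emph{fix} with the type scheme above suffices to express general recursion. A minimal Haskell sketch (we name our own version \lstinline{fix'} to avoid clashing with \lstinline{Data.Function.fix}):
\begin{lstlisting}[language=haskell]
-- fix : forall a . (a -> a) -> a, written as an ordinary function
fix' :: (a -> a) -> a
fix' f = f (fix' f)

-- factorial defined without explicit recursion, as a least fixed point
factorial :: Integer -> Integer
factorial = fix' (\rec n -> if n == 0 then 1 else n * rec (n - 1))
-- factorial 5 == 120
\end{lstlisting}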
\subsection{Types, Types Schemes, and Typing in HM}
Types of the HM type system are restricted in that they cannot include quantification of type variables introduced in the F system. This change is then counterbalanced by the introduction of type schemes, which replace types in typings.
\begin{defn}[Type in HM]
\label{defn:typeHM}
Types in HM are similar to those in $\lambda\rightarrow$ (definition \ref{defn:typeSTLC}), they are defined by the following grammar rule:
$$\tau ::= \alpha\ |\ \iota\ |\ \tau' \rightarrow \tau''$$
the only difference is the added case $\iota$ for a primitive type (type constant).
\end{defn}
\begin{defn}[Type Scheme in HM]
\label{defn:schemeHM}
Type schemes are defined similarly to types in the system F. They are introduced so that the $\forall$ quantification, if there is any in a typing, always appears at the very front of the type (in the system F, it can appear anywhere inside the type).
For a type variable $\alpha$, a type $\tau$, and a type scheme $\sigma'$, we define a type scheme $\sigma$ by the following grammar rule:
$$\sigma ::= \tau\ |\ \forall \alpha . \sigma'$$
We call the former case a monomorphic type scheme and the latter a polymorphic type scheme; we will abbreviate $\forall \alpha_1 . \forall \alpha_2 \dots \forall \alpha_n . \tau$ as $\forall \alpha_1, \alpha_2 \dots \alpha_n . \tau$.
\end{defn}
\begin{defn}[Typing in HM]
Typing in HM assigns a type scheme to an expression, instead of a type as in definition \ref{defn:typingSTLC}.
\end{defn}
\subsubsection{Common Typing extensions to HM}
\label{sssec:typingExt}
In the HM type system, it is common to introduce the following type operators: cartesian products (tuples): $\times$; disjoint sums (tagged unions): $+$; and lists. The existence of these is not required for the type system, and so we will avoid them going forward. Their reintroduction is trivial as they can be described as regular type constructors.
\subsection{Type Instantiation in HM}
The definition of type instantiation carries over from the system F and the simply-typed lambda calculus (definition \ref{defn:typeInstance}), considering substitution as described in definition \ref{ssec:substitutionF}, but it also applies to \emph{type schemes}.
\begin{defn}[Generic Instance in HM]
A notable change to the notion of instances is the definition of a \textbf{generic instance} of a type scheme $\sigma = \forall \alpha_1, \alpha_2 \cdots \alpha_n . \tau$ which is defined as $\sigma' = \forall \beta_1, \beta_2 \cdots \beta_m . \tau'$ such that $\tau' = \tau S$ for some type substitution $S = [\alpha_1 := \tau_1, \cdots \alpha_n := \tau_n]$, where type variables $\beta_i$ are not free in $\sigma$. We then write $\sigma > \sigma'$. \cite{damas1982principal}
\end{defn}
Note that $\sigma > \sigma'$ implies $\sigma S > \sigma' S$; however, it does not imply $\sigma S > \sigma$.
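For example, for $\sigma = \forall \alpha . \alpha \rightarrow \alpha$ we have $\sigma > int \rightarrow int$ (taking $S = [\alpha := int]$ and no quantified variables $\beta_i$) and also $\sigma > \forall \beta . (\beta \rightarrow \beta) \rightarrow \beta \rightarrow \beta$ (taking $S = [\alpha := \beta \rightarrow \beta]$, with $\beta$ not free in $\sigma$).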
\subsection{Type Inference in HM}
\begin{defn}[Inference rules of HM]
For a variable $x$, expressions $e, e'$, a set of assumptions $\Gamma$ (where $\Gamma_x$ stands for the set of assumptions $\Gamma$ with any assumption about $x$ removed), type schemes $\sigma, \sigma'$, types $\tau, \tau'$, and a type variable $\alpha$, we define the HM type system by the following inference rules:
$$\boxed{\renewcommand{\arraystretch}{2.2}\begin{array}{l c}
\infer[\text{(Variable)}]{\Gamma \vdash x : \sigma}{x : \sigma \in \Gamma} \\
\infer[\text{(Application)}]{\Gamma \vdash e e' : \tau}{\Gamma \vdash e : \tau' \rightarrow \tau \quad \Gamma \vdash e' : \tau'} \\
\infer[\text{(Abstraction)}]{\Gamma \vdash \lambda x . e : \tau' \rightarrow \tau}{\Gamma_x \cup \{x : \tau'\} \vdash e : \tau} \\
\infer[\text{(Instantiation)}]{\Gamma \vdash e : \sigma'}{\Gamma \vdash e : \sigma & \sigma > \sigma'} \\
\infer[\text{(Generalization)}]{\Gamma \vdash e : \forall \alpha . \sigma}{\Gamma \vdash e : \sigma & \alpha \text{ not free in }\Gamma} \\
\infer[\text{(Let Polymorphism)}]{\Gamma \vdash \text{let } x = e \text{ in } e' : \tau}{\Gamma \vdash e : \sigma \quad \Gamma_x \cup \{x :\sigma\} \vdash e' : \tau}
\end{array}}$$
\end{defn}
The \emph{instantiation} rule replaces the $\forall$-elimination rule of the system F. Apart from that, the \emph{let polymorphism} rule is the only real extension of the rules, balancing the type restriction introduced before.
We will introduce the algorithm W (algorithm \ref{w}). It produces the \emph{principal type scheme} for a given program (expression). But before we describe the algorithm, we need to introduce the type unification algorithm, which is used in its definition.
\subsubsection{Most General Unification}
\begin{algorithm}[t]
\caption{Unification Algorithm \cite{robinson1965machine}}
\label{mgu}
\begin{algorithmic}[1]
\Function{$mgu$}{$\tau, \tau'$}
\State $S_0 \gets \epsilon$
\For{$k = 0$ to $\infty$}
\If{$\tau S_k = \tau' S_k$}
\Return $S_k$
\EndIf
\State Let $\tau_k, \tau'_k$ be the leftmost corresponding well-formed sub-expressions in which $\tau$ and $\tau'$ differ \Comment{if the expressions are represented by syntactic trees, $\tau_k$ and $\tau'_k$ have the same relative path from their respective roots; then leftmost means having the lowest inorder rank}
\If{$\tau_k$ is a variable not occurring in $\tau'_k$}
\State $S_{k+1} \gets S_k [\tau_k := \tau'_k]$
\ElsIf{$\tau'_k$ is a variable not occurring in $\tau_k$}
\State $S_{k+1} \gets S_k [\tau'_k := \tau_k]$
\Else
\State Terminate with an \textit{occurrence check} error
\EndIf
\EndFor
\EndFunction
\end{algorithmic}
\end{algorithm}
One of the main parts of the type inference algorithm is type unification and, since we are usually after principal types, most general unification in particular.
\begin{defn}[Most General Unifier]
The most general unifier of types $\tau, \tau'$ is a unifier $S$ such that for every other unifier $U$ of $\tau$ and $\tau'$ there exists a substitution $R$ such that $U = S \circ R$. \cite{damas1982principal}
\end{defn}
We will further require the algorithm to use only variables from $\tau$ and $\tau'$.
We present algorithm \ref{mgu}, which is a variation of Robinson's unification algorithm. We will use a modification of this algorithm in the next type system as well.
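For reference, the following is a minimal Haskell sketch of most general unification over the types of the HM type system (definition \ref{defn:typeHM}); it uses the usual recursive formulation, which computes the same most general unifier as algorithm \ref{mgu}, and all names are our own, not taken from THIH or the CHM implementation.
\begin{lstlisting}[language=haskell]
import qualified Data.Map as Map
import           Data.Map (Map)

-- types of the HM system: type variables, type constants, functions
data Type = TVar String | TCon String | TArr Type Type
  deriving (Eq, Show)

-- a substitution maps type variables to types
type Subst = Map String Type

apply :: Subst -> Type -> Type
apply s t@(TVar a) = Map.findWithDefault t a s
apply _ t@(TCon _) = t
apply s (TArr a b) = TArr (apply s a) (apply s b)

-- composition: apply (compose s2 s1) t == apply s2 (apply s1 t)
compose :: Subst -> Subst -> Subst
compose s2 s1 = Map.map (apply s2) s1 `Map.union` s2

occurs :: String -> Type -> Bool
occurs a (TVar b)   = a == b
occurs _ (TCon _)   = False
occurs a (TArr l r) = occurs a l || occurs a r

-- most general unifier; Nothing signals a unification (or occurrence check) error
mgu :: Type -> Type -> Maybe Subst
mgu (TVar a) t = bind a t
mgu t (TVar a) = bind a t
mgu (TCon a) (TCon b) | a == b = Just Map.empty
mgu (TArr l r) (TArr l' r') = do
  s1 <- mgu l l'
  s2 <- mgu (apply s1 r) (apply s1 r')
  return (compose s2 s1)
mgu _ _ = Nothing

bind :: String -> Type -> Maybe Subst
bind a t
  | t == TVar a = Just Map.empty
  | occurs a t  = Nothing             -- occurrence check
  | otherwise   = Just (Map.singleton a t)
\end{lstlisting}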
\subsubsection{The Type Assignment Algorithm W}
\begin{algorithm}[t]
\caption{The algorithm W \cite{milner1978theory}}
\label{w}
\begin{algorithmic}[1]
\Function{W}{$\Gamma, e$}
\If{$e$ is a variable $\wedge\ e : \forall \overline{\alpha} . \tau' \in \Gamma$}
\State let $\overline{\beta}$ be a list of \emph{new} type variables s.t. $|\overline{\beta}| = |\overline{\alpha}|$
\State $(S, \tau) \gets (\epsilon, \tau' [\overline{\alpha} := \overline{\beta}])$
\ElsIf{$e$ \textbf{matches} $e_1 e_2$}
\State $(S_1, \tau_1) \gets W(\Gamma, e_1)$
\State $(S_2, \tau_2) \gets W(\Gamma S_1, e_2)$
\State let $\beta$ be a \emph{new} type variable
\State $S_3 \gets mgu (\tau_1 S_2, \tau_2 \rightarrow \beta)$
\State $(S, \tau) \gets (S_1 \circ S_2 \circ S_3, \beta S_3)$
\ElsIf{$e$ \textbf{matches} $\lambda x . e_1$}
\State let $\beta$ be a \emph{new} type variable
\State $(S, \tau_1) \gets W(\Gamma_x \cup \{x : \beta\}, e_1)$
\State $\tau \gets (\beta S \rightarrow \tau_1)$
\ElsIf{$e$ \textbf{matches} $\text{let } x = e_1 \text{ in } e_2$}
\State $(S_1, \tau_1) \gets W(\Gamma, e_1)$
\State $(S_2, \tau) \gets W(\Gamma_x S_1 \cup \{x : close_{\Gamma S_1}(\tau_1)\}, e_2)$
\State $S \gets S_1 \circ S_2$
\Else
\State Terminate with an error stating that $e$ is not typeable
\EndIf
\State \Return $(S, \tau)$
\EndFunction
\end{algorithmic}
\end{algorithm}
Before we describe the algorithm itself, we first need to define the closure of a type $\tau$ under assumptions $\Gamma$:
\begin{defn}[$close_\Gamma(\tau)$]
\label{defn:close}
$$close_\Gamma(\tau) := \forall \alpha_1, \dots \alpha_n . \tau$$
where $\{\alpha_1, \dots \alpha_n\} := free(\tau) \setminus free(\Gamma)$ and $free$ stands for the set of free type variables (see the definition \ref{defn:boundFree}).
\end{defn}
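For example, $close_{\{\}}(\alpha \rightarrow \beta) = \forall \alpha, \beta . \alpha \rightarrow \beta$, whereas $close_{\{x : \alpha\}}(\alpha \rightarrow \beta) = \forall \beta . \alpha \rightarrow \beta$, because $\alpha$ occurs free in the assumptions.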
The algorithm \ref{w} is a function $W$ which takes a pair $(\Gamma, e)$ where $\Gamma$ is a set of assumptions and $e$ is an expression in the context of $\Gamma$. Then $W$ returns a pair $(S, \tau)$ such that:
$\Gamma S \vdash e : \tau$ and furthermore $close_{\Gamma S}(\tau)$ is a principal type scheme of $e$ under $\Gamma S$.
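As a brief example, consider $W(\{\}, \text{let } id = \lambda x . x \text{ in } id\ id)$:
\begin{enumerate}
\item $W(\{\}, \lambda x . x) = (\epsilon, \beta_1 \rightarrow \beta_1)$, and $close_{\{\}}(\beta_1 \rightarrow \beta_1) = \forall \beta_1 . \beta_1 \rightarrow \beta_1$,
\item in $\Gamma = \{id : \forall \beta_1 . \beta_1 \rightarrow \beta_1\}$, the two occurrences of $id$ are instantiated with new type variables: $W(\Gamma, id) = (\epsilon, \beta_2 \rightarrow \beta_2)$ and $W(\Gamma, id) = (\epsilon, \beta_3 \rightarrow \beta_3)$,
\item for a new type variable $\beta_4$, $mgu(\beta_2 \rightarrow \beta_2, (\beta_3 \rightarrow \beta_3) \rightarrow \beta_4) = [\beta_2 := \beta_3 \rightarrow \beta_3, \beta_4 := \beta_3 \rightarrow \beta_3]$,
\item the resulting type is $\beta_4 S_3 = \beta_3 \rightarrow \beta_3$, whose closure $\forall \beta_3 . \beta_3 \rightarrow \beta_3$ is the principal type scheme of the whole expression.
\end{enumerate}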
\subsection{Extensions of the HM Type System}
Pure HM allows for just parametric polymorphism. However, without any ad-hoc polymorphism or subtyping, this means there is only one principal type for every symbol in the program and just one definition. This is useful in cases where an algorithm (or a function) works on parameters of arbitrary types. A good example of that could be the $K$ combinator (sometimes called `const function'). We call these algorithms generic.
The problem is that some functions work only on a particular type, for example, basic arithmetic functions. For that reason, we would like to be able to say that it makes sense to add two integers, or two real numbers, but not to add two boolean values.
A trivial extension would be the introduction of tagged unions, see \emph{Typing extensions} in \ref{sssec:typingExt}. If we were to define the $+$ operator as $(+) : Int + Real \rightarrow Int + Real \rightarrow Int + Real$, we would solve the problem with booleans, but a different problem comes to the surface: we would like to say that adding two integers yields an integer, whereas adding two real numbers yields a real number. One more argument could also be that we would like to forbid adding integers to real numbers (maybe we want to require all data conversions to be explicit).
So what we would like is some way of expressing that $(+)$ takes two parameters of a type we can perform addition on and that the function yields us a result of the same type. This can be achieved by \textbf{type classes}.
\subsubsection{Type Classes and their Methods and Instances}
The idea of type classes (parametrized by generic types) is that we put constraints on types of some variables (or functions) we use in our algorithms. Those constraints tell us that the usage of these algorithms is valid only in the context where the constraints are satisfied (this allows us to implement otherwise impossible generic algorithms, for example, sorting, that would be nonsensical for some types).
Also, the functions defined inside a type class (we will call them \textbf{methods}) can have a specialized implementation for each \emph{instance}. This was the main motivation for their creation: the $Eq$ class with a method $==$, which needs to be specialized for each type \cite{hall1994type}.
\textbf{Instances} of type classes (parametrized by a type instance of the type of the corresponding type class) are what allows for ad-hoc polymorphism in a very well controlled manner \cite{wadler1989make}; they are a limited (but powerful) alternative to classic overloading (we will describe overloading shortly). They differ from classic overloading in that all instance \emph{methods} have to have typings that are instances of the principal typing of the generic definition. Furthermore, the types of instances are not allowed to overlap (share a common type instance; this can be checked by the unification algorithm). This "no overlapping" rule will be important for instantiation (generating code with concrete monotypes): overlapping would mean having, for the given parameters, two implementations simultaneously.
Type classes can be added to the algorithm W (algorithm \ref{w}) with only small changes, for example, as in a simplified version of the implementation presented in Typing Haskell in Haskell (THIH) \cite{jones1999typing}.
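A minimal Haskell illustration of these notions (we define a fresh class \lstinline{MyEq} so as not to clash with the standard \lstinline{Eq}):
\begin{lstlisting}[language=haskell]
-- a type class with one method; each instance provides its own implementation
class MyEq a where
  eq :: a -> a -> Bool

-- a specialized instance for a concrete type
instance MyEq Bool where
  eq True  True  = True
  eq False False = True
  eq _     _     = False

-- an instance for lists, itself constrained by an instance for the elements
instance MyEq a => MyEq [a] where
  eq (x : xs) (y : ys) = eq x y && eq xs ys
  eq []       []       = True
  eq _        _        = False
\end{lstlisting}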
\subsubsection{Subtyping and Overloading}
Subtyping and overloading are slightly different approaches to the same problem we discussed with specifying what types the $(+)$ function can act on.
If we want to use either of those, we have to constrain them somehow, or they make type inference undecidable. We will consider the variant defined by professor Geoffrey S. Smith \cite{smith1993polymorphic}.
\textbf{Subtyping}, however, does not solve that problem very well: if we specify that $Int$ is a subtype of $Real$, then we have to accept that $(+)$ applied to two real operands can still return an integer. That does not seem that bad from the mathematical point of view, but it can lead to implicit data conversions, which can be undesirable, especially if we were to use the type system in the context of low-level programming.
\textbf{Overloading}, that is, giving a symbol multiple definitions, is then not only undecidable in the context of the parametric polymorphism of HM, but it also completely contradicts the ideas of parametric polymorphism, where a function's type should describe the function and its effect on the program (in languages like Haskell or ML, the type also describes data dependencies) \cite{palsberg2012overloading}.
\section{Overloading with Type Classes}
Type classes, as stated before, are good for stating that an algorithm makes sense only for some types, or that it can have a different implementation for each type, usually to enhance performance by writing specialized algorithms for common or critical types instead of a slow generic one.
The usefulness of classes can be shown on `Ord'. This constraint requires that values of a type $\alpha$ s.t. $Ord(\alpha)$ can be ordered; we can say `$\alpha$ is (in) Ord'. But not everything can be partially ordered; take directed graphs, for example, where the reachability of one vertex from another is in `Ord' only if the graph is a directed acyclic graph, in which case we can use algorithms expecting `Ord' on problems concerning reachability, but not otherwise.
`Ord' can also show how we could want different instances for different types. Let us say `Ord' has one method $<=$ of type $a \rightarrow a \rightarrow Bool$ with the expected behavior.
For two `Int's, we just compare them and return whether the first is lower than or equal to the other; but if we have, let us say, `Customer's, with distinct Id numbers as keys in some map (or dictionary), then we do not need to compare whole customers, but just their Ids.
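A Haskell sketch of the latter case follows; \lstinline{Customer} is a hypothetical type of our own, and only the Id takes part in the ordering.
\begin{lstlisting}[language=haskell]
import Data.Function (on)

-- a hypothetical record; customers are keyed by their Id
data Customer = Customer { customerId :: Int, customerName :: String }

instance Eq Customer where
  (==) = (==) `on` customerId

-- Ord needs only compare (<= is derived from it); we compare the Ids only
instance Ord Customer where
  compare = compare `on` customerId
\end{lstlisting}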
\section{Haskell Type System}
In this section, we will describe an older version of Haskell, as presented in THIH \cite{jones1999typing}, without any pattern-matching and other features not relevant to the type system. The currently used Haskell type system is significantly more complicated and based on constraint solvers to support various new type system features that are not relevant for the goals of this thesis.
The syntax is that of HM but extended by the possibility of giving explicit typings (with type schemes, see the definition \ref{defn:typeSchemes}) and defining/using type constructors and type classes. See THIH \cite{jones1999typing} for an actual implementation we will use in the CHM compiler (C with Hindley-Milner).
\subsection{Type Constructors}
Haskell generalizes the HM type system and the HM syntax mostly via the notion of type constructors. If they take no type arguments, we call them nullary. Those are the ones we already call \emph{types} in the HM type system. We call a type constructor n-ary when it takes exactly $n$ type arguments to form a (nullary) value type. For example, the type constructor $(\rightarrow)$ is binary, and it constructs the type of a function.
We can formally define type constructors similarly to function applications:
\begin{defn}[Type Constructor in the Haskell Type System]
A type constructor is a pair $TC : k$, where $TC$ is the name of the type constructor, and $k$ is its \emph{kind}. But we usually refer to them only by their name.
We will perform application on them. This application has a specific semantic meaning for each applied type constructor, for example: $\text{list } a$ is the type of lists of values of the type $a$, and $a \rightarrow b$ ($\rightarrow$ is used in the \emph{infix} notation) is the type of a function taking an argument of the type $a$ and returning a value of the type $b$.
\end{defn}
\subsection{Kinds}
The next thing the Haskell type system adds to the HM type system is the notion of kinds. Kinds are used for the classification of type constructors and for checking the validity of their usage in the program.
Kinds are related to types similarly to how types are related to functions and variables, and they follow analogous syntactic rules, but with type constructors instead of values. The simplest kind $*$ (star) represents all types capable of having a value: all legal types from the HM type system would fall into this category.
Then we have kinds of the form $k_1 \rightarrow k_2$, which classify type constructors taking a type of the kind $k_1$ and returning a type of the kind $k_2$. For example, the $(\rightarrow)$ type constructor has the kind $* \rightarrow (* \rightarrow *)$, usually written, considering right associativity, as $* \rightarrow * \rightarrow *$. This means $(\rightarrow)$ takes two types and returns a new type, which reflects the fact that the type of a function is determined by its parameter's type and its return type.
Kinds are less complex than types, as they are defined recursively only from the two rules described in the previous two paragraphs.
For example, we can define a very common type constructor $List: * \rightarrow *$, which takes a type $a$ and constructs a type representing a linked list of values of the type $a$. We can use this type constructor to show the reason behind the notion of kinds, as it would not make much sense for a value to have just the type $List$ without it being applied to any type: "a list of what?"
\begin{defn}[Kinds]
For kinds $k', k''$, a kind $k$ is defined by the following grammar rule:
$$k ::= *\ |\ k' \rightarrow k''$$
\end{defn}
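In Haskell notation, these kinds can be inspected directly. A small sketch (the GHCi answers shown in the comments are what we would expect conceptually; newer GHC versions print a more general kind for \lstinline{(->)}):
\begin{lstlisting}[language=haskell]
-- a list type constructor of kind * -> *
data List a = Nil | Cons a (List a)

-- in GHCi:
--   :kind List      --  List :: * -> *
--   :kind List Int  --  List Int :: *
--   :kind (->)      --  (->) :: * -> * -> *
\end{lstlisting}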
\subsection{Types and Type Schemes in the Haskell Type System}
We will define types formally as:
\begin{defn}[Type in the Haskell Type System]
Types are defined quite similarly to definition \ref{defn:typeHM}. For types $\tau', \tau''$, a type variable $\alpha$, and a type constructor $c$ (with $k$ being its kind), we define a type $\tau$ by the following grammar rule:
$$\tau ::= \alpha \ |\ c : k\ |\ \tau' \tau''$$
Note that we consider the $(\rightarrow)$ operator (used in the previous definition) a regular type constructor of the kind $* \rightarrow * \rightarrow *$.
\end{defn}
Type schemes will be defined similarly to those of the HM type system, differing only in what we consider a type and in that there can be constraints that the types have to satisfy. It should also be noted that only types of the $*$ kind can be assigned to values.
\begin{defn}[Type Scheme in the Haskell Type System]
For a qualified type $qt$ and $n$ distinct type variables $\alpha_1, \dots \alpha_n$, we define a type scheme $\sigma$ by the following grammar rule:
\label{defn:typeSchemes}
$$\sigma ::= \forall \alpha_1, \dots \alpha_n . qt\ |\ qt$$
\end{defn}
\subsubsection{Qualified Types and Predicates}
A qualified type is a type with an added set of predicates that every instance of it has to satisfy.
\begin{defn}[Qualified Type and Predicate]
For distinct type variables $\beta_1, \dots \beta_m$ such that $\{\beta_1, \dots \beta_m\} \subseteq \{\alpha_1, \dots \alpha_n\}$ ($\alpha_1 \dots \alpha_n$ are taken from the definition \ref{defn:typeSchemes}, as qualified types always occur in the context of type schemes), type classes $C_1, \dots C_m$, and a type $\tau$, we define a qualified type $qt$ by the following grammar rule:
$$qt ::= \{\beta_1 \in C_1, \beta_2 \in C_2, \dots \beta_m \in C_m\} \Rightarrow \tau\ |\ \tau$$
We call the pair $\beta_i \in C_i$ a \textbf{predicate (constraint)}; it means that $C_i$ is required to have an instance for $\beta_i$.
\end{defn}
Going forward, we will allow the following forms: for type schemes, $\forall \{\} . qt = qt$, and for qualified types, $\{\} \Rightarrow \tau = \tau$. This simplifies other definitions, as we do not need to consider separate cases (for qualified types, having some predicates or none).
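In Haskell's concrete syntax, a qualified type is written with its predicates to the left of the \lstinline{=>} arrow; a small sketch:
\begin{lstlisting}[language=haskell]
-- type scheme: forall a . {a in Eq} => a -> [a] -> Bool
member :: Eq a => a -> [a] -> Bool
member _ []       = False
member x (y : ys) = x == y || member x ys
\end{lstlisting}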
\paragraph{Type Substitution in the Haskell Type System}
With the introduction of qualified types, we should explicitly state that the type substitution applies to qualified types, just as if the predicates' type parameters were a part of the type itself.
\paragraph{Type Unification in the Haskell Type System}
The addition of type constructors and type classes does not require any change to the definition of type unification. The only notable fact is that unification can succeed only if the unified types share kinds.
\subsection{Type Inference Rules of the Haskell Type System}
The type inference rules for the Haskell type system are almost the same as the type inference rules for the HM type system, but we have to add rules for the inference of the type class constraints:
\begin{defn}[Inference rules of the Haskell Type System]
For types $\tau, \tau'$, type schemes $\sigma, \sigma'$, expressions $e, e'$, a variable $x$, a set of type variables $\bar{\alpha}$, an assumption set $\Gamma$ (where $\Gamma_x$ stands for the set of assumptions without any assumption about $x$), and predicate sets $\mathbf{P}, \mathbf{P'}$, the HM system with predicates is defined by the following set of inference rules:
$$\boxed{\renewcommand{\arraystretch}{2.2}\begin{array}{cl}
\infer[\text{(Variable)}]{\Gamma \vdash x : \sigma}{x : \sigma \in \Gamma} \\
\infer[\text{(Application)}]{\Gamma \vdash e e' : \mathbf{P} \cup \mathbf{P'} \Rightarrow \tau}{\Gamma \vdash e : \mathbf{P} \Rightarrow \tau' \rightarrow \tau & \Gamma \vdash e' : \mathbf{P'} \Rightarrow \tau'} \\
\infer[\text{(Abstraction)}]{\Gamma \vdash \lambda x . e : \mathbf{P} \cup \mathbf{P'} \Rightarrow \tau' \rightarrow \tau}{\Gamma_x \cup \{x : \mathbf{P'} \Rightarrow \tau'\} \vdash e : \mathbf{P} \Rightarrow \tau} \\
\infer[\text{(Instantiation)}]{\Gamma \vdash e : \sigma'}{\Gamma \vdash e : \sigma & \sigma \geq \sigma'}\\
\infer[\text{(Generalization)}]{\Gamma \vdash e : \forall \alpha . \sigma}{\Gamma \vdash e : \sigma & \alpha \text{ not free in }\Gamma}\\
\infer[\text{(Let-polymorphism)}]{\Gamma \vdash \text{let } x = e \text{ in } e' : \mathbf{P} \cup \mathbf{P'} \Rightarrow \tau'}{\renewcommand{\arraystretch}{1.1}\begin{array}{c}\Gamma \vdash e : \forall \bar{\alpha} . \mathbf{P} \Rightarrow \tau \\ \Gamma_x \cup \{x :\forall \bar{\alpha} . \mathbf{P} \Rightarrow \tau \} \vdash e' : \mathbf{P'} \Rightarrow \tau'\end{array}}
\end{array}}$$
\end{defn}
Those rules are the same as those of the HM type system, with the only notable difference being that complex expressions retain the predicates of all of their subexpressions.
\subsection{The Type Inference Algorithm for the Haskell Type System}
The type inference algorithm has to account for the bottom-up propagation of predicates that the inferred types have to satisfy. The basis for the inference algorithm is the algorithm W (algorithm \ref{w}). For the definition, see THIH \cite{jones1999typing}. In THIH, \lstinline[language=haskell]{tiExpr} is the part of type inference that encompasses the extension of the algorithm W. The definition in THIH differs slightly from what we have discussed in this chapter in that it does not consider just the syntax from definition \ref{defn:syntaxHM} with the extensions we presented, but it also considers some features of the Haskell language (for example, pattern matching).
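As a small illustration of this propagation, in the (hypothetical) definition below the predicates arising from the uses of \lstinline{(+)} and \lstinline{(<)} in the body bubble up into the inferred type scheme of the whole definition:
\begin{lstlisting}[language=haskell]
-- no type signature given on purpose; inference collects the predicates
clampedDouble x limit = if x + x < limit then x + x else limit
-- inferred (up to the order of predicates):
--   clampedDouble :: (Num a, Ord a) => a -> a -> a
\end{lstlisting}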