DenotativeLC is a lambda calculus with algebraic data types developed in a unique style which I call “Denotative Relational Literate Programming” with a novel emedding technique I call “Tagless-complete”.
- Requirements: Emacs, Cabal, Haskell
Head to the Main function
heading and do C-c
, use the following GHCi interperter.
fst (v, s v)
- CC-BY-SA
A copy of the license is included in the repository.
- AGPL v3
A copy of the license is included in the repository.
Denotational design[fn:1] is a methodology developed by Conal Elliott, to improve the state of software development, currently most software is developed without any design and a dose of (let’s just do it), this leaves a bunch of problems like technical debt, leaky abstraction and much more, denotational design at it’s essence is to make a mathematical model of your types and operations by simple math that has been discovered for other reasons, for example we can denotate a stack data structure to be a function that takes in a natural number and returns an a
, so Nat -> a
, then we will denotate the functions that use the Stack like empty
for an empty stack.
- I recommend listening to the following podcasts: https://www.typetheoryforall.com/2022/05/09/17-The-Lost-Elegance-of-Computation-(Conal-Elliott).html https://www.typetheoryforall.com/2022/08/04/21-Conal-Eliott-2.html
- Article on denotational design: https://github.com/bolt12/advent-of-haskell-dd
Literate programming[fn:2] is a paradigm where you mix code and prose (concerning the program at hand) at the same location, this has many benefits for example, readability, it is much easier for the reader to understand the underlying code compared to just inline comments or no comments at all, I think this is an extremely relatable factor for a lot us programmers, as Harold Abelson said:
Programs must be written for people to read, and only incidentally for machines to execute.
I am personally interested in exploring combining denotational design and literate programming together, where you specify your design with denotational design and only then, think about the implementation.
Okay, so we have denotational design and we use literate programming to put it at the forefront, we need a structure or better yet an architecture, for the implementation of said program, this is where functional relational programming[fn:3] comes at the foreground, it is an architecture designed to reduce complexity via using functional and relational programming, it turns out that it actually works nicely with literate programming and by extension denotational design.
Literate programming provides the structure needed, denotational design provides the design and functional relational programming provides an efficient implementation architecture.
- Structure: Literate Programming
- Design: Denotational Design
- Implementation: Functional Relational Programming
I went on a quest to find a way to to make a programming language using denotational semantics, after some time, I discovered tagless-final, let’s call it executable denotational semantics.
Tagless-final[fn:4] is a technique of embedding domain specific languages (DSLs) in to the host langauge (Haskell in my case), instead of using data constructors as a representation of DSL terms, you typically use functions inside of a typeclass (or ML modules), this allows excellent extensibility, it is a fine solution to the expression problem, the main attraction for me is that from the author’s words:
Doing a tagless-final embedding is literally writing a denotational semantics for the DSL – in a host programming language rather than on paper.
That quote got me excited and after understanding the technique, I can say it does a really good job at it. There is one huge pitfall though for creating DSLs and that’s that, A tagless-final embedding is a shallow embedding, meaning we have no AST at our disposal, for applications like parsing or optimization.
I have looked into many papers to solve this problem like ~Deep embedding with class~[fn:5] and ~Combining Shallow and Deep EDSLs~[fn:6], the former uses an AST data type for a representation and includes a backdoor via existenial type for extension, the issue comes from the fact that it loses the denotational semantic properties that makes tagless-final so attractive, while the latter starts with a deep embedding with a shallow wrapper on top, which automatically makes the technique non-applicable for my needs, a bad solution for this problem is to lift term level functions to the type level and embed those functions to constructors, while this works, it is a far cry from a realistic and satisfactory programming experience since in Haskell, type level programming is inadequate, with no type level lambdas or first class type level functions (called type families).
Tagless-complete is a combination of tagless-final and tagless-initial, tagless-inital is equivalent to tagless-final except it uses a GADT for the semantic functions (or constructors in this case), we then translate into a final embedding via a translation function, finally we evaluate via tagless-final.
DenotativeLC is a simply typed lambda calculus with algebraic data types, the following defines the inference rules in DenotativeLC.
Variables are de bruijn indice.
Lambda abstraction.
Similar as in the types heading we can also use Haskell’s variable semantics therefore we don’t have to specify variables in our language. Variables are de bruijn indices, we can specify them as:
variable0 :: wrap (h, a) a
And since we use de bruijn we have to specify the inductive case as:
variableSucc :: wrap h a -> wrap (h, any) a
Abstractions correspond to function definitions, which we can think of as a function that takes an indentifier and an expression:
abstraction :: wrap (env, a) b -> wrap env (a -> b)
Application is just function application and can be specified as:
application :: wrap env (a -> b) -> wrap env a -> wrap env b
Pair is the combination of two types, in Haskell it is the (,)
type.
pair :: wrap env a -> wrap env b -> wrap env (a, b)
fst is grabbing the first value of the pair type.
fst :: wrap env (a, b) -> wrap env a
Snd is identical to Fst except it grabs the second value.
snd :: wrap env (a, b) -> wrap env b
Unit is the terminal object of the CCC, we can describe the function as:
unit :: wrap env ()
inl
comes from co-cartesisan, it is the dual of cartesian, in category theory we can easily receieve a dual from just flipping the arrows, this is an example of that, instead of the pair type we get Either
.
inl :: wrap h a -> wrap h (Either a b)
inr
is the right side equivalent to inl
.
inr :: wrap h b -> wrap h (Either a b)
The parser is another common component of a programming language, in DenotativeLC, with parser combinators we can build smaller parsers into one big one therefore simplifying my work so that I only need need to specify one function but first we need to build a parser type that we will use.
The following is a parser type with Void
for errors and String
being the streaming type, Parsec
comes from the Megaparsec library.
type Parser = Parsec Void String
We can easily specify a single function from the combined lower level parsers with the following:
parseDenotativeLC :: Parser (wrap h a)
The evaluator is a function that evaluates terms in the closed term
eval :: wrap () a -> a
The full API:
wrap env a
variable0 :: wrap (env, a) a
variableSucc :: wrap env a -> wrap (env, any) a
application :: wrap env (a -> b) -> wrap env a -> wrap env b
abstraction :: wrap (env, a) b -> wrap env (a -> b)
pair :: wrap env a -> wrap env b -> wrap env (a, b)
fst :: wrap env (a, b) -> wrap env a
snd :: wrap env (a, b) -> wrap env b
unit :: wrap env ()
inl :: wrap h a -> wrap h (Either a b)
inr :: wrap h b -> wrap h (Either a b)
parseDenotativeLC :: Parser (wrap h a)
eval :: wrap () a -> a
We are going to make semantic functions that map the lambda calculus world to a closed cartesian category[fn:7], first we have to define the semantic domain though.
Note: we are building a denotation with the environment being tuples instead of a type-level list since that is much simpler to reason with (it’s also the way I started it before switching).
You can pick any closed cartesian category really but the functor category is simple, so it works out. So let’s model it.
⟦_⟧ :: Lambda (wrap a b) => wrap a b -> (a -> b)
In compiling to categories exr, correspond to the snd
function in a pair, it’s in the cartesian part of the CCC, it’s defined as:
exr (a, b) = b
exl
corresponds to fst and it’s defined as:
exl (a, b) = a
Typing context is a tuple that contains the term and it’s type, it looks like this: ℾ
.
Simple composition. The composition primitive is necessary for a category to be a category so we can use this primitive.
Generally variabels correspond to identity, id
but since we have the typing environment, it infact corresponds to exr
.
⟦variables0⟧ = exr
We also have to inductive case to worry about, which can be defined beautifully as:
⟦variablesSucc e1⟧ = ⟦e1⟧ . exl
Curry is a higher-order function that takes in a function: (a, b) -> c
and curries it to be: a -> b -> c
.
It’s notion in the CCC is the closed part focusing on the expontential type (the function type).
Abstraction in the tagless-final paper is just curry but it’s type arguments a
and b
are flipped, I prefer to use the curry semantics, rather than add new functions, so we must consider that the typing context is unsual where the type identifer is the first and the added argument is the second.
exl extracts the first element of the tuple and we use the second argument of abstraction to apply the function therefore we gat a function a -> b
.
⟦abstraction e1⟧ = curry ⟦e1⟧
The △
operator takes in two terms and constructs a function that is a tuple of those functions, we can specify it as:
f ~△ g = \x -> (f x, g x)
It’s notion is in cartesian part of CCC and it’s the introduction form, cartesian adds products to the category.
apply
is a function that takes in a tuple and apply’s the first term to the
second.
apply
is in the closed part of CCC.
We have what we need to make denotation.
⟦application a b⟧ = apply . ⟦a⟧ △ ⟦b⟧
The △
corresponds perfectly as the introduction form to the pair.
⟦pair e1 e2⟧ = ⟦e1⟧ △ ⟦e2⟧
All the functions concerning products is the cartesian part of the CCC, which has introduction and projections.
Fst is exl.
⟦fst e1⟧ = exl ⟦e1⟧
Snd is exr
⟦snd e1⟧ = exr ⟦e1⟧
The unit
function corresponds to the it
function earlier.
⟦unit e1⟧ = it ⟦e1⟧
inl
and inr
have identical representations therefore their denotations are simple.
⟦inl e1⟧ = inl ⟦e1⟧
⟦inr e1⟧ = inr ⟦e1⟧
Considering, Megaparsec (the library that am using) does not have a denotation in their documentation, I can not in good conscience give a denotation to the parsing function at hand.
The evaluator (for evaluation rather pretty-printing or other purposes) takes in expressions in closed terms meaning only well-typed, we can define the function by applying the argument to unit.
⟦eval e⟧ = ⟦e⟧ ()
This shows the complete denotation, I think it shows the beauty and elegance of denotational design, combined with literate programming.
⟦_⟧ :: Lambda (wrap a b) => wrap a b -> (a -> b)
⟦variables0⟧ = exr
⟦abstraction e1⟧ = curry ⟦e1⟧
⟦variablesSucc e1⟧ = ⟦e1⟧ . exl
⟦application a b⟧ = apply . ⟦a⟧ △ ⟦b⟧
⟦pair e1 e2⟧ = ⟦e1⟧ △ ⟦e2⟧
⟦fst e1⟧ = exl ⟦e1⟧
⟦snd e1⟧ = exr ⟦e1⟧
⟦unit e1⟧ = it ⟦e1⟧
⟦inl e1⟧ = inl ⟦e1⟧
⟦inr e1⟧ = inr ⟦e1⟧
⟦eval e⟧ = ⟦e⟧ ()
Run bash to install MegaParsec.
cabal install --lib megaparsec
Am using advanced GHC extensions to mostly compute at the type level per the requirements of Denotative embedding.
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE PartialTypeSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE PartialTypeSignatures #-}
GHCi requires us to load imports before using them.
:set -package base
:set -package megaparsec
Importing a parser library and type level programming libraries.
import Text.Megaparsec
import Text.Megaparsec.Char
import Data.Void
This options allows literate programming with Haskell to be much better where it allows to make multi-line functions, (org-babel connects to ghci).
:set +m
<<extensions>>
module DenotativeLC where
<<imports>>
<<category-classes>>
<<ast-class>>
<<category-instances>>
<<reader>>
<<ast-instance>>
<<eval>>
<<parser>>
<<translator>>
main :: IO ()
main = do
line <- getLine
case parse parseDenotativeLC "" line of
Right t -> do
let val = cps t (const undefined)
let evaluated = eval $ toFinal val
print evaluated
main
Left err -> print err
The main type that we are going to use is the function type (->)
, it comes built in with Haskell.
In the out of the tar pit paper, the authors suggest only using relations and more generally the relational algebra for the state part of a program, we adhere to the paper by using record types analogously as relations.
As I said before, record types in Haskell can be analogous to relations (tables in SQL), infact, this approach is used in Persistent which is the most popular ORM in Haskell and the native Haskell database Project-M36 (check this project out, it’s really underrated).
The main relation is the R
relation which has one pair, unR
is the attribute’s name and it’s type is the function type.
Using the deriving functionality we can show that Reader
is actually the a closed bicartesian category.
Let’s define it:
newtype Reader h a = MkReader {unReader :: h -> a} deriving (Category, Cartesian, CoCartesian, Closed, Terminal)
The R
relation is actually isomorphic to the function type since they are representially the same.
First things first, we should show that Reader is actually a bi-cartesian closed category.
class Category wrap where
id' :: wrap a a
(<<) :: (wrap b c) -> (wrap a b) -> (wrap a c)
class Cartesian wrap where
triangle :: (wrap a c) -> (wrap a d) -> (wrap a (c, d))
exl :: wrap (a, b) a
exr :: wrap (a, b) b
class CoCartesian wrap where
inl :: wrap h a -> wrap h (Either a b)
inr :: wrap h b -> wrap h (Either a b)
class Closed wrap where
apply :: wrap ((a -> b), a) b
curry' :: (wrap (a, b) c) -> (wrap a (b -> c))
uncurry' :: (wrap a (b -> c)) -> (wrap (a, b) c)
class Terminal wrap where
it' :: a `wrap` ()
We can now finally make a typeclass to represent each term in our language, we are using HOAS.W e can use the homomorphism principle to show the denotation more clearly.
class (Category wrap, Cartesian wrap, CoCartesian wrap, Closed wrap, Terminal wrap) => AST wrap where
variable0 :: wrap (h, a) a
variableSucc :: wrap h a -> wrap (h, any) a
abstraction :: wrap (h, a) b -> wrap h (a -> b)
application :: wrap h (a -> b) -> wrap h a -> wrap h b
pair :: wrap h a -> wrap h b -> wrap h (a, b)
fst' :: wrap h (a, b) -> wrap h a
snd' :: wrap h (a, b) -> wrap h b
unit :: wrap h ()
inl' :: wrap h a -> wrap h (Either a b)
inr' :: wrap h b -> wrap h (Either a b)
-- Necessary comment for the where clause to be closed, ob-haskell should be improved :)
Since (->)
and Reader are isomorphic, we can write instances of (->)
and use the deriving machinery to get it for free for Reader
.
instance Category (->) where
id' = id
(<<) = (.)
instance Cartesian (->) where
triangle f g = \x -> (f x, g x)
exl = fst
exr = snd
instance CoCartesian (->) where
inl e = \h -> Left (e h)
inr e = \h -> Right (e h)
instance Closed (->) where
apply (f, x) = f x
curry' = curry
uncurry' = uncurry
instance Terminal (->) where
it' = \a -> ()
Now that we have sufficiently shown that DenotativeLC corresponds to a closed bi-cartesisan category, we can make a tagless-final typeclass for the AST.
fe1 e1 = \x -> (unReader e1 x)
fe2 e2 = \x -> (unReader e2 x)
instance AST Reader where
variable0 = MkReader snd
variableSucc v = MkReader $ unReader v . fst
abstraction e1 = MkReader $ curry (unReader e1)
application e1 e2 = MkReader $ apply . (triangle (fe1 e1) (fe2 e2))
pair e1 e2 = MkReader $ triangle (fe1 e1) (fe2 e2)
fst' e1 = MkReader $ \h -> fst $ (unReader e1 h)
snd' e1 = MkReader $ \h -> snd $ (unReader e1 h)
unit = MkReader $ it'
inl' f = MkReader $ (\env -> Left (unReader f env))
inr' f = MkReader $ (\env -> Right (unReader f env))
--
Evaluation is simple with just the function:
eval e = unReader e ()
To be able to parse, we need to have a an AST representation which tagless-final does not have, (remember we are using functions instead of constructors). Am going to use a GADT as an initial encoding, after that use Dynamic to fix type differences and only then can we move to functions and tagless-final.
Initial
is the GADT that I was talking about, it is nearly equivalent to the AST typeclass except we have constructors instead of functions.
data Initial i h a where
Eval :: i h a -> Initial i h a
Variable :: Initial i (h, a) a
VariableSucc :: Initial i h a -> Initial i (h, any) a
Abstraction :: Initial i (h, a) b -> Initial i h (a -> b)
Application :: Initial i h (a -> b) -> Initial i h a -> Initial i h b
Pair :: Initial i h a -> Initial i h b -> Initial i h (a, b)
Fst :: Initial i h (a, b) -> Initial i h a
Snd :: Initial i h (a, b) -> Initial i h b
Unit :: Initial i h ()
Inl :: Initial i h a -> Initial i h (Either a b)
Inr :: Initial i h b -> Initial i h (Either a b)
--
We also need an eliminator to get that type back.
cps :: Exists -> (forall i h a. Initial i h a -> r) -> r
cps (Symantic t) k = k t
Simple parser combinations, we are not using the R type but instead we are overloading the operations
<<exist>>
<<cps>>
data Wrap a = A a | B a | C a | D a | F a
type Parser = Parsec Void String
-- parseVar :: Parser Exists
-- parseVar = string "v" *> pure (Symantic Variable)
parseVar :: Parser Exists
parseVar = string "v" *> pure (Symantic Variable)
parseVariableSucc = do
string "s"
i <- parseVar <|> parseVariableSucc
let b = cps i (const Variable)
pure (Symantic (VariableSucc b))
parseAbstraction :: Parser Exists
parseAbstraction = do
char '\\'
space1
v <- parseVar
let b = cps v (const Variable)
char '.'
pure (Symantic (Abstraction b))
parseApplication :: Parser Exists
parseApplication = do
a <- parseAbstraction
b <- parseVar
let (x, y) = (cps a (const undefined), cps b (const undefined))
pure (Symantic (Application x y))
parsePair :: Parser Exists
parsePair = do
char '('
space1
a <- parseDenotativeLC
let b = cps a (const undefined)
char ','
space1
c <- parseDenotativeLC
let d = cps c (const undefined)
char ')'
pure (Symantic (Pair b d))
parseFst :: Parser Exists
parseFst = do
pair <- parsePair
let a = cps pair (const undefined)
pure (Symantic (Fst a))
parseSnd :: Parser Exists
parseSnd = parseFst
parseUnit :: Parser Exists
parseUnit = string "unit" *> pure (Symantic Unit)
parseInl :: Parser Exists
parseInl = do
string "left"
space1
a <- parseDenotativeLC
let b = cps a (const undefined)
pure (Symantic (Inl b))
parseInr :: Parser Exists
parseInr = do
string "right"
space1
a <- parseDenotativeLC
let b = cps a (const undefined)
pure (Symantic (Inl b))
parseDenotativeLC :: Parser Exists
parseDenotativeLC = parseVar <|> parseVariableSucc <|> parseAbstraction <|> parseApplication <|> parsePair <|> parseFst <|> parseSnd <|> parseInl <|> parseInr
Now that we have an AST, we can then use tagless-final, we just have to translate the initial to the final first.
toFinal :: AST wrap => forall h a. Initial wrap h a -> wrap h a
toFinal (Eval i) = i
toFinal (Variable) = variable0
toFinal (VariableSucc a) = variableSucc (toFinal a)
toFinal (Application a b) = application (toFinal a) (toFinal b)
toFinal (Abstraction a) = abstraction (toFinal a)
toFinal (Pair a b) = pair (toFinal a) (toFinal b)
toFinal (Fst a) = fst' (toFinal a)
toFinal (Snd b) = snd' (toFinal b)
toFinal (Unit) = unit
toFinal (Inl a) = inl' (toFinal a)
toFinal (Inr a) = inr' (toFinal a)
[fn:1]: https://www.typetheoryforall.com/2022/08/04/21-Conal-Eliott-2.html [fn:2]: https://en.wikipedia.org/wiki/Literate_programming [fn:3]: https://okmij.org/ftp/tagless-final/course/lecture.pdf [fn:4]: https://curtclifton.net/papers/MoseleyMarks06a.pdf [fn:5]: https://dl.acm.org/doi/10.1007/978-3-031-21314-4_3 [fn:6]: https://emilaxelsson.github.io/documents/svenningsson2013combining.pdf [fn:7]: https://keikun555.github.io/documents/lambda.pdf