DOOML: add compiler doing everything for A (well, we hope to add everything) #50
Draft
georgiy-belyanin wants to merge 50 commits into Kakadu:master
This is the quasi-initial commit of the DOOML language compiler, developed as part of the 2025 SPbU course on functional language compilers. Basically, DOOML is an OCaml-like language with a (drastically) truncated list of features. This patch introduces a basic AST and a parser with pretty-printing facilities. Only integers, units, and tuples are supported so far. The compiler will be implemented by Georgiy Belyanin and Ignatiy Sergeev.
This patch introduces A-normal form (ANF) as a step in the compiler's middle-end. ANF ensures that function calls take only variables or immediate values as their arguments. This makes the code resemble assembly, in the sense that assembler instructions usually work with registers and immediate values, not with compound expressions. Example:
```ocaml
(* Before ANF *)
let f =
  let q = f ((g + sup0) * (2 * i)) in
  q
;;

(* After ANF *)
let f =
  let sup2 = (*) 2 i in
  let sup5 = (+) g sup0 in
  let sup6 = (*) sup5 sup2 in
  let sup7 = (f) sup6 in
  let q = sup7 in
  q
;;
```
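To make the idea concrete, here is a minimal sketch of such a conversion over a toy expression type. It is not the DOOML implementation: the types `expr`, `imm`, and `binding`, the global `counter`, and the left-to-right naming order are all assumptions made for illustration, so the generated names differ from the compiler output shown in the commit message.

```ocaml
(* A toy expression language; constructor names are illustrative only. *)
type expr =
  | Int of int
  | Var of string
  | App of string * expr list (* function or operator applied to arguments *)

(* After ANF, call arguments may only be immediates. *)
type imm =
  | IInt of int
  | IVar of string

(* One binding [let name = f arg1 ... argN]: (name, f, args). *)
type binding = string * string * imm list

(* Hypothetical fresh-name supply backed by a mutable counter. *)
let counter = ref 0

let fresh () =
  let name = Printf.sprintf "sup%d" !counter in
  incr counter;
  name

(* Returns the bindings needed to evaluate [e], in order, together with
   the immediate that holds the value of [e]. *)
let rec anf (e : expr) : binding list * imm =
  match e with
  | Int n -> ([], IInt n)
  | Var x -> ([], IVar x)
  | App (f, args) ->
    (* Normalize the arguments left to right, collecting their bindings. *)
    let bindings, imms =
      List.fold_left
        (fun (acc_bindings, acc_imms) arg ->
          let arg_bindings, arg_imm = anf arg in
          (acc_bindings @ arg_bindings, acc_imms @ [ arg_imm ]))
        ([], [])
        args
    in
    (* Name the call itself so that callers only ever see a variable. *)
    let name = fresh () in
    (bindings @ [ (name, f, imms) ], IVar name)
```

Applied to `f ((g + h) * (2 * i))`, this sketch yields four bindings, one per call, and every argument in them is either a variable or an integer.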
This patch fixes AST pretty-printing by properly using format boxes. The
change is hard to describe in words because the previous usage was
completely wrong, so here are a few examples that show the difference
between the old formatting and the new one.
Old:
```
let a = 15 in let b = 4 in
let c = 8 in ...
```
New:
```
let a = 15 in
let b = 4 in
let c = 8 in
...
```
Old:
```
let smth = if long_cond then long_one else long_two in
```
New:
```
let smth = if long_cond then
long_one
else
long_two
in
```
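As a rough illustration of what "properly using format boxes" can mean for the `let` chain above, here is a small sketch built on the standard `Format` module. The function names `pp_lets` and `render` and the `(name, value)` binding representation are assumptions for this example and are not taken from the DOOML sources.

```ocaml
(* Prints a chain of [let name = value in] bindings, one per line.
   Representing a binding as (name, int value) is a simplification. *)
let pp_lets (fmt : Format.formatter) (bindings : (string * int) list) : unit =
  (* A vertical box turns every break hint inside it into a newline. *)
  Format.pp_open_vbox fmt 0;
  List.iter
    (fun (name, value) ->
      Format.fprintf fmt "let %s = %d in" name value;
      Format.pp_print_cut fmt ())
    bindings;
  Format.pp_print_string fmt "...";
  Format.pp_close_box fmt ()

(* Renders the chain into a string. *)
let render (bindings : (string * int) list) : string =
  Format.asprintf "%a" pp_lets bindings
```

With a vertical box each binding ends up on its own line regardless of the margin, which matches the "New" layout of the first example.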
The previous implementation of the if-then-else ANF conversion was wrong: it actually executed both the then and the else branches and then chose one of the results. This patch fixes it, so now only the selected branch is executed.
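One way to get this behaviour, sketched below under assumed types, is to normalize each branch into its own nested block so that its bindings stay inside the branch instead of being hoisted above the condition. The types `expr`, `imm`, and `aexpr`, the continuation-passing style, and the `t%d` names are illustrative choices, not the actual DOOML code.

```ocaml
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | If of expr * expr * expr

type imm =
  | IInt of int
  | IVar of string

(* ANF blocks: a sequence of bindings that ends in an immediate. *)
type aexpr =
  | Ret of imm
  | LetAdd of string * imm * imm * aexpr (* let x = a + b in rest *)
  | LetIf of string * imm * aexpr * aexpr * aexpr
  (* let x = if c then <block> else <block> in rest *)

(* Hypothetical fresh-name supply. *)
let counter = ref 0

let fresh () =
  let name = Printf.sprintf "t%d" !counter in
  incr counter;
  name

(* [anf e k] normalizes [e] and passes the immediate holding its value to
   the continuation [k], which builds the rest of the block. *)
let rec anf (e : expr) (k : imm -> aexpr) : aexpr =
  match e with
  | Int n -> k (IInt n)
  | Var x -> k (IVar x)
  | Add (a, b) ->
    anf a (fun ia ->
      anf b (fun ib ->
        let x = fresh () in
        LetAdd (x, ia, ib, k (IVar x))))
  | If (cond, then_branch, else_branch) ->
    anf cond (fun ic ->
      let x = fresh () in
      (* Each branch becomes a separate block, so its bindings are only
         evaluated when that branch is actually taken. *)
      let then_block = anf then_branch (fun i -> Ret i) in
      let else_block = anf else_branch (fun i -> Ret i) in
      LetIf (x, ic, then_block, else_block, k (IVar x)))
```

For `if c then a + 1 else b + 2`, the two additions end up inside the respective blocks of `LetIf` rather than before it.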
This patch prevents the parser from treating keywords as identifiers.
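A hand-rolled sketch of the check is shown below. The keyword list, the identifier character classes, and the `parse_ident` signature are assumptions for illustration; the real parser is built differently and its keyword set may not match.

```ocaml
(* Hypothetical keyword list; the real DOOML set may differ. *)
let keywords = [ "let"; "rec"; "in"; "if"; "then"; "else"; "fun" ]

let is_ident_start (c : char) : bool = c >= 'a' && c <= 'z'

let is_ident_char (c : char) : bool =
  is_ident_start c
  || (c >= 'A' && c <= 'Z')
  || (c >= '0' && c <= '9')
  || c = '_'
  || c = '\''

(* Tries to read an identifier starting at [pos]. Returns the identifier
   and the position right after it, or [None] if there is no identifier
   at [pos] or the word read is a keyword. *)
let parse_ident (s : string) (pos : int) : (string * int) option =
  let len = String.length s in
  if pos >= len || not (is_ident_start s.[pos]) then None
  else begin
    (* Consume the longest run of identifier characters first... *)
    let stop = ref (pos + 1) in
    while !stop < len && is_ident_char s.[!stop] do
      incr stop
    done;
    let word = String.sub s pos (!stop - pos) in
    (* ...and only then compare the whole word against the keywords. *)
    if List.mem word keywords then None else Some (word, !stop)
  end
```

Checking the whole word after the longest match matters: `let` is rejected, while `letter` is still accepted as an identifier.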
This patch introduces plugs, tuples, and units into the parser. In other words, the following code can now be parsed properly.
```ocaml
let () = _;;
let _ = _;;
let (a, b) = _;;
```
This patch fixes the if-then-else (ite) condition parsing. The problem was that it was impossible to use ite inside ite conditions (for some unknown reason). The problem could be fixed by removing a few parser-combinator commits. Let's do it even though it would make errors less useful. Actually, the patch improves pretty-printing too.
This patch significantly improves the ANF stage. It does the following.

* It makes ANF accept the whole program as input.
* It improves pretty-printing (similarly to AST in the previous patch).
* It allows tuple patterns to be converted to ANF as follows.

Before ANF.
```
let (a, b) = c;;
```
After ANF.
```
let a = nth c 0;;
let b = nth c 1;;
```
It also makes the ANF stage use a state monad internally.
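The sketch below shows how a state monad can thread a fresh-name counter through the tuple destructuring described above. Everything here is an assumption made for illustration: the `state` type, the `let*` binding operator (OCaml 4.08 or newer), the `pattern` and `binding` types, and the handling of nested tuples, which the commit message does not describe.

```ocaml
(* A minimal state monad whose state is an integer counter. *)
type 'a state = int -> 'a * int

let return (x : 'a) : 'a state = fun s -> (x, s)

let ( let* ) (m : 'a state) (f : 'a -> 'b state) : 'b state =
  fun s ->
    let x, s' = m s in
    f x s'

(* Produces a fresh name and advances the counter. *)
let fresh : string state = fun s -> (Printf.sprintf "sup%d" s, s + 1)

(* Runs a computation starting from counter 0. *)
let run (m : 'a state) : 'a = fst (m 0)

type pattern =
  | PVar of string
  | PTuple of pattern list

(* Stands for [let name = nth tuple index]. *)
type binding = { name : string; tuple : string; index : int }

(* Destructures the components [pats] of the tuple stored in [source]. *)
let rec destructure (pats : pattern list) (source : string) : binding list state =
  let rec go index = function
    | [] -> return []
    | PVar x :: rest ->
      let* tail = go (index + 1) rest in
      return ({ name = x; tuple = source; index } :: tail)
    | PTuple inner :: rest ->
      (* A nested tuple is first bound to a fresh temporary. *)
      let* tmp = fresh in
      let* nested = destructure inner tmp in
      let* tail = go (index + 1) rest in
      return (({ name = tmp; tuple = source; index } :: nested) @ tail)
  in
  go 0 pats
```

Running `destructure [PVar "a"; PVar "b"] "c"` gives the two `nth` bindings from the example, and no mutable counter is needed because the monad carries it.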
This patch introduces basic closure-conversion and lambda-lifting, allowing DOOML to handle closures. They are usually used together, so they are added in a single patch.
```ocaml
let f = fun a c ->
  let g = fun b -> a + b in
  g c
;;

(* CC turns it into... *)
let f = fun a c ->
  let g = (fun a b -> a + b) a in
  g c
;;

(* LL turns it into... *)
let g = (fun a b -> a + b) a;;
let f = fun a c ->
  g c
;;
```
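The closure-conversion half can be sketched as a free-variable computation followed by a rewrite of every `fun`. This is a simplified illustration over an assumed `expr` type in which `+` is a built-in constructor; it does not perform lambda-lifting and is not the DOOML implementation.

```ocaml
module S = Set.Make (String)

(* A toy expression language; constructor names are illustrative only. *)
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | Fun of string list * expr
  | App of expr * expr list
  | Let of string * expr * expr

(* Variables used in [e] that are not bound inside [e]. *)
let rec free_vars (e : expr) : S.t =
  match e with
  | Int _ -> S.empty
  | Var x -> S.singleton x
  | Add (a, b) -> S.union (free_vars a) (free_vars b)
  | Fun (params, body) -> S.diff (free_vars body) (S.of_list params)
  | App (f, args) ->
    List.fold_left (fun acc a -> S.union acc (free_vars a)) (free_vars f) args
  | Let (x, rhs, body) -> S.union (free_vars rhs) (S.remove x (free_vars body))

(* Turns every function with free variables into a closed function that
   takes them as extra leading parameters and is partially applied. *)
let rec closure_convert (e : expr) : expr =
  match e with
  | Int _ | Var _ -> e
  | Add (a, b) -> Add (closure_convert a, closure_convert b)
  | App (f, args) -> App (closure_convert f, List.map closure_convert args)
  | Let (x, rhs, body) -> Let (x, closure_convert rhs, closure_convert body)
  | Fun (params, body) ->
    let body' = closure_convert body in
    let captured = S.elements (free_vars (Fun (params, body'))) in
    let closed = Fun (captured @ params, body') in
    if captured = [] then closed
    else App (closed, List.map (fun x -> Var x) captured)
```

On the `f` from the example, the inner `fun b -> a + b` captures `a`, so it becomes `(fun a b -> a + b) a`, while `f` itself has no free variables and is left unchanged.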
This patch introduces basic DOOML compilation into RISC-V assembly, namely rv64gc. The patch does not yet verify the generated code; however, a few tests will be added soon. Additionally, there is a C runtime. It must be compiled with a cross-toolchain and linked with the generated assembly in order for everything to work properly.
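For a feel of how an ANF binding maps onto RISC-V instructions, here is a small sketch that emits code for `let dst = a + b`. The stack layout (every variable in a slot addressed relative to the frame pointer `s0`), the register choice, and the `env`, `load`, and `emit_add` names are assumptions for this example; the actual code generator may allocate registers and stack slots quite differently.

```ocaml
type imm =
  | IInt of int
  | IVar of string

(* Hypothetical stack layout: variable name -> offset from frame pointer s0. *)
type env = (string * int) list

(* Loads an immediate into [reg]: constants via [li], variables via [ld].
   Raises [Not_found] if a variable has no stack slot in [env]. *)
let load (env : env) (reg : string) (i : imm) : string =
  match i with
  | IInt n -> Printf.sprintf "li %s, %d" reg n
  | IVar x -> Printf.sprintf "ld %s, %d(s0)" reg (List.assoc x env)

(* Emits instructions for [let dst = a + b]. Because ANF guarantees that
   both operands are immediates, no recursion into subexpressions is needed. *)
let emit_add (env : env) (dst : string) (a : imm) (b : imm) : string list =
  [ load env "t0" a
  ; load env "t1" b
  ; "add t0, t0, t1"
  ; Printf.sprintf "sd t0, %d(s0)" (List.assoc dst env)
  ]
```

The flat shape of ANF is what keeps this step simple: each binding turns into a short, fixed instruction sequence.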
added 19 commits
January 25, 2026 23:00
8d3edd1 to
f23b6c7
Let's try to clean up everything...
f23b6c7 to
b138e2c
This PR introduces a complete DOOML compiler implementation. It was made by Georgiy Belyanin (@georgiy-belyanin) and Ignat Sergeev (@IgnatSergeev).
What features were implemented?
* CC/LL.
* ANF.