Build your own Programmatic Incremental Build System
-This is a programming tutorial where you will build your own programmatic incremental build system in Rust.
+This is a programming tutorial where you will build your own programmatic incremental build system, which is a mix between an incremental build system and an incremental computation system. +Programmatic incremental build systems enable programmers to write expressive build scripts and interactive programs in a regular programming language, with the system taking care of correct incrementality once and for all, freeing programmers from having to manually implement complicated and error-prone incrementality every time.
The primary goal of this tutorial is to provide understanding of programmatic incremental build systems through implementation and experimentation.
-Although the tutorial uses Rust, you don’t need to be a Rust expert to follow it. -A secondary goal of this tutorial is to teach more about Rust through implementation and experimentation, given that you already have programming experience (in another language) and are willing to learn. +
In this programming tutorial you will write Rust code, but you don’t need to be a Rust expert to follow it. +A secondary goal of this tutorial is to teach more about Rust through implementation and experimentation, given that you already have some programming experience (in another language) and are willing to learn. Therefore, all Rust code is available, and I try to explain and link to the relevant Rust book chapters as much as possible.
This is of course not a full tutorial or book on Rust. For that, I can recommend the excellent The Rust Programming Language book. However, if you like to learn through examples and experimentation, or already know Rust basics and want to practice, this might be a fun programming tutorial for you!
-We will first motivate programmatic incremental build systems.
+We will first motivate programmatic incremental build systems in more detail.
Motivation
A programmatic incremental build system is a mix between an incremental build system and an incremental computation system, with the following key properties:
-
-
- Programmatic: Build scripts are regular programs written in a programming language, where parts of the build script implement an API from the build system. This enables build authors to write incremental builds with the full expressiveness of the programming language. +
- Programmatic: Build scripts are regular programs written in a programming language, where parts of the program implement an API from the build system. This enables programmers to write incremental build scripts and interactive programs with the full expressiveness of the programming language.
- Incremental: Builds are truly incremental – only the parts of a build that are affected by changes are executed.
- Correct: Builds are fully correct – all parts of the build that are affected by changes are executed. Builds are free of glitches: only up-to-date (consistent) data is observed. -
- Automatic: The build system takes care of incrementality and correctness. Build authors do not have to manually implement incrementality. Instead, they only have to explicitly declare dependencies. -
- Multipurpose: The same build script can be used for incremental batch builds in a terminal, but also for live feedback in an interactive environment such as an IDE. For example, a compiler implemented in this build system can provide incremental batch compilation but also incremental editor services such as syntax highlighting or code completion. +
- Automatic: The system takes care of incrementality and correctness. Programmers do not have to manually implement incrementality. Instead, they only have to explicitly declare dependencies.
Teaser Toy Example
-As a small teaser, here is a simplified version of a programmatic incremental toy build script that copies a text file by reading and writing:
-struct ReadFile {
- file: PathBuf
-}
-impl Task for ReadFile {
- fn execute<C: Context>(&self, context: &mut C) -> Result<String, io::Error> {
- context.require_file(&self.file)?;
- fs::read_to_string(&self.file)
- }
+To show the benefits of a build system with these key properties, here is a simplified version of the programmatic incremental build script for compiling a formal grammar and parsing text with that compiled grammar, which is the build script you will implement in the final project chapter.
+This simplified version removes details that are not important for understanding programmatic incremental build systems at this moment.
+
+
+
+Don’t worry if you do not (fully) understand this code; the tutorial will guide you through programming and understanding this kind of code.
+This example is primarily here to motivate programmatic incremental build systems, as it is hard to do so without it.
+
+
+pub enum ParseTasks {
+ CompileGrammar { grammar_file_path: PathBuf },
+ Parse { compile_grammar_task: Box<ParseTasks>, program_file_path: PathBuf, rule_name: String }
}
-struct WriteFile<T> {
- task: T,
- file: PathBuf
+pub enum Outputs {
+ CompiledGrammar(CompiledGrammar),
+ Parsed(String)
}
-impl<T: Task> Task for WriteFile<T> {
- fn execute<C: Context>(&self, context: &mut C) -> Result<(), io::Error> {
- let string: String = context.require_task(&self.task)?;
- fs::write(&self.file, string.as_bytes())?;
- context.provide_file(&self.file)
+
+impl Task for ParseTasks {
+ fn execute<C: Context>(&self, context: &mut C) -> Result<Outputs, Error> {
+ match self {
+ ParseTasks::CompileGrammar { grammar_file_path } => {
+ let grammar_text = context.require_file(grammar_file_path)?;
+ let compiled_grammar = CompiledGrammar::new(&grammar_text, Some(grammar_file_path))?;
+ Ok(Outputs::CompiledGrammar(compiled_grammar))
+ }
+ ParseTasks::Parse { compile_grammar_task, program_file_path, rule_name } => {
+ let compiled_grammar = context.require_task(compile_grammar_task)?;
+ let program_text = context.require_file_to_string(program_file_path)?;
+ let output = compiled_grammar.parse(&program_text, rule_name, Some(program_file_path))?;
+ Ok(Outputs::Parsed(output))
+ }
+ }
}
}
fn main() {
- let read_task = ReadFile {
- file: PathBuf::from("in.txt")
+ let compile_grammar_task = Box::new(ParseTasks::CompileGrammar {
+ grammar_file_path: PathBuf::from("grammar.pest")
+ });
+ let parse_1_task = ParseTasks::Parse {
+ compile_grammar_task: compile_grammar_task.clone(),
+ program_file_path: PathBuf::from("test_1.txt"),
+ rule_name: String::from("main")
};
- let write_task = WriteFile {
- task: read_task,
- file: PathBuf::from("out.txt")
+ let parse_2_task = ParseTasks::Parse {
+ compile_grammar_task: compile_grammar_task.clone(),
+ program_file_path: PathBuf::from("test_2.txt"),
+ rule_name: String::from("main")
};
- Pie::default().new_session().require(&write_task);
+
+ let mut context = IncrementalBuildContext::default();
+ let output_1 = context.require_task(&parse_1_task).unwrap();
+ println!("{output_1:?}");
+ let output_2 = context.require_task(&parse_2_task).unwrap();
+ println!("{output_2:?}");
}
-The unit of computation in a programmatic incremental build system is a task.
-A task is kind of like a closure, a function along with its inputs that can be executed, but incremental.
-For example, the ReadFile
task carries the file path it reads from.
-When we execute
the task, it reads from the file and returns its text as a string.
-However, due to incrementality, we mark the file as a require_file
dependency through context
, such that this task is only re-executed when the file changes!
-Note that this file read dependency is created while the task is executing.
-We call these dynamic dependencies.
-This is one of the main benefits of programmatic incremental build systems: you create dependencies while the build is executing, instead of having to declare them upfront!
-Dynamic dependencies are also created between tasks.
-For example, WriteFile
carries a task as input, which it requires with context.require_task
to retrieve the text for writing to a file.
-We’ll cover how this works later on in the tutorial.
-For now, let’s zoom back out to the motivation of programmatic incremental build systems.
-Back to Motivation
+This is in essence just a normal (pure) Rust program: it has enums, a trait implementation for one of those enums, and a main
function.
+However, this program is also a build script because ParseTasks
implements the Task
trait, which is the core trait defining the unit of computation in a programmatic incremental build system.
+Tasks
+A task is kind of like a closure, a function along with its inputs that can be executed, but incremental.
+For example, ParseTasks::CompileGrammar
carries grammar_file_path
which is the file path of the grammar that it will compile.
+When we execute
a ParseTasks::CompileGrammar
task, it reads the text of the grammar from the file, compiles that text into a grammar, and returns a compiled grammar.
+Incremental File Dependencies
+However, we want this task to be incremental, such that this task is only re-executed when the grammar_file_path
file changes.
+Therefore, execute
has a context
parameter which is an incremental build context that tasks use to tell the build system about dependencies.
+For example, ParseTasks::CompileGrammar
tells the build system that it requires the file with context.require_file(grammar_file_path)
, marking the file as a read dependency.
+It is then the responsibility of the incremental build system to only execute this task if the file has changed.
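To make the idea of a file dependency a bit more concrete, here is a minimal sketch of how a build system could record a file read dependency and later check whether it is still consistent. This is not the implementation built in this tutorial: the names `FileDependency`, `content_stamp`, and `is_consistent` are assumptions made up for illustration, and comparing a hash of the file contents is just one possible way to detect changes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stamp of a file: a hash of its contents, or `None` if the
/// file does not exist.
fn content_stamp(path: &Path) -> io::Result<Option<u64>> {
    match fs::read(path) {
        Ok(bytes) => {
            let mut hasher = DefaultHasher::new();
            bytes.hash(&mut hasher);
            Ok(Some(hasher.finish()))
        }
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Hypothetical file read dependency, recorded when a task requires a file.
pub struct FileDependency {
    path: PathBuf,
    stamp: Option<u64>,
}

impl FileDependency {
    /// Records the dependency together with the stamp observed right now.
    pub fn new(path: impl Into<PathBuf>) -> io::Result<Self> {
        let path = path.into();
        let stamp = content_stamp(&path)?;
        Ok(Self { path, stamp })
    }

    /// Consistent means: the file still has the stamp that was recorded.
    /// A task only needs to be re-executed when this returns `false`.
    pub fn is_consistent(&self) -> io::Result<bool> {
        Ok(content_stamp(&self.path)? == self.stamp)
    }
}
```

In this sketch, the build system would create a `FileDependency` when a task requires a file, and call `is_consistent` before deciding whether to re-execute that task.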
+Dynamic Dependencies
+Note that this file dependency is created while the task is executing.
+We call these dynamic dependencies, as opposed to static dependencies that are hardcoded into the build script.
+Dynamic dependencies enable the programmatic part of programmatic incremental build systems, because dependencies are made while your program is running, and can thus depend on values computed earlier in your program.
+Another benefit of dynamic dependencies is that they enable exact dependencies: the dependencies of a task exactly describe when the task should be re-executed, increasing incrementality.
+With static dependencies, you often have to over-approximate dependencies, leading to reduced incrementality.
+Incremental Task Dependencies
+Dynamic dependencies are also created between tasks.
+For example, ParseTasks::Parse
carries compile_grammar_task
which is an instance of the ParseTasks::CompileGrammar
task to compile a grammar.
+When we execute
a ParseTasks::Parse
task, it tells the build system that it depends on the compile grammar task with context.require_task(compile_grammar_task)
, but also asks the build system to return the most up-to-date (consistent) output of that task.
+It is then the responsibility of the incremental build system to check whether the task is consistent, and to re-execute it only if it is inconsistent.
+If compile_grammar_task
was never executed before, the build system executes it, caches the compiled grammar, and returns the compiled grammar.
+Otherwise, to check if the compile grammar task is consistent, we need to check the file dependency to grammar_file_path
that ParseTasks::CompileGrammar
created earlier.
+If the contents of the grammar_file_path
file have changed, the task is inconsistent and the build system re-executes it, caches the new compiled grammar, and returns it.
+Otherwise, the build system simply returns the cached compiled grammar.
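The decision procedure described above (execute if never executed, re-execute if inconsistent, otherwise return the cached output) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the context implemented later in the tutorial: `TaskCache`, `Cached`, and the `current_stamp` parameter are hypothetical, and a single number stands in for the state of all of a task's dependencies.

```rust
use std::collections::HashMap;

/// Hypothetical cache entry: the output of a task, plus the stamp of its
/// dependencies at the time the task was executed.
struct Cached<O> {
    output: O,
    stamp: u64,
}

/// Hypothetical store of cached task outputs, keyed by a task name.
pub struct TaskCache<O> {
    cache: HashMap<String, Cached<O>>,
    /// Number of times a task was actually executed, for demonstration.
    pub executions: usize,
}

impl<O: Clone> TaskCache<O> {
    pub fn new() -> Self {
        Self { cache: HashMap::new(), executions: 0 }
    }

    /// Returns an up-to-date output for the task named `key`.
    /// `current_stamp` stands in for the current state of its dependencies.
    pub fn require(&mut self, key: &str, current_stamp: u64, execute: impl FnOnce() -> O) -> O {
        if let Some(cached) = self.cache.get(key) {
            if cached.stamp == current_stamp {
                // Consistent: return the cached output without executing.
                return cached.output.clone();
            }
        }
        // Never executed before, or inconsistent: execute and cache.
        let output = execute();
        self.executions += 1;
        self.cache.insert(key.to_string(), Cached { output: output.clone(), stamp: current_stamp });
        output
    }
}
```

Requiring the same key twice with an unchanged stamp executes the closure only once. Requiring it with a different stamp executes it again and replaces the cached output.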
+The main
function creates instances of these tasks, creates an IncrementalBuildContext
, and asks the build system to return the up-to-date outputs for two tasks with context.require_task
.
+This is the essence of programmatic incremental build systems.
+In this tutorial, we will define the Task
trait and implement the IncrementalBuildContext
.
+However, before we start doing that, I want to first zoom back out and discuss the benefits of programmatic incremental build systems.
+Benefits
I prefer writing builds in a programming language like this, over having to encode a build into a YAML file with underspecified semantics, and over having to learn and use a new build scripting language with limited tooling.
By programming builds, I can reuse my knowledge of the programming language, I get help from the compiler and IDE that I’d normally get while programming, I can modularize and reuse parts of my build as a library, and can use other programming language features such as unit testing, integration testing, benchmarking, etc.
Programmatic builds do not exclude declarativity, however.
@@ -257,7 +302,7 @@
Back to
A task is re-executed when one or more of its dependencies become inconsistent.
For example, the WriteFile
task from the example is re-executed when the task dependency returns different text, or when the file it writes to is modified or deleted.
This is both incremental and correct.
-Disadvantages
+Disadvantages
Of course, programmatic incremental build systems also have some disadvantages.
These disadvantages become clearer during the tutorial, but I want to list them here to be up-front about them:
@@ -269,7 +314,7 @@ We have developed PIE, a Rust library implementing a programmatic incremental build system adhering to the key properties listed above.
It is still under development, and has not been published to crates.io yet, but it is already usable.
If you are interested in experimenting with a programmatic incremental build system, do check it out!
-
In this tutorial we will implement a subset of PIE, the Rust library.
+
In this tutorial we will implement a subset of PIE.
We simplify the internals in order to minimize distractions as much as possible, but still go over all the key ideas and concepts that make programmatic incremental build systems tick.
However, the idea of programmatic incremental build systems is not limited to PIE or the Rust language.
You can implement a programmatic incremental build system in any general-purpose programming language, or adapt the idea to better fit your preferences and/or requirements.
diff --git a/1_programmability/1_api/index.html b/1_programmability/1_api/index.html
index 8c8406d..700b7a1 100644
--- a/1_programmability/1_api/index.html
+++ b/1_programmability/1_api/index.html
@@ -85,7 +85,7 @@
diff --git a/1_programmability/2_non_incremental/index.html b/1_programmability/2_non_incremental/index.html
index 6db3895..335fa78 100644
--- a/1_programmability/2_non_incremental/index.html
+++ b/1_programmability/2_non_incremental/index.html
@@ -85,7 +85,7 @@
@@ -219,10 +219,10 @@
Context module<
├── pie
│ ├── Cargo.toml
│ └── src
-│ ├── lib.rs
-│ └── context
-│ ├── mod.rs
-│ └── non_incremental.rs
+│ ├── context
+│ │ ├── non_incremental.rs
+│ │ └── mod.rs
+│ └── lib.rs
└── Cargo.toml
Confirm your module structure is correct by building with cargo build
.
Simple Test
Run the test by running cargo test
.
The output should look something like:
Compiling pie v0.1.0 (/pie)
- Finished test [unoptimized + debuginfo] target(s) in 0.37s
- Running unittests src/lib.rs (target/debug/deps/pie-7f6c7927ea39bed5)
+ Finished test [unoptimized + debuginfo] target(s) in 0.23s
+ Running unittests src/lib.rs (target/debug/deps/pie-4dd489880a9416ea)
running 1 test
test context::non_incremental::test::test_require_task_direct ... ok
diff --git a/1_programmability/index.html b/1_programmability/index.html
index 110a840..21cee74 100644
--- a/1_programmability/index.html
+++ b/1_programmability/index.html
@@ -85,7 +85,7 @@
diff --git a/2_incrementality/1_require_file/index.html b/2_incrementality/1_require_file/index.html
index 711b12a..93e6691 100644
--- a/2_incrementality/1_require_file/index.html
+++ b/2_incrementality/1_require_file/index.html
@@ -85,7 +85,7 @@
@@ -396,19 +396,19 @@