Commit b040dca

add compiler options to specific dumping of compilation steps
1 parent 3c78403 commit b040dca

34 files changed, with 351 additions and 274 deletions.

README.md

Lines changed: 20 additions & 14 deletions
@@ -1,8 +1,10 @@
 # Nautilus: A tracing jit compiler for C++
+
 [![Build Nautilus](https://github.com/nebulastream/nautilus/actions/workflows/build.yml/badge.svg?branch=main)](https://github.com/nebulastream/nautilus/actions/workflows/build.yml)

-Nautilus is a lightweight and adaptable just-in-time (JIT) compiler for C++ projects.
+Nautilus is a lightweight and adaptable just-in-time (JIT) compiler for C++ projects.
 It offers:
+
 1. A high-level code generation API that accommodates C++ control flows.
 2. A tracing JIT compiler that produces a lightweight intermediate representation (IR) from imperative code fragments.
 3. Multiple code-generation backends, allowing users to balance compilation latency and code quality at runtime.
@@ -16,7 +18,8 @@ The example below demonstrates Nautilus with a simplified aggregation operator,
 `ConditionalSum`. This function aggregates integer values based on a boolean mask.
 Nautilus introduces `val<>` objects to capture all executed operations in an intermediate representation during tracing.
 Depending on the execution context, it can utilize a bytecode interpreter or generate efficient MLIR or C++ code.
-This enables Nautilus to trade off performance characteristics and to optimize the generated code towards the target hardware.
+This enables Nautilus to trade off performance characteristics and to optimize the generated code towards the target
+hardware.

 ```c++
 val<int32_t> conditionalSum(val<int32_t> size, val<bool*> mask, val<int32_t*> array) {
@@ -72,7 +75,7 @@ The codebase is structured in the following components:

 ### Publication:

-This paper discusses Nautilus's architecture and its usage in the NebulaStream query compiler.
+This paper discusses Nautilus's architecture and its usage in the NebulaStream query compiler.
 Note that it references an earlier version of the code-generation API, which has changed.

 ```BibTeX
@@ -92,28 +95,31 @@ Note that it references an earlier version of the code-generation API, which has
 ```

 ### Related Work:
+
 The following work is related to Nautilus and influenced our design decisions.

 * [Tidy Tuples and Flying Start](db.in.tum.de/~kersten/Tidy%20Tuples%20and%20Flying%20Start%20Fast%20Compilation%20and%20Fast%20Execution%20of%20Relational%20Queries%20in%20Umbra.pdf):
-This paper describes the low-latency query compilation approach of [Umbra](https://umbra-db.com/).
-This work was one of the main motivations for the creation of the Nautilus project and its use in NebulaStream.
+This paper describes the low-latency query compilation approach of [Umbra](https://umbra-db.com/).
+This work was one of the main motivations for the creation of the Nautilus project and its use in NebulaStream.

 * [Flounder](https://vldb.org/pvldb/vol14/p2691-funke.pdf):
-Flounder is a simple low-latency JIT compiler based on [AsmJit](https://asmjit.com/) and designed for query compilation.
+Flounder is a simple low-latency JIT compiler based on [AsmJit](https://asmjit.com/) and designed for query
+compilation.

-* [Build-It](https://buildit.so/):
-BuildIt is a framework for developing Domain Specific Languages in C++.
-It pioneered the capability of extracting control-flow information from imperative C++ code.
+* [Build-It](https://buildit.so/):
+BuildIt is a framework for developing Domain Specific Languages in C++.
+It pioneered the capability of extracting control-flow information from imperative C++ code.

 * [GraalVM](https://www.graalvm.org/):
-The GraalVM project provides a framework to implement AST interpreters that can be turned into high-performance code through partial evaluation.
+The GraalVM project provides a framework to implement AST interpreters that can be turned into high-performance code
+through partial evaluation.

 * [MLIR](https://mlir.llvm.org/):
-The MLIR project provides a novel approach to building reusable and extensible compiler infrastructure.
-Nautilus leverages it as a foundation for its high-performance compilation backend.
+The MLIR project provides a novel approach to building reusable and extensible compiler infrastructure.
+Nautilus leverages it as a foundation for its high-performance compilation backend.

 * [MIR](https://github.com/vnmakarov/mir):
-The MIR project provides a lightweight JIT compiler that targets low compilation latency.
-Nautilus leverages MIR as a low-latency compilation backend.
+The MIR project provides a lightweight JIT compiler that targets low compilation latency.
+Nautilus leverages MIR as a low-latency compilation backend.

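The README text in this diff says that `val<>` objects capture all executed operations in an intermediate representation during tracing. The toy sketch below illustrates that general idea only: `Trace`, `TracedInt`, `constant`, and `addMul` are hypothetical names invented for this example and are not part of Nautilus, whose actual tracing machinery is not shown in this commit.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy trace: every executed operation appends one line of pseudo-IR.
struct Trace {
    std::vector<std::string> ops;
    int nextId = 0;
    std::string fresh() { return "%" + std::to_string(nextId++); }
};

// Toy traced integer (hypothetical; only a stand-in for the idea behind val<int32_t>).
struct TracedInt {
    Trace* trace;
    std::string name;
};

TracedInt constant(Trace& t, int v) {
    TracedInt r{&t, t.fresh()};
    t.ops.push_back(r.name + " = const " + std::to_string(v));
    return r;
}

// Overloaded operators record the operation instead of computing a value.
TracedInt operator+(const TracedInt& a, const TracedInt& b) {
    TracedInt r{a.trace, a.trace->fresh()};
    a.trace->ops.push_back(r.name + " = add " + a.name + ", " + b.name);
    return r;
}

TracedInt operator*(const TracedInt& a, const TracedInt& b) {
    TracedInt r{a.trace, a.trace->fresh()};
    a.trace->ops.push_back(r.name + " = mul " + a.name + ", " + b.name);
    return r;
}

// Ordinary-looking C++ that leaves a record of what it executed.
TracedInt addMul(TracedInt x, TracedInt y) {
    return (x + y) * y;
}

void runTraceExample() {
    Trace t;
    TracedInt x = constant(t, 2);
    TracedInt y = constant(t, 3);
    TracedInt r = addMul(x, y);
    assert(t.ops.size() == 4);
    assert(t.ops[2] == "%2 = add %0, %1");
    assert(t.ops[3] == "%3 = mul %2, %1");
    assert(r.name == "%3");
}
```

Running `addMul` on the toy type leaves four recorded operations in `t.ops`. A recorded list of this kind is what a backend could later lower to bytecode, MLIR, or C++.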
docs/options.md

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# Nautilus Runtime Options
+
+Nautilus provides various runtime options to customize compilation. They are passed when initializing the Nautilus
+engine:
+
+| Option                                  | Default | Description |
+|-----------------------------------------|---------|-------------|
+| engine.compilation=[true,false]         | true    | Activates the compilation of Nautilus functions. If the flag is false, all Nautilus functions are invoked directly, which is handy for debugging. |
+| engine.backend=[mlir,bc,cpp]            | mlir    | Sets the compilation backend for Nautilus functions. |
+| dump.all=[true,false]                   | false   | Dumps the intermediate representations of all compilation steps. |
+| dump.after_tracing=[true,false]         | false   | Dumps traces directly after trace generation. |
+| dump.after_ssa=[true,false]             | false   | Dumps traces after SSA generation. |
+| dump.after_ir_creation=[true,false]     | false   | Dumps the Nautilus IR after generation. |
+| dump.after_mlir_generation=[true,false] | false   | Dumps the generated MLIR if the MLIR backend is used. |
+| dump.after_llvm_generation=[true,false] | false   | Dumps the generated LLVM IR if the MLIR backend is used. |
+| dump.after_cpp_generation=[true,false]  | false   | Dumps the generated C++ code if the CPP backend is used. |
+| dump.after_bc_generation=[true,false]   | false   | Dumps the generated bytecode if the Bytecode Interpreter (BC) backend is used. |
+| dump.toConsole=[true,false]             | false   | Enables the dumping of intermediate compilation steps to the console. |
+| dump.toFile=[true,false]                | true    | Enables the dumping of intermediate compilation steps to a temp folder. |
+| mlir.optimizationLevel=[0,1,2,3]        | 3       | Sets the optimization level for code generation if the MLIR backend is used. |
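To make the interplay of these keys concrete, here is a minimal, self-contained sketch of a string-keyed option store. `OptionStore` and its methods are hypothetical and are not the Nautilus options API. The defaults are copied from the table. The rule that `dump.all` switches on every `dump.after_*` stage is an assumption drawn from the option's description, not from code shown in this commit.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical option store (NOT the Nautilus API). Defaults mirror the documented table.
class OptionStore {
public:
    OptionStore()
        : values{{"engine.compilation", "true"},
                 {"engine.backend", "mlir"},
                 {"dump.all", "false"},
                 {"dump.toConsole", "false"},
                 {"dump.toFile", "true"},
                 {"mlir.optimizationLevel", "3"}} {
    }

    void set(const std::string& key, const std::string& value) {
        values[key] = value;
    }

    // Keys that were never set fall back to the caller-supplied default.
    std::string get(const std::string& key, const std::string& fallback) const {
        auto it = values.find(key);
        return it == values.end() ? fallback : it->second;
    }

    bool flag(const std::string& key) const {
        return get(key, "false") == "true";
    }

    // Assumption: dump.all enables every per-stage dump.after_<stage> flag.
    bool shouldDump(const std::string& stage) const {
        return flag("dump.all") || flag("dump.after_" + stage);
    }

private:
    std::map<std::string, std::string> values;
};

void runOptionsExample() {
    OptionStore options;
    assert(options.get("engine.backend", "") == "mlir");
    assert(!options.shouldDump("ssa"));

    options.set("dump.after_ssa", "true");
    assert(options.shouldDump("ssa"));
    assert(!options.shouldDump("tracing"));

    options.set("dump.all", "true");
    assert(options.shouldDump("tracing"));
}
```

Keeping values as strings matches the `key=value` notation of the table. A real implementation would likely validate keys and parse typed values.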

nautilus/include/nautilus/JITCompiler.hpp

Lines changed: 4 additions & 1 deletion
@@ -8,6 +8,10 @@ namespace nautilus::compiler {

 class Executable;
 class CompilationBackendRegistry;
+
+
+using CompilationUnitID = std::string;
+
 class JITCompiler {
 public:
     using wrapper_function = std::function<void()>;
@@ -24,5 +28,4 @@ class JITCompiler {
     std::unique_ptr<CompilationBackendRegistry> backends;
 };

-
 } // namespace nautilus::compiler

nautilus/include/nautilus/std/ostream.h

Lines changed: 2 additions & 1 deletion
@@ -41,7 +41,8 @@ class val<std::basic_ostream<CharT, Traits>> {
 public:
     explicit val(val<std::basic_ostream<CharT, Traits>*> stream) : stream(stream) {};
     template <class _CharT, class _Traits>
-    val(val<std::basic_ostream<_CharT, _Traits>>& other) : stream(other.stream) {}
+    val(val<std::basic_ostream<_CharT, _Traits>>& other) : stream(other.stream) {
+    }

     template <class T>
     val<std::basic_ostream<CharT, Traits>>& operator<<(val<T>& value) {

nautilus/include/nautilus/std/string.h

Lines changed: 52 additions & 26 deletions
@@ -24,14 +24,16 @@ class val<std::basic_string<CharT, Traits>> {
     val(val<std::basic_string<CharT, Traits>*> str) : data_ptr(str) {
     }
     val(const val<const CharT*>& s) : data_ptr(nullptr) {
-        data_ptr = invoke(+[](const CharT* s) -> base_type* { return new base_type(s); }, s);
+        data_ptr = invoke(
+            +[](const CharT* s) -> base_type* { return new base_type(s); }, s);
     }

     val(const CharT* s) : val(val<const CharT*>(s)) {
     }

     val<CharT> at(val<size_type> pos) {
-        return invoke(+[](base_type* ptr, size_type p) -> CharT { return ptr->at(p); }, data_ptr, pos);
+        return invoke(
+            +[](base_type* ptr, size_type p) -> CharT { return ptr->at(p); }, data_ptr, pos);
     }

     /**
@@ -41,19 +43,23 @@ class val<std::basic_string<CharT, Traits>> {
      * If pos > size(), the behavior is undefined.
      */
     val<CharT> operator[](val<size_type> pos) {
-        return invoke(+[](base_type* ptr, size_type p) -> CharT { return ptr->operator[](p); }, data_ptr, pos);
+        return invoke(
+            +[](base_type* ptr, size_type p) -> CharT { return ptr->operator[](p); }, data_ptr, pos);
     }

     val<CharT> front() {
-        return invoke(+[](base_type* ptr) -> CharT { return ptr->front(); }, data_ptr);
+        return invoke(
+            +[](base_type* ptr) -> CharT { return ptr->front(); }, data_ptr);
     }

     val<CharT> back() {
-        return invoke(+[](base_type* ptr) -> CharT { return ptr->back(); }, data_ptr);
+        return invoke(
+            +[](base_type* ptr) -> CharT { return ptr->back(); }, data_ptr);
     }

     val<CharT*> data() {
-        return invoke(+[](base_type* ptr) -> CharT* { return ptr->data(); }, data_ptr);
+        return invoke(
+            +[](base_type* ptr) -> CharT* { return ptr->data(); }, data_ptr);
     }

     /**
@@ -63,55 +69,62 @@ class val<std::basic_string<CharT, Traits>> {
      * character after the last position.
      */
     val<const CharT*> c_str() const {
-        return invoke(+[](base_type* ptr) -> const CharT* { return ptr->c_str(); }, data_ptr);
+        return invoke(
+            +[](base_type* ptr) -> const CharT* { return ptr->c_str(); }, data_ptr);
     }

     operator val<std::basic_string_view<CharT, Traits>>() const {
         return val<std::basic_string_view<CharT, Traits>>(c_str());
     }

     val<bool> empty() const {
-        return invoke(+[](base_type* ptr) -> bool { return ptr->empty(); }, data_ptr);
+        return invoke(
+            +[](base_type* ptr) -> bool { return ptr->empty(); }, data_ptr);
     }

     /**
      * Returns the number of CharT elements in the string, i.e. std::distance(begin(), end()).
      * @return
      */
     val<size_type> size() const {
-        return invoke(+[](base_type* ptr) -> size_type { return ptr->size(); }, data_ptr);
+        return invoke(
+            +[](base_type* ptr) -> size_type { return ptr->size(); }, data_ptr);
     }

     /**
      * Returns the number of CharT elements in the string, i.e. std::distance(begin(), end()).
      * @return
      */
     val<size_type> length() const {
-        return invoke(+[](base_type* ptr) -> size_type { return ptr->length(); }, data_ptr);
+        return invoke(
+            +[](base_type* ptr) -> size_type { return ptr->length(); }, data_ptr);
     }

     /**
      * Returns the maximum number of characters
      * @return
      */
     val<size_type> max_size() const {
-        return invoke(+[](base_type* ptr) -> size_type { return ptr->max_size(); }, data_ptr);
+        return invoke(
+            +[](base_type* ptr) -> size_type { return ptr->max_size(); }, data_ptr);
     }

     /**
      * Informs a std::basic_string object of a planned change in size, so that it can manage the storage allocation appropriately.
      * @param new_cap
      */
     void reserve(val<size_type> new_cap) const {
-        return invoke(+[](base_type* ptr, size_type s) -> void { return ptr->reserve(s); }, data_ptr, new_cap);
+        return invoke(
+            +[](base_type* ptr, size_type s) -> void { return ptr->reserve(s); }, data_ptr, new_cap);
     }

     /**
      * Returns the number of characters that can be held in currently allocated storage.
      * @return
      */
     val<size_type> capacity() const {
-        return invoke(+[](base_type* ptr) -> size_type { return ptr->capacity(); }, data_ptr);
+        return invoke(
+            +[](base_type* ptr) -> size_type { return ptr->capacity(); }, data_ptr);
     }

     /**
/**
@@ -120,91 +133,103 @@ class val<std::basic_string<CharT, Traits>> {
120133
* @return
121134
*/
122135
void clear() const {
123-
invoke(+[](base_type* ptr) -> void { ptr->clear(); }, data_ptr);
136+
invoke(
137+
+[](base_type* ptr) -> void { ptr->clear(); }, data_ptr);
124138
}
125139

126140
/**
127141
* Inserts characters into the string.
128142
*/
129143
auto& insert(val<size_type> index, val<size_type> count, val<CharT> ch) const {
130-
invoke(+[](base_type* ptr, size_type index, size_type count, CharT ch) -> void { ptr->insert(index, count, ch); }, data_ptr, index, count, ch);
144+
invoke(
145+
+[](base_type* ptr, size_type index, size_type count, CharT ch) -> void { ptr->insert(index, count, ch); }, data_ptr, index, count, ch);
131146
return *this;
132147
}
133148

134149
/**
135150
* Inserts characters into the string.
136151
*/
137152
auto& insert(val<size_type> index, val<const CharT*> s) const {
138-
invoke(+[](base_type* ptr, size_type index, const CharT* s) -> void { ptr->insert(index, s); }, data_ptr, index, s);
153+
invoke(
154+
+[](base_type* ptr, size_type index, const CharT* s) -> void { ptr->insert(index, s); }, data_ptr, index, s);
139155
return *this;
140156
}
141157

142158
/**
143159
* Appends additional characters to the string.
144160
*/
145161
auto& append(val<size_type> count, val<CharT> ch) {
146-
invoke(+[](base_type* ptr, size_type count, CharT ch) -> void { ptr->append(count, ch); }, data_ptr, count, ch);
162+
invoke(
163+
+[](base_type* ptr, size_type count, CharT ch) -> void { ptr->append(count, ch); }, data_ptr, count, ch);
147164
return *this;
148165
}
149166

150167
/**
151168
* Appends additional characters to the string.
152169
*/
153170
auto& append(val<std::basic_string<CharT, Traits>>& str) {
154-
invoke(+[](base_type* ptr, base_type* other) -> void { ptr->append(*other); }, data_ptr, str.data_ptr);
171+
invoke(
172+
+[](base_type* ptr, base_type* other) -> void { ptr->append(*other); }, data_ptr, str.data_ptr);
155173
return *this;
156174
}
157175

158176
/**
159177
* Appends additional characters to the string.
160178
*/
161179
auto& append(const val<std::basic_string<CharT, Traits>>& str, const val<size_type>& pos, const val<size_type>& count) {
162-
invoke(+[](base_type* ptr, base_type* other, size_type pos, size_type count) -> void { ptr->append(*other, pos, count); }, data_ptr, str.data_ptr, pos, count);
180+
invoke(
181+
+[](base_type* ptr, base_type* other, size_type pos, size_type count) -> void { ptr->append(*other, pos, count); }, data_ptr, str.data_ptr, pos, count);
163182
return *this;
164183
}
165184

166185
/**
167186
* Appends additional characters to the string.
168187
*/
169188
auto& append(const val<std::basic_string<CharT, Traits>>& str, const val<size_type>& count) {
170-
invoke(+[](base_type* ptr, base_type* other, size_type count) -> void { ptr->append(*other, count); }, data_ptr, str.data_ptr, count);
189+
invoke(
190+
+[](base_type* ptr, base_type* other, size_type count) -> void { ptr->append(*other, count); }, data_ptr, str.data_ptr, count);
171191
return *this;
172192
}
173193

174194
/**
175195
* Appends additional characters to the string.
176196
*/
177197
auto& append(const val<CharT*>& s, const val<size_type>& count) {
178-
invoke(+[](base_type* ptr, CharT* s, size_type count) -> void { ptr->append(s, count); }, data_ptr, s, count);
198+
invoke(
199+
+[](base_type* ptr, CharT* s, size_type count) -> void { ptr->append(s, count); }, data_ptr, s, count);
179200
return *this;
180201
}
181202

182203
/**
183204
* Appends additional characters to the string.
184205
*/
185206
auto& append(const val<const CharT*>& s) {
186-
invoke(+[](base_type* ptr, const CharT* s) -> void { ptr->append(s); }, data_ptr, s);
207+
invoke(
208+
+[](base_type* ptr, const CharT* s) -> void { ptr->append(s); }, data_ptr, s);
187209
return *this;
188210
}
189211

190212
/**
191213
* Appends additional characters to the string.
192214
*/
193215
auto& operator+=(const val<std::basic_string<CharT, Traits>>& str) {
194-
invoke(+[](base_type* ptr, base_type* s) -> void { ptr->operator+=(*s); }, data_ptr, str.data_ptr);
216+
invoke(
217+
+[](base_type* ptr, base_type* s) -> void { ptr->operator+=(*s); }, data_ptr, str.data_ptr);
195218
return *this;
196219
}
197220

198221
/**
199222
* Appends additional characters to the string.
200223
*/
201224
auto& operator+=(const val<const CharT*>& s) {
202-
invoke(+[](base_type* ptr, const CharT* s) -> void { ptr->operator+=(s); }, data_ptr, s);
225+
invoke(
226+
+[](base_type* ptr, const CharT* s) -> void { ptr->operator+=(s); }, data_ptr, s);
203227
return *this;
204228
}
205229

206230
auto& operator+=(const val<CharT*>& s) {
207-
invoke(+[](base_type* ptr, const CharT* s) -> void { ptr->operator+=(s); }, data_ptr, s);
231+
invoke(
232+
+[](base_type* ptr, const CharT* s) -> void { ptr->operator+=(s); }, data_ptr, s);
208233
return *this;
209234
}
210235

@@ -216,7 +241,8 @@ class val<std::basic_string<CharT, Traits>> {
      * Copies a substring [pos, pos + count) to character string pointed to by dest. If the requested substring lasts past the end of the string, or if count == npos, the copied substring is [pos, size()).
      */
     val<size_type> copy(const val<CharT*>& dest, const val<size_type>& count, const val<size_type>& pos = 0) {
-        return invoke(+[](base_type* ptr, CharT* dest, size_type count, size_type pos) -> size_type { return ptr->copy(dest, count, pos); }, data_ptr, dest, count, pos);
+        return invoke(
+            +[](base_type* ptr, CharT* dest, size_type count, size_type pos) -> size_type { return ptr->copy(dest, count, pos); }, data_ptr, dest, count, pos);
     }

     ~val() {
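
Every call rewrapped in `string.h` passes a captureless lambda prefixed with unary `+`. In standard C++ this forces the lambda's conversion to a plain function pointer, so a template parameter of the form `R (*)(Args...)` can be deduced from it. The sketch below shows the pattern in isolation. `invokeToy`, `sizeOf`, and `charAt` are hypothetical stand-ins, and the real Nautilus `invoke`, which is not shown in this diff, presumably records the call during tracing rather than simply forwarding it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Toy stand-in for a traced call (hypothetical): it only forwards to the function pointer.
// R and FnArgs are deduced from the pointer type, Args from the trailing arguments.
template <typename R, typename... FnArgs, typename... Args>
R invokeToy(R (*fn)(FnArgs...), Args&&... args) {
    return fn(std::forward<Args>(args)...);
}

// Without the unary '+', the lambda's closure type is not a function pointer,
// so deduction of R and FnArgs would fail.
std::size_t sizeOf(std::string* str) {
    return invokeToy(
        +[](std::string* ptr) -> std::size_t { return ptr->size(); }, str);
}

char charAt(std::string* str, std::size_t pos) {
    return invokeToy(
        +[](std::string* ptr, std::size_t p) -> char { return ptr->at(p); }, str, pos);
}

void runInvokeExample() {
    std::string s = "nautilus";
    assert(sizeOf(&s) == 8);
    assert(charAt(&s, 0) == 'n');
}
```

Only captureless lambdas convert to function pointers, which is why each wrapper passes `data_ptr` and the other operands as explicit arguments instead of capturing them.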
