Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache Invocable and prevent concurrent access into orc #71

Merged
merged 4 commits into from
Dec 29, 2024

Conversation

ls-1801
Copy link
Contributor

@ls-1801 ls-1801 commented Dec 16, 2024

Whenever invoked, the CallableFunction wrapper locates the executable within the LLVM context.
This makes calling a function not thread-safe as the underlying LLVM implementation is not thread-safe and introduces an unnecessary overhead.

For small functions, this lookup dominates the runtime. Note: This benchmark only measures the invocation. Compilation is done beforehand.

static nautilus::val<int> smallFunction(nautilus::val<int> i) {
	return i * 2;
}
----------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations
----------------------------------------------------------------------
BM_smallFunctionCompiler          1187 ns         1183 ns       565135
BM_smallFunctionInterpreter       62.7 ns         62.5 ns     11918221

This PR caches the Invocable produced by the Compiler during construction of the CallableFunction and thus skips the lookup within the invocation.

----------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations
----------------------------------------------------------------------
BM_smallFunctionCompiler          3.65 ns         3.65 ns    191092600
BM_smallFunctionInterpreter       63.1 ns         63.1 ns     11229368

Note: we can remove the release mode and benchmark commit and only pick the relevant commit

@ls-1801 ls-1801 changed the title Cache Cache Invocable Dec 16, 2024
@ls-1801 ls-1801 force-pushed the ls-cache-executable branch 3 times, most recently from 33abb90 to a9224f5 Compare December 16, 2024 21:37
@ls-1801 ls-1801 changed the title Cache Invocable Cache Invocable and prevent concurrent access into orc Dec 17, 2024
@PhilippGrulich
Copy link
Member

I think the implementation works. We could also implement the caching in the MLIR executable, but I think both is fine.
I like the benchmarks, but please use Catch 2 Benchmark instead of Google Benchmark (and remove the module).
It would be cool to extend TracingBenchmark module to also have your benchmarks in https://nebulastream.github.io/nautilus/dev/bench/

@PhilippGrulich
Copy link
Member

We could also just commit in this change the caching and extend the benchmarks in this PR. #73
In general, I think it would be cool to have execution benchmarks for a wider set of functions.

I am not sure why, but I cannot compile nautilus in
release mode locally.
Apparently the internals of the orc jit are not thread-safe:
https://lists.llvm.org/pipermail/llvm-dev/2017-November/119108.html

The thread sanitizer in nebulastream complaints about a data-race deep within orc. The mailing lists suggests using a lock around access into orc.
@PhilippGrulich PhilippGrulich merged commit 542d8af into main Dec 29, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants