Implement Relaxed Operator Fusion #37

Merged
merged 3 commits into main on Nov 3, 2023
Conversation

@wagjamin (Owner) commented on Nov 3, 2023

This PR implements Relaxed Operator Fusion in InkFuse. We extend all
existing tests so that they also exercise our ROF backend.

Incremental Fusion conceptually supports relaxed operator fusion: the tuple
buffers that we can install between arbitrary suboperators are similar
to ROF staging points, as the sketch below illustrates.
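
To make the analogy concrete, here is a minimal sketch of a tuple buffer acting as a staging point between two execution modes. The types and names are hypothetical stand-ins for illustration, not the InkFuse API:

```cpp
#include <utility>
#include <vector>

// Hypothetical staging point: a tuple buffer installed between two
// suboperators. A JIT-compiled component writes chunks into it, and a
// vectorized component later drains it chunk by chunk - exactly the
// role a staging point plays in Relaxed Operator Fusion.
struct TupleBuffer {
   std::vector<std::vector<int>> chunks;
   void push(std::vector<int> chunk) { chunks.push_back(std::move(chunk)); }
};

// The producing (e.g. compiled) component fills the buffer ...
void producerComponent(TupleBuffer& staging) {
   staging.push({1, 2, 3});
   staging.push({4, 5, 6});
}

// ... and the consuming (e.g. vectorized) component processes one
// chunk per call to a pre-compiled primitive.
void consumerComponent(TupleBuffer& staging) {
   for (const auto& chunk : staging.chunks) {
      // run a pre-compiled primitive over `chunk`
      (void)chunk;
   }
}
```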

This makes it relatively easy to implement ROF in InkFuse.
The main modifications are all found in the PipelineExecutor.

At a high-level, ROF in InkFuse goes through the following steps:

1. A `Suboperator` can attach a `ROFStrategy` in its
   `OptimizationProperties`. This indicates whether the suboperator
   prefers compilation or vectorization.
2. The optimization properties become easy to use through an
   `ROFScopeGuard` in `Pipeline.h`. When we decay an operator into
   suboperators, this `ROFScopeGuard` can simply be set up during
   suboperator creation and indicates that the generated suboperators
   should all be vectorized.
3. The `PipelineExecutor` now splits the topological order of the
   suboperators into maximal connected components that have
   vectorized or JIT preference. The JIT components are then compiled
   ahead of time.
4. The ROF backend then iterates through the suboperator topological
   sort. Any interpreted component uses the pre-compiled primitives. Any
   compiled component uses the JIT code. The sketch after this list
   illustrates these steps.
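
The following sketch condenses steps 1 through 4. The type names mirror the ones used in this PR, but the member layout and function signatures are simplified assumptions rather than the actual InkFuse interfaces:

```cpp
#include <optional>
#include <vector>

// Step 1: a suboperator can express a backend preference.
enum class ROFStrategy { Vectorized, JIT };

struct OptimizationProperties {
   std::optional<ROFStrategy> rof_strategy;
};

struct Suboperator {
   OptimizationProperties props;
};

// Step 2: a scope guard that marks every suboperator attached while it
// is alive as preferring vectorization (hypothetical simplified shape).
struct Pipeline {
   std::vector<Suboperator*> subops;
   std::optional<ROFStrategy> current_strategy;

   void attach(Suboperator& subop) {
      subop.props.rof_strategy = current_strategy;
      subops.push_back(&subop);
   }
};

struct ROFScopeGuard {
   explicit ROFScopeGuard(Pipeline& pipe_) : pipe(pipe_) {
      pipe.current_strategy = ROFStrategy::Vectorized;
   }
   ~ROFScopeGuard() { pipe.current_strategy.reset(); }
   Pipeline& pipe;
};

// Steps 3 and 4: split the topological order into maximal runs of
// suboperators that share a preference. JIT runs are compiled ahead of
// time; vectorized runs are executed with the pre-compiled primitives.
std::vector<std::vector<Suboperator*>> splitIntoComponents(
   const std::vector<Suboperator*>& topo_order) {
   std::vector<std::vector<Suboperator*>> components;
   for (Suboperator* subop : topo_order) {
      if (components.empty() ||
          components.back().back()->props.rof_strategy != subop->props.rof_strategy) {
         components.emplace_back();
      }
      components.back().push_back(subop);
   }
   return components;
}
```

The executor then walks these components in topological order, dispatching each one to the matching backend.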

We also fix some bugs that were uncovered along the way. The ROF pipelines stress
our code generation logic in interesting ways that we did not anticipate.
A good example is the second commit in the chain, which fixed some issues with
how we generate code for filters.

As a next step, we will extend our benchmarking binaries with ROF
support and start measuring the performance of the ROF backend.

Our ROF implementation surfaced some correctness issues in how we
generate code for filters.

A filter looks as follows:
```
    /- IU Prov 1 ----------------> Filter 1 ---> Sink 1
Src                            /
    \- IU Prov 2 -> FilterScope ---> Filter 2 -> Sink 2
               \-----------------/
```

The IUs going from FilterScope to the filters are void-typed pseudo IUs.
The problem was that we could run into cases where FilterScope would
generate its nested `IF` before both IU providers were opened.

In those cases, we would generate the IU provider iterator only within
the nested filter scope, causing it to "lag" behind the other iterator,
producing incorrect results.

The core problem is that we do not model the code generation dependency
`IU Prov 1 -> FilterScope`.

This commit ensures that all input IU providers generate their code
before the `IF` is opened. Now, `Filter 1` and `Filter 2` only request
code generation of `FilterScope`, and `FilterScope` requests generating
both IU providers, as sketched below.
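
The fixed request order looks roughly as follows. The types below are hypothetical stand-ins for the suboperators in the diagram above; the real fix lives in InkFuse's suboperator code generation:

```cpp
#include <iostream>

struct IUProvider {
   const char* name;
   // Generates the loop iterator that drives this provider.
   void generateCode() { std::cout << "open iterator for " << name << "\n"; }
};

struct FilterScope {
   // Opens the nested IF that the dependent filters generate into.
   void openNestedIf() { std::cout << "open nested IF\n"; }
};

// Fix: FilterScope first requests code generation for *both* input IU
// providers, so their iterators live outside the nested IF and advance
// in lockstep. Only then does it open the IF. Before the fix, a
// provider opened inside the IF would "lag" behind the other iterator.
void generateFilterScope(FilterScope& scope, IUProvider& prov_1, IUProvider& prov_2) {
   prov_1.generateCode();
   prov_2.generateCode();
   scope.openNestedIf();
}

int main() {
   FilterScope scope;
   IUProvider prov_1{"IU Prov 1"};
   IUProvider prov_2{"IU Prov 2"};
   generateFilterScope(scope, prov_1, prov_2);
}
```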

This is done in a somewhat hacky way; if we rebuilt the system today, we
should not model code generation dependencies through void-typed pseudo
IUs. Instead, we should probably model IU and codegen dependencies
separately.
@wagjamin merged commit 111f77a into main on Nov 3, 2023
2 checks passed