-
-
Notifications
You must be signed in to change notification settings - Fork 46
✨ Implement (controlled) swap reconstruction MLIR pass #1158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
|
@burgholzer finally got this one (hopefully) working. Please also try to check if I have any logic errors in the test cases since at first I had them wrong since the CNOT pattern with the three consecutive CNOTs already performs the swap implicitly and thus no more swapping should be required (right?). Especially with the "advanced" test case I experimented a lot and I'm still not 100% sure it is correct now. Also, this breaks a little bit with the existing relationship between patterns and passes, do you think the "simple" and "advanced" approaches should be two separate passes? |
15488f9 to
611813e
Compare
burgholzer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work on this.
I thought this would be a quick review, but I ended up having quite a couple of thoughts on this. I hope they make sense.
Based on the amount of comments on the implementation, I did not annotate the tests themselves with comments. I'll look at those in more detail as part of the next round of reviews.
mlir/lib/Dialect/MQTOpt/Transforms/SwapReconstructionPattern.cpp
Outdated
Show resolved
Hide resolved
mlir/lib/Dialect/MQTOpt/Transforms/SwapReconstructionPattern.cpp
Outdated
Show resolved
Hide resolved
mlir/lib/Dialect/MQTOpt/Transforms/SwapReconstructionPattern.cpp
Outdated
Show resolved
Hide resolved
mlir/lib/Dialect/MQTOpt/Transforms/SwapReconstructionPattern.cpp
Outdated
Show resolved
Hide resolved
mlir/lib/Dialect/MQTOpt/Transforms/SwapReconstructionPattern.cpp
Outdated
Show resolved
Hide resolved
mlir/lib/Dialect/MQTOpt/Transforms/SwapReconstructionPattern.cpp
Outdated
Show resolved
Hide resolved
|
Hey @taminob 👋 First of all, I think it is entirely valid to deviate from existing quantum compiler implementations in the way how passes are implemented. Most of these, and especially Qiskit, have historically grown quite significantly and have accumulated quite some technical dept. Furthermore, due to Qiskit's stability policy, making more fundamental changes to the underlying architecture has become quite hard; even when those rewrites would be fairly beneficial. Thus, we can afford ourselves some leniency here. As for the argument of multiple passes vs. one pass with multiple patterns: As far as I understand MLIR, I would almost consider individual patterns as the equivalent of (some/most) of the Qiskit passes. Patterns are applied in a fixed-point loop quite similarly to how Qiskit is applying its optimization passes until no improvement is achieved. As such, I believe it makes sense to group logically related rewrite patterns into a bunch of passes. In the example here, I believe it makes sense to combine patterns for
Each of these could easily consist of multiple patterns instead of a single one.
While writing this, I can't help but notice that, what we really want here is a collection of patterns that can be applied as part of device-agnostic optimization. The respective patterns will be applied over and over again in a fixed point loop until convergence, which concludes the target-independent rewrite pass. In the context of the reinforcement learning compiler optimizer, the job would then be to learn clever application sequences of these patterns so that the resulting circuit is as compact as possible. I have also read up quite a bit on the PDL Language (https://mlir.llvm.org/docs/PDLL/) and believe that it would really be worth to adopt it for defining patterns. It seems to eliminate basically all of the boilerplate code that we currently have to write and instead lets us focus on the actual pattern definitions. My proposal for next steps here:
|
|
@burgholzer thanks for all the effort you put into your answer and all the examples. I think we're actually basically on the same page, I probably just misused pattern/pass and thought of a pass what you describe as a pattern. I just didn't want to mash everything into one huge pattern. If we want to make larger passes, we might also need to reconsider the "apply greedily" approach which already didn't work perfectly for the two patterns I had here in this PR. Regarding the PDLL link you've already sent in the other PR, I also read into it and fully agree that it looks like exactly what we want (e.g. the remove identity and elide permutations patterns can be something like 3 lines each). I just wanted to discuss the next steps first with @flowerthrower on Tuesday. Last time we considered to already go on with the compiler pipeline after the elide permutations/swap reconstruction passes (or patterns, I guess) since that might already work as a PoC, but I'm open to discuss other next steps as well (we just need to keep the deadline in mind :) ). |
|
Makes sense 👍🏼 I would also be fine with deferring the usage of PDLL to a little later (maybe outside of your current project), but at the same time, I don't think it should require too much effort and might actually safe quite some effort in the future. Regarding the greedy application; this is supposed to work if it's about two patterns that are part of the same pass. This is also shown in the PDLL article here: https://www.jeremykun.com/2024/08/04/mlir-pdll/ This is probably a bit out-of-context for this issue here, but since I got some time on the plane I am on now: In my opinion, this is not quite yet where we would want to be for the compiler pipeline. In the compiler pipeline outlined in the MQT Predictor paper, we identified two key conditions that influence which state the compiler is currently in:
This spans the MDP state machine (with 4 states depending on the combinations of both conditions) used throughout the paper. In the actual implementation, this is slightly more complex. I'd summarize the general flow roughly as
I believe to capture all intricacies of a quantum compiler, we would almost have to have example implementations of passes for all these stages, each with at least a couple of patterns so that there is something to choose from. These initial implementations can be fairly naive and straight forward (as for example the trivial layout or routing pass). |
|
@burgholzer regarding the controlled swap reconstruction, I wasn't entirely sure and tried to look up the decomposition of a controlled swap. So the controlled swap reconstruction definitely is possible, but I think your example above is not fully correct (or I didn't interpret it correctly). |
|
That's the optimized decomposition. Thus, both reconstructions are feasible. |
611813e to
c621e99
Compare
b92a45d to
9d87d24
Compare
Co-authored-by: Lukas Burgholzer <burgholzer@me.com> Signed-off-by: Tamino Bauknecht <28907748+taminob@users.noreply.github.com>
|
@burgholzer if you have the time, this PR should be ready for a second round of feedback (but no hurry, I'm working on the other topics in the meantime). Some comments:
Also thanks again that you are taking the time for the discussions in this busy week! |
Before I start to look into the code in more detail, a quick comment on this one: mcx q[0] ctrl q[1] <controls: arbitrary other positive or negative controls>
mcx q[1] ctrl q[0] <controls>
# transforms to
swap q[0], q[1] <controls>
mcx q[0] ctrl q[1] <controls>
# or, better, because it saves one elision
mcx q[1] ctrl q[0] <controls>
swap q[1], q[0] <controls>Both of the decompositions would be feasible (as they are equivalent). However, the second case requires one less SWAP elision because it's the |
burgholzer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some fairly quick feedback here. Take everything with a little grain of salt please.. it's been a long week 🙃
Overall, I'd be very much in favour of keeping this PR as simple as possible and feasible.
| } | ||
|
|
||
| def SwapReconstruction : Pass<"swap-reconstruction", "mlir::ModuleOp"> { | ||
| let summary = "This pass searches for CNOTs that can be merged into a SWAP."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not fully correct or at least does not cover the exact functionality of the pass.
| def SwapReconstruction : Pass<"swap-reconstruction", "mlir::ModuleOp"> { | ||
| let summary = "This pass searches for CNOTs that can be merged into a SWAP."; | ||
| let description = [{ | ||
| Three CNOTs that are equivalent to a SWAP are directly replaced with a (potentially controlled) SWAP gate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not fully correct. A CNOT, per definition, only has a single-control.
X gates with multiple controls are typically referred to as multi-controlled X gates or multi-controlled Toffoli gates.
I believe the general description here could be streamlined and clarified a little bit. I am happy to take a stab at this once the functionality is in the "right" place and moderately consolidated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for clearing that up, I guess I should update all of the comments to not use the term "CNOT" anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That depends in my opinion. If you are really talking about a CNOT (a single-controlled X gate), then the name makes sense and is quite established.
| ``` | ||
| ──■────■────■── ──■── | ||
| | ┌─┴─┐ | | | ||
| ──■──┤ X ├──■── => ──╳── | ||
| ┌─┴─┐└─┬─┘┌─┴─┐ | | ||
| ┤ X ├──■──┤ X ├ ──╳── | ||
| └───┘ └───┘ | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it would be helpful to also write down the other pattern where only the middle gate is controlled and the sandwiching gates are simple CNOTs.
| /** | ||
| * @brief This pattern attempts to find CNOT patterns which can be replaced by a | ||
| * SWAP gate. | ||
| * | ||
| * Example (onlyMatchFullSwapPattern=false): | ||
| * ┌───┐ ┌───┐ | ||
| * ──■──┤ X ├ ──■──┤ X ├──■────■── ──╳────■── | ||
| * ┌─┴─┐└─┬─┘ => ┌─┴─┐└─┬─┘┌─┴─┐┌─┴─┐ => | ┌─┴─┐ | ||
| * ┤ X ├──■── ┤ X ├──■──┤ X ├┤ X ├ ──╳──┤ X ├ | ||
| * └───┘ └───┘ └───┘└───┘ └───┘ | ||
| * | ||
| * Example (onlyMatchFullSwapPattern=false, matchControlledSwap=true): | ||
| * ──□────□── ──□────□────□────□── ──□────□── | ||
| * | ┌─┴─┐ | ┌─┴─┐ | | | | | ||
| * ──■──┤ X ├ => ──■──┤ X ├──■────■── => ──╳────■── | ||
| * ┌─┴─┐└─┬─┘ ┌─┴─┐└─┬─┘┌─┴─┐┌─┴─┐ | ┌─┴─┐ | ||
| * ┤ X ├──■── ┤ X ├──■──┤ X ├┤ X ├ ──╳──┤ X ├ | ||
| * └───┘ └───┘ └───┘└───┘ └───┘ | ||
| * | ||
| * Example (onlyMatchFullSwapPattern=true, matchControlledSwap=true): | ||
| * ──■────■────■── ──■── | ||
| * | ┌─┴─┐ | | | ||
| * ──■──┤ X ├──■── => ──╳── | ||
| * ┌─┴─┐└─┬─┘┌─┴─┐ | | ||
| * ┤ X ├──■──┤ X ├ ──╳── | ||
| * └───┘ └───┘ | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's simplify the implementation for this pattern over here for now. We won't be using most of the multi-controlled patterns here for one.
Let's limit this to the patterns that involve two CNOTs with alternating controls.
These should be
Pos-Pos
x q[0] ctrl q[1];
x q[1] ctrl q[0];to
x q[1] ctrl q[0];
swap q[1], q[0];or, as indicated by the other comment,
x q[1] ctrl q[0];
# swap the output qubit SSA values of the CNOTNeg-Neg
x q[0] negctrl q[1];
x q[1] negctrl q[0];to (resolve negctrl q[1])
x q[1];
x q[0] ctrl q[1];
x q[1];
x q[1] negctrl q[0];to (resolve negctrl q[0])
x q[1];
x q[0] ctrl q[1];
x q[1];
x q[0];
x q[1] ctrl q[0];
x q[0];to (commute X gates through targets)
x q[1];
x q[0];
x q[0] ctrl q[1];
x q[1] ctrl q[0];
x q[0];
x q[1];to (apply pos-pos pattern)
x q[1];
x q[0];
x q[1] ctrl q[0];
x q[0];
x q[1];
## exchange output SSA values of q[0] and q[1]to (cancel X gates through target of CNOT)
x q[0];
x q[1] ctrl q[0];
x q[0];
## exchange output SSA values of q[0] and q[1]to (combine back to negative control)
x q[1] negctrl q[0];
## exchange output SSA values of q[0] and q[1]which gives you the final replacement for the original pattern match.
Pos-Neg
Similar transformations as for the Neg-Neg pattern.
x q[0] ctrl q[1];
x q[1] negctrl q[0];to (resolve negctrl q[0])
x q[0] ctrl q[1];
x q[0];
x q[1] ctrl q[0];
x q[0];to (commute X gate through targets)
x q[0];
x q[0] ctrl q[1];
x q[1] ctrl q[0];
x q[0];to (apply pos-pos pattern)
x q[0];
x q[1] ctrl q[0];
swap q[0], q[1];
x q[0]; to (apply elide swap)
x q[0];
x q[1] ctrl q[0];
x q[1];
## exchange output SSA values of q[0] and q[1]either one is done here (valid); or one transforms further. Possible options:
x q[1] negctrl q[0];
x q[0];
x q[1];
## exchange output SSA values of q[0] and q[1]or
x q[0];
x q[1];
x q[1] ctrl q[0];
## exchange output SSA values of q[0] and q[1]There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we agreed that multi-controlled swap has some value and it is already implemented - why throw it away? :)
If you want to keep this PR simple, I'd rather push the Neg-Pos and Neg-Neg cases to a new PR instead and finalize this one first.
Splitting the reconstruction into onlyMatchFullSwapPattern and matchControlledSwap was actually just done for better readability and it could also be done with just one bool which is checked in both places instead. However, this more fine-grained control made the intention clearer in my opinion.
Or are we talking at cross-purposes again and should clear this up in a quick meeting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding directly eliding the swaps in this pass, I guess we are in disagreement about the extend of a single pattern. :)
I liked exactly the combination of the two patterns and keep the cognitive load per pattern low. But I can also directly elide the swap in this pattern if that's what you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Finalizing this PR first is definitely a good idea. Pushing the alternative patterns out to a separate PR seems reasonable to me.
- I guess the only thing I dislike about the multi-controlled swap handling is that it makes this extremely simple pass much more complicated in code because there are many different cases to be handled. If we were to use PDLL, the pattern for the simple swap reconstruction would probably be a one liner (or a handful of lines).
- My main concern for considering the elision as part of the pattern is performance. If we know already that this optimization opportunity arises and that we 100% of the time want to apply it, why should we delay the decision for that to the pattern matcher/rewriter? If we implement it right here, its cost is constant and close to zero. If the pattern rewriter has to run over the program that's at least
O(n)with regard to the number of operations, which seems unnecessary to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, that clears it up a bit.
- I'll start working on that once we have this merged.
- Yes, multi-controlled handling would be (probably) pretty annoying in PDLL - that's also why I said that I'm not sure if everything becomes that simple when we switch to PDLL. But I also think that the swap reconstruction would become a lot easier (and yes, I am getting tempted to look into integrating PDLL).
- Understandable, that might indeed be a problem for larger circuits... Do you have a suggestion for a name for this pass?
SwapReconstructionAndElision?
So the conclusion is that I'll remove everything regarding the multi-controlled stuff? I am just a little bit sad that I spent (quite some) time on implementing it. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just slight clarifications:
- Nice.
- I am not sure if it becomes any worse with PDLL than it is right now. It might not get easier, but I also see no indication that it might get harder.
- Hm. At that point, one might really simply add this pattern to the permutation elision pass and rename that pass to
SwapReconstructionAndElision. This is still equally well testable as the separate pass setup at the moment. The new tests would just become additional tests in the existing test suite for the elision pass. - I would not remove the multi-controlled stuff entirely; would be a waste of your effort. I'd just also postpone it to a separate PR because it is not yet 100% clear to me where to put this and how to integrate it properly, while it becomes clearer and clearer how to properly integrate the simple swap reconstruction and elision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so I'll have a quick look into PDLL and how to integrate it into the build system. Since this PR needs basically a full rewrite I could also use it as a PoC for a PDLL pattern and do that in a separate PR. Then we can simply close this one with the current state to not lose the multi-control implementation (or at least the logic of it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like a plan to me 👍🏼
|
As discussed before, I opened a new PR (#1207) for the simplified implementation with reduced functionality. |
## Description This PR introduces a new pattern and pass for swap reconstruction from two consecutive reversed CNOT gates and its immediate elision. It replaces #1158 which implemented a more advanced, but also much more complex, variant of the pattern which can also re-construct controlled SWAP gates and replace full three CNOT patterns equivalent to a SWAP. It is equivalent in functionality to #1194 which implemented the same pattern in PDLL. Related to #1122 ## Checklist: <!--- This checklist serves as a reminder of a couple of things that ensure your pull request will be merged swiftly. --> - [x] The pull request only contains commits that are focused and relevant to this change. - [x] I have added appropriate tests that cover the new/changed functionality. - [x] I have updated the documentation to reflect these changes. - [x] I have added entries to the changelog for any noteworthy additions, changes, fixes, or removals. - [x] I have added migration instructions to the upgrade guide (if needed). - [x] The changes follow the project's style guidelines and introduce no new warnings. - [x] The changes are fully tested and pass the CI checks. - [x] I have reviewed my own code changes. --------- Signed-off-by: burgholzer <burgholzer@me.com> Signed-off-by: Lukas Burgholzer <burgholzer@me.com> Co-authored-by: burgholzer <burgholzer@me.com>

Description
Add new MLIR pass for swap reconstruction. This pass checks for three consecutive CNOTs that are equivalent to a SWAP and replaces them with a SWAP operation.
In combination with elide permutations (#1151) this can be used to eliminate CNOTs.
Additionally, the pass adds two self-negating CNOTs if only two of the required CNOTs are found in order to have 2 CNOTs -> CNOT + SWAP which in combination with the elide permutations pass will still be better than before. (not yet implemented)
Related to #1122
Checklist: