Create MLIR functions for ONNX operators that are functions #3409

Merged · 1 commit · Jun 14, 2024

Conversation

andfau-amd
Contributor

@andfau-amd commented May 31, 2024

Resolves #3384.

Please read the commit message for more information!

I'd really appreciate any feedback! I am completely new to this. :)

@andfau-amd marked this pull request as draft on May 31, 2024 at 18:28
@andfau-amd force-pushed the onnx-to-torch-function-expansion branch 2 times, most recently from 50060d7 to eefc1d4, on June 4, 2024 at 15:39
@andfau-amd changed the title from "WIP: Create MLIR functions for ONNX operators that are functions" to "Create MLIR functions for ONNX operators that are functions" on Jun 4, 2024
@andfau-amd marked this pull request as ready for review on June 4, 2024 at 16:35
@andfau-amd
Contributor Author

Marking this as ready for review. check-torch-mlir, check-torch-mlir-python, and the CI (including the e2e tests) all pass now. I'm reasonably confident in the quality of the code and would appreciate some feedback.

However, I probably need to add some kind of regression testing for this before it can be merged. I'm also going to run some external testing.

This might be a disruptive change, as discussed in the commit message. A different denylisting or allowlisting strategy might be worth discussing?

Collaborator

@stellaraccident left a comment


First-pass review. I feel like something complicated like this should also have a lit test that verifies at least a simple structural case more explicitly.

See for example: https://github.com/llvm/torch-mlir/blob/main/test/python/onnx_importer/import_onnx_tool.runlit#L3

This can be annoyingly hard in ONNX, but if you can identify at least one simple example of function outlining like this, then it would be good to back it up with a lit test like that as it makes it obvious what the corresponding IR structure imports as. Will also make it easier to port to the C version.
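
(Aside, for illustration only: a minimal ONNX model exercising a function-defined operator could be generated along the lines below and then driven from a lit RUN line. The script and file names are hypothetical, and MeanVarianceNormalization is just one convenient example of an operator whose schema carries a function body; this is not part of the PR itself.)

  # generate_mvn_model.py -- hypothetical helper for such a lit test.
  # Builds a one-node model whose operator (MeanVarianceNormalization) is
  # defined by an ONNX function, so a function-expanding importer should
  # outline it into a private MLIR function plus a call.
  import onnx
  from onnx import TensorProto, checker, helper

  x = helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 3, 4, 4])
  y = helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 3, 4, 4])

  node = helper.make_node("MeanVarianceNormalization", ["x"], ["y"])
  graph = helper.make_graph([node], "mvn_example", [x], [y])
  model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])

  checker.check_model(model)
  onnx.save(model, "mvn_example.onnx")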

@andfau-amd
Contributor Author

I feel like something complicated like this should also have a lit test that verifies at least a simple structural case more explicitly.

See for example: https://github.com/llvm/torch-mlir/blob/main/test/python/onnx_importer/import_onnx_tool.runlit#L3

This can be annoyingly hard in ONNX, but if you can identify at least one simple example of function outlining like this, then it would be good to back it up with a lit test like that as it makes it obvious what the corresponding IR structure imports as.

Oh, somehow I hadn't realised I could use LIT tests here. I can definitely add a few of those!

Will also make it easier to port to the C version.

Ah, right, do the two importers need to stay in sync? Ideally I would have wanted to implement a lot of this in C++, and then the two importers could perhaps share it, but I don't know if that's possible without putting that C++ into ONNX itself or something.

@stellaraccident
Collaborator

Ah, right, do the two importers need to stay in sync? Ideally I would have wanted to implement a lot of this in C++, and then the two importers could perhaps share it, but I don't know if that's possible without putting that C++ into ONNX itself or something.

The C one gets used for ORT integration and is just starting point code for that use case. I've just aimed at keeping the importers relatively structural and eating the cost of keeping them in sync.

The Python version can go a lot of places the C one can't, and vice versa. As an example, the Python one doesn't infect every consuming project with a native protobuf/absl dependency. That alone pays for the code duplication.

@andfau-amd
Contributor Author

andfau-amd commented Jun 4, 2024

Hmm. As an aside, there are quite a few things in this code that are effectively working around deficiencies in the ONNX Python APIs. We could potentially simplify this Python importer a bit, and make it easier to keep in sync with the C++ importer, if some improvements and additions to ONNX could be upstreamed.

@stellaraccident
Collaborator

Hmm. As an aside, there are quite a few things in this code that are effectively working around deficiencies in the ONNX Python APIs. We could potentially simplify this Python importer a bit, and make it easier to keep in sync with the C++ importer, if some improvements and additions to ONNX could be upstreamed.

Example? Most of the messiness I've seen is just silly twenty-year-old protobuf-is-really-not-a-great-IR issues. Most of that stuff is just never going to be very good.

@andfau-amd force-pushed the onnx-to-torch-function-expansion branch from eefc1d4 to 01570af on June 5, 2024 at 14:18
@stellaraccident
Collaborator

stellaraccident commented Jun 5, 2024

I hear you. But in various downstreams, we are basically never going to take a dep on protobuf or ONNX in the C++ code... My experience with ONNX on this stuff is limited, but it is essentially a clone of TF GraphDef, which I have a lot of sad history with. I hate to say it, but at some point you lose the "this could be better upstream" aspect and just try to ensure local sanity. For that, I just try to keep the mechanics contained to mostly write-once importers that have reasonable tests.

But if you see a chance to make the actual ONNX Python API better, feel free to send them a patch. The protobuf API hasn't changed much in a decade and is what it is, though.

ONNX only releases quarterly, so it could be a while before you can actually use any contributions.

@andfau-amd
Contributor Author

andfau-amd commented Jun 5, 2024

Wait, how does the C++ importer avoid depending on ONNX's C++ libraries or Protobuf? (Maybe we should discuss this elsewhere.)

@stellaraccident
Collaborator

Wait, how does the C++ importer avoid depending on ONNX's C++ libraries or Protobuf? (Maybe we should discuss this elsewhere.)

The C++ importer doesn't avoid it. The Python importer does.

@andfau-amd
Contributor Author

Oh, well, the Python importer doesn't depend on C++ directly, but it does depend on the ONNX Python library, and that depends on ONNX's C++ code. But I think I understand: we can't add C++ code of our own to the importer, and adding stuff to ONNX would take a long time.

@andfau-amd
Contributor Author

Uh, but more importantly, we might be able to make the C++ version of this code a bit simpler than the Python one since there's more stuff exposed by ONNX for C++, I think.

@andfau-amd
Contributor Author

andfau-amd commented Jun 13, 2024

@stellaraccident I added some LIT tests in 79ea75b. Do those look good? Could you take another look at the PR?

@rsuderman
Contributor

I am a little worried about seeing multiple compounding changes on top of the general support for multiple function imports. I may not have full context, but it would surprise me if we need to tweak the decompositions due to wanting additional function inclusion.

@andfau-amd
Contributor Author

andfau-amd commented Jun 13, 2024

I am a little worried about seeing multiple compounding changes on top of the general support for multiple function imports. I may not have full context, but it would surprise me if we need to tweak the decompositions due to wanting additional function inclusion.

We discussed this further privately. As I understand it, Rob's concern was about disruption being caused by the function visibility change, and by the function expansion applying to all operations by default (so, it would affect operations we already support and significantly change their lowering).

We already fixed the function visibility thing.

To solve the other problem, we've decided to switch to an allowlisting approach: only a tiny set of operations will be expanded this way by default. In this patch that set is just MeanVarianceNormalization (resolves nod-ai/SHARK-ModelDev#697) and NegativeLogLikelihoodLoss (which might be removed, see the #3380 discussion). This way there's no immediate disruption.

I guess there can be some tracking issue for gradually expanding the allowlist.
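
To make the allowlisting idea concrete, a gate along the lines below would do the job. This is a hypothetical sketch: the Config field and helper names here are illustrative, not necessarily the actual onnx_importer.py API.

  # Hypothetical allowlist gate. None means "expand every operator that has a
  # function definition" (useful for testing); otherwise only listed operators
  # are expanded, keyed by ONNX domain ("" is the default ai.onnx domain).
  from dataclasses import dataclass, field
  from typing import Dict, Optional, Set

  @dataclass
  class Config:
      function_expansion_allowlist_by_domain: Optional[Dict[str, Set[str]]] = field(
          default_factory=lambda: {"": {"MeanVarianceNormalization"}}
      )

  def should_expand_function(config: Config, domain: str, op_type: str) -> bool:
      allowlist = config.function_expansion_allowlist_by_domain
      if allowlist is None:
          return True
      return op_type in allowlist.get(domain, set())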

@andfau-amd requested a review from rsuderman on June 14, 2024 at 09:25
@andfau-amd force-pushed the onnx-to-torch-function-expansion branch 2 times, most recently from cdfc7e1 to 66920b8, on June 14, 2024 at 10:53
Resolves llvm#3384.

Many ONNX operators are defined by functions and therefore could be
expanded into simpler ONNX operations during importing, avoiding the
need for tools downstream to support these operators directly.

This commit adds this capability to onnx_importer.py. When importing a
node, the schema for the node's operator is retrieved. If the schema
provides a function for the operator, a specialized version for the
node's types and attributes will be created and imported as an MLIR
function with private visibility. An MLIR function call will then be
emitted, instead of a normal operator node. Caching is used to avoid
generating redundant functions within the same module.

In order to avoid a disruptive change to the importer output for a
large number of operators that already have TorchOnnxToTorch support,
an allowlist strategy is used by default. With this commit, only one
operator is allowlisted for expansion, MeanVarianceNormalization.
However, many other operators can be correctly expanded by the current
code, so hopefully the allowlist can be gradually extended. It is
possible to disable the allowlist in the configuration, in which case
all functions are expanded (useful for testing).

Tools downstream of the importer may now need to do inlining when
consuming the output of the importer, e.g.:

  cat imported.mlir | torch-mlir-opt --inline --convert-onnx-to-torch

Explanations for subtle code changes:

- Looking up the correct schema and function for an operator requires
  knowing the opset version. NodeImporter retrieves this from the
  opset imports on the ModelProto retained by the GraphInfo. Previously,
  the model_proto field on GraphInfo was None when importing a subgraph
  in import_regions, but this conflicts with the new need for opset
  version info. Since the apparent purpose of setting it to None was to
  control how GraphInfo generates its input map, a new flag is added to
  GraphInfo (is_subgraph) to control this behavior, so that the actual
  ModelProto can now be provided without breaking this. This also turned
  out to be useful for getting the Config via ModelInfo via GraphInfo.
- Some operators' functions are context-dependent, which means the
  function definition depends on the types of the inputs. Therefore node
  importing now needs to look up the types of a node's inputs, not just
  its outputs as was the case previously. Consequently the operand to
  find_type_proto_for_name() may now be a graph input or initializer in
  some cases, so it has to be updated.
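
For reference, the schema/function lookup described above roughly corresponds to the public onnx.defs API. The sketch below is an approximation under that assumption; the helper name is hypothetical, it is not the actual onnx_importer.py code, and the context-dependent-function path is omitted.

  # Rough sketch of the schema/function lookup the commit message describes.
  import onnx
  from onnx import defs

  def find_function_proto(model: onnx.ModelProto, node: onnx.NodeProto):
      # The opset version comes from the model's opset imports ("" is the
      # default ai.onnx domain), as noted above.
      opset_version = next(
          (oi.version for oi in model.opset_import if oi.domain == node.domain),
          None,
      )
      if opset_version is None:
          return None
      schema = defs.get_schema(
          node.op_type, max_inclusive_version=opset_version, domain=node.domain
      )
      if schema.has_function:
          return schema.function_body  # a FunctionProto to specialize and import
      # Context-dependent functions additionally require the input types; that
      # path is omitted here for brevity.
      return None
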
@andfau-amd force-pushed the onnx-to-torch-function-expansion branch from 66920b8 to 46e39ee on June 14, 2024 at 16:19
Successfully merging this pull request may close these issues.

[ONNX] Systematically expand ONNX functions before conversion to Torch to avoid needing bespoke conversions