Shark_turbine 0.9.3 Breaks a Resnet-18 Model #282

Open
zjgarvey opened this issue Dec 20, 2023 · 6 comments

Comments

@zjgarvey
Collaborator

Using the dependencies:

pip install shark_turbine==0.9.2
pip install transformers

the following Python code executes successfully:

from transformers import AutoModelForImageClassification
import torch
from shark_turbine.aot import *

model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-18")

def forward(pixel_values_tensor: torch.Tensor):
    with torch.no_grad():
        logits = model.forward(pixel_values_tensor).logits
    predicted_id = torch.argmax(logits, -1)
    return predicted_id

class RN18(CompiledModule):
    params = export_parameters(model)

    def forward(self, x=AbstractTensor(None, 3, 224, 224, dtype=torch.float32)):
        const = [x.dynamic_dim(0) < 16]
        return jittable(forward)(x, constraints=const)

exported = export(RN18)
compiled_binary = exported.compile(save_to=None)

However, with the newer shark_turbine release (0.9.3), running the same Python script results in the following error:

ElementsAttr does not provide iteration facilities for type `mlir::Attribute`, see attribute: dense_resource<torch_tensor_64_torch.float32> : tensor<64xf32>
Trace/breakpoint trap

Related to #268

@zjgarvey
Collaborator Author

I'll see if I can figure out how to get a more helpful stack trace for this smaller reproduction.

In the meantime, here is the result of running a resnet-18 unit test in #268:

self = <tests.resnet_test.Resnet18Test testMethod=testExportResnet18Model>

    def testExportResnet18Model(self):
        with self.assertRaises(SystemExit) as cm:
>           resnet_18.export_resnet_18_model(
                resnet_model,
                "vmfb",
                "cpu",
            )

python/turbine_models/tests/resnet_test.py:22: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
python/turbine_models/custom_models/resnet_18.py:69: in export_resnet_18_model
    utils.compile_to_vmfb(module_str, device, target_triple, max_alloc, "resnet_18")
python/turbine_models/custom_models/sd_inference/utils.py:75: in compile_to_vmfb
    flatbuffer_blob = ireec.compile_str(
examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/core.py:299: in compile_str
    result = invoke_immediate(cl, immediate_input=input_bytes)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def invoke_immediate(
        command_line: List[str], *, input_file: Optional[bytes] = None, immediate_input=None
    ):
        """Invokes an immediate command.
    
        This is separate from invoke_pipeline as it is simpler and supports more
        complex input redirection, using recommended facilities for sub-processes
        (less magic).
    
        Note that this differs from the usual way of using subprocess.run or
        subprocess.Popen().communicate() because we need to pump all of the error
        streams individually and only pump pipes not connected to a different stage.
        Uses threads to pump everything that is required.
        """
        if logger.isEnabledFor(logging.INFO):
            logging.info("Invoke IREE Tool: %s", _quote_command_line(command_line))
        run_args = {}
        input_file_handle = None
        stderr_handle = sys.stderr
        try:
            # Redirect input.
            if input_file is not None:
                input_file_handle = open(input_file, "rb")
                run_args["stdin"] = input_file_handle
            elif immediate_input is not None:
                run_args["input"] = immediate_input
    
            process = subprocess.run(command_line, capture_output=True, **run_args)
            if process.returncode != 0:
>               raise CompilerToolError(process)
E               iree.compiler.tools.binaries.CompilerToolError: Error invoking IREE compiler tool iree-compile
E               Error code: -5
E               Diagnostics:
E               ElementsAttr does not provide iteration facilities for type `mlir::Attribute`, see attribute: dense_resource<torch_tensor_64_torch.float32_1> : tensor<64xf32>
E               Please report issues to https://github.com/openxla/iree/issues and include the crash backtrace.
E               Stack dump:
E               0.      Program arguments: /home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-compile - --iree-input-type=auto --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-embedded-linker-path=/home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-lld --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-input-type=torch --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvmcpu-target-cpu-features=host --iree-llvmcpu-target-triple=x86_64-linux-gnu --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-codegen-check-ir-before-llvm-conversion=false --iree-opt-const-expr-hoisting=False --iree-llvmcpu-enable-ukernels=all
E               Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
E               0  libIREECompiler.so 0x00007fe7e882ca77 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 39
E               1  libIREECompiler.so 0x00007fe7e882a7be llvm::sys::RunSignalHandlers() + 238
E               2  libIREECompiler.so 0x00007fe7e882d13f
E               3  libc.so.6          0x00007fe7e3ac4520
E               4  libIREECompiler.so 0x00007fe7e8776972
E               
E               
E               Invoked with:
E                iree-compile /home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-compile - --iree-input-type=auto --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-embedded-linker-path=/home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-lld --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-input-type=torch --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvmcpu-target-cpu-features=host --iree-llvmcpu-target-triple=x86_64-linux-gnu --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-codegen-check-ir-before-llvm-conversion=false --iree-opt-const-expr-hoisting=False --iree-llvmcpu-enable-ukernels=all
E               
E               Need more information? Set IREE_SAVE_TEMPS=/some/dir in your environment to save all artifacts and reproducers.

examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/binaries.py:198: CompilerToolError

@IanNod
Contributor

IanNod commented Dec 20, 2023

It would be helpful to look at the Torch IR that we are compiling through IREE and that is producing the error, and to link it in a gist. You should be able to save that MLIR with exported.save_mlir("mlir path"). The recommended path is generally to use the lower-level CompiledModule, as in python/turbine_models/custom_models/stateless_llama.py, where we save the MLIR when the flag --compile_to=torch or linalg is set.
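
For reference, a minimal sketch of that suggestion applied to the reproducer in the issue body (RN18 is the CompiledModule defined there; the output filename is just an example):

# Sketch: export the module as in the issue body, then save the Torch-level
# MLIR to disk so it can be attached to a gist.
exported = export(RN18)
exported.save_mlir("resnet18_torch.mlir")  # illustrative path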

@stellaraccident
Contributor

This is related to the switch from DenseElementsAttr -> DenseResourceElementsAttr. Avi and Sai did a pass through IREE and ported things to use a more generic interface, but this one must have been missed. DenseResourceElementsAttr does not support iteration through a generic FloatAttr.

A stack trace would isolate the failure point, which will then need to be fixed. We should get this model into the CI.

@zjgarvey
Collaborator Author

Here's a gist containing the Torch IR.
Here's another gist containing the exported module's IR.

Both IRs were generated with the following Python code:

from transformers import AutoModelForImageClassification
import torch
from shark_turbine.aot import *
from iree.compiler.ir import Context
from iree.compiler.api import Session
import iree.runtime as rt

model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-18")

def forward(pixel_values_tensor: torch.Tensor):
    with torch.no_grad():
        logits = model.forward(pixel_values_tensor).logits
    predicted_id = torch.argmax(logits, -1)
    return predicted_id

class RN18(CompiledModule):
    params = export_parameters(model, external=True)

    def forward(self, x=AbstractTensor(None, 3, 224, 224, dtype=torch.float32)):
        const = [x.dynamic_dim(0) < 16]
        return jittable(forward)(x, constraints=const)

inst = RN18(context=Context(), import_to="INPUT")
torch_str = str(CompiledModule.get_mlir_module(inst)) 

with open("resnet18.mlir", "w+") as f:
    f.write(torch_str)

session = Session()
ExportedModule = exporter.ExportOutput(session, inst)
mlir_module = str(ExportedModule.mlir_module)
with open("exported_resnet18.mlir", "w+") as f:
    f.write(mlir_module)

Then running the following will produce an error message:

import iree.compiler as ic

flags = [
    "--iree-input-type=torch",
    "--mlir-print-debuginfo",
    "--mlir-print-op-on-diagnostic=false",
    "--iree-llvmcpu-target-cpu-features=host",
    "--iree-llvmcpu-target-triple=x86_64-linux-gnu",
    "--iree-stream-resource-index-bits=64",
    "--iree-vm-target-index-bits=64",
    "--iree-codegen-check-ir-before-llvm-conversion=false",
    "--iree-opt-const-expr-hoisting=False",
    "--iree-llvmcpu-enable-ukernels=all",
]

flatbuffer_blob = ic.compile_str(
    torch_str,
    target_backends=["llvm-cpu"],
    extra_args=flags)

The error occurs upon invoking ic.compile_str:

Traceback (most recent call last):
  File "/home/zjgar/code/SHARK-Turbine/examples/resnet-18/minrepo.py", line 50, in <module>
    flatbuffer_blob = ic.compile_str(
                      ^^^^^^^^^^^^^^^
  File "/home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/core.py", line 299, in compile_str
    result = invoke_immediate(cl, immediate_input=input_bytes)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/binaries.py", line 198, in invoke_immediate
    raise CompilerToolError(process)
iree.compiler.tools.binaries.CompilerToolError: Error invoking IREE compiler tool iree-compile
Error code: -5
Diagnostics:
ElementsAttr does not provide iteration facilities for type `mlir::Attribute`, see attribute: dense_resource<torch_tensor_64_torch.float32_1> : tensor<64xf32>
Please report issues to https://github.com/openxla/iree/issues and include the crash backtrace.
Stack dump:
0.      Program arguments: /home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-compile - --iree-input-type=auto --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-embedded-linker-path=/home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-lld --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-input-type=torch --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvmcpu-target-cpu-features=host --iree-llvmcpu-target-triple=x86_64-linux-gnu --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-codegen-check-ir-before-llvm-conversion=false --iree-opt-const-expr-hoisting=False --iree-llvmcpu-enable-ukernels=all
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  libIREECompiler.so 0x00007fb69adfca77 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 39
1  libIREECompiler.so 0x00007fb69adfa7be llvm::sys::RunSignalHandlers() + 238
2  libIREECompiler.so 0x00007fb69adfd13f
3  libc.so.6          0x00007fb696094520
4  libIREECompiler.so 0x00007fb69ad46972


Invoked with:
 iree-compile /home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-compile - --iree-input-type=auto --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-embedded-linker-path=/home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-lld --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-input-type=torch --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvmcpu-target-cpu-features=host --iree-llvmcpu-target-triple=x86_64-linux-gnu --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-codegen-check-ir-before-llvm-conversion=false --iree-opt-const-expr-hoisting=False --iree-llvmcpu-enable-ukernels=all

Need more information? Set IREE_SAVE_TEMPS=/some/dir in your environment to save all artifacts and reproducers.
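
As an informal cross-check of the DenseResourceElementsAttr switch mentioned above, the saved Torch IR can be scanned for dense_resource attributes (a quick sketch, assuming the resnet18.mlir file written by the code earlier in this comment):

# Rough sanity check (not part of the fix): count how many constants in the
# saved Torch IR use the new dense_resource form vs. plain dense attributes.
with open("resnet18.mlir") as f:
    ir_text = f.read()
print("dense_resource attributes:", ir_text.count("dense_resource<"))
print("plain dense attributes:   ", ir_text.count(" dense<"))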

@zjgarvey
Collaborator Author

I'll follow the instructions in the error message on how to get more information and add another comment.
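
For reference, a sketch of that setup, assuming the same torch_str and flags as in the reproducer above (the save directory is the one that appears in the trace below):

import os
import iree.compiler as ic

# Have iree-compile save all intermediate artifacts and reproducers, as the
# error message suggests. The compiler subprocess inherits this variable.
os.environ["IREE_SAVE_TEMPS"] = "/home/zjgar/code/SHARK-Turbine/examples/resnet-18/iree_save"

flatbuffer_blob = ic.compile_str(
    torch_str,
    target_backends=["llvm-cpu"],
    extra_args=flags)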

@zjgarvey
Collaborator Author

After following the error message's instructions, the artifacts saved via IREE_SAVE_TEMPS are in this gist.

The error message with llvm-symbolizer linked:

Traceback (most recent call last):
  File "/home/zjgar/code/SHARK-Turbine/examples/resnet-18/minrepo.py", line 50, in <module>
    flatbuffer_blob = ic.compile_str(
                      ^^^^^^^^^^^^^^^
  File "/home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/core.py", line 299, in compile_str
    result = invoke_immediate(cl, immediate_input=input_bytes)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/binaries.py", line 198, in invoke_immediate
    raise CompilerToolError(process)
iree.compiler.tools.binaries.CompilerToolError: Error invoking IREE compiler tool iree-compile
Error code: 1
Diagnostics:
ElementsAttr does not provide iteration facilities for type `mlir::Attribute`, see attribute: dense_resource<torch_tensor_64_torch.float32_1> : tensor<64xf32>
<stdin>:1:1: error: Failures have been detected while processing an MLIR pass pipeline
module @r_n18 {
^
<stdin>:1:1: note: Pipeline failed while executing [`mlir::iree_compiler::IREE::HAL::SerializeExecutablesPass` on 'hal.executable' operation: @r_n18_linked_llvm_cpu, `mlir::iree_compiler::IREE::HAL::SerializeTargetExecutablesPass` on 'hal.executable' operation: @r_n18_linked_llvm_cpu]: reproducer generated at `/home/zjgar/code/SHARK-Turbine/examples/resnet-18/iree_save/core-reproducer.mlir`


Invoked with:
 iree-compile /home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-compile - --iree-input-type=auto --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-embedded-linker-path=/home/zjgar/code/SHARK-Turbine/examples/resnet-18/err_venv/lib/python3.11/site-packages/iree/compiler/tools/../_mlir_libs/iree-lld --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --mlir-pass-pipeline-crash-reproducer=/home/zjgar/code/SHARK-Turbine/examples/resnet-18/iree_save/core-reproducer.mlir --iree-input-type=torch --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvmcpu-target-cpu-features=host --iree-llvmcpu-target-triple=x86_64-linux-gnu --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-codegen-check-ir-before-llvm-conversion=false --iree-opt-const-expr-hoisting=False --iree-llvmcpu-enable-ukernels=all

Need more information? Set IREE_SAVE_TEMPS=/some/dir in your environment to save all artifacts and reproducers.
