
linalg.matmul E2E #1079

Open
johnmatter opened this issue Nov 7, 2024 · 6 comments

@johnmatter
Context

Suppose I have the following code (essentially this test file pre-secretization):

module {
  func.func @main(%vec : tensor<1x4xf16>) -> tensor<1x4xf16> {
    %matrix = arith.constant dense<[[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0], [13.0, 14.0, 15.0, 16.0]]> : tensor<4x4xf16>
    %bias = arith.constant dense<[[17.0, 18.0, 19.0, 20.0]]> : tensor<1x4xf16>
    %0 = linalg.matmul ins(%vec, %matrix : tensor<1x4xf16>, tensor<4x4xf16>) outs(%bias : tensor<1x4xf16>) -> tensor<1x4xf16>
    return %0 : tensor<1x4xf16>
  }
}
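For reference, `linalg.matmul` accumulates its product into the `outs` operand, so this function computes `vec @ matrix + bias`. A plain-Python model of the semantics (the input vector here is a hypothetical example value, not from the issue):

```python
# Plain-Python model of the IR above: linalg.matmul adds the product
# A*B into its outs operand, so the result is vec @ matrix + bias.
def matmul_1x4(vec, matrix, bias):
    n = len(vec)
    return [bias[j] + sum(vec[k] * matrix[k][j] for k in range(n))
            for j in range(n)]

vec = [1.0, 1.0, 1.0, 1.0]  # hypothetical plaintext input
matrix = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0],
          [9.0, 10.0, 11.0, 12.0],
          [13.0, 14.0, 15.0, 16.0]]
bias = [17.0, 18.0, 19.0, 20.0]
print(matmul_1x4(vec, matrix, bias))  # [45.0, 50.0, 55.0, 60.0]
```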

I'm trying to run the following:

bazel run --action_env=CC=$(which gcc) //tools:heir-opt -- --mlir-to-openfhe-ckks handwritten_linalg_matmul.mlir

Here's the output I get:

/home/ubuntu/simplemlir/handwritten_linalg_matmul.mlir:1:1: warning: expected secret types to be tensors with dimension matching ring parameter, pass will not pack tensors into ciphertext SIMD slots
module {
^
handwritten_linalg_matmul.mlir:2:3: error: failed to legalize operation 'secret.generic' that was explicitly marked illegal
  func.func @main(%vec : tensor<1x4xf16>) -> tensor<1x4xf16> {
  ^
handwritten_linalg_matmul.mlir:2:3: note: see current operation: 
%10 = "secret.generic"(<<UNKNOWN SSA VALUE>>) ({
^bb0(%arg14: tensor<1x4xf16>):
  %31 = "arith.mulf"(%arg14, %7) <{fastmath = #arith.fastmath<none>}> : (tensor<1x4xf16>, tensor<1x4xf16>) -> tensor<1x4xf16>
  "secret.yield"(%31) : (tensor<1x4xf16>) -> ()
}) : (!secret.secret<tensor<1x4xf16>>) -> !secret.secret<tensor<1x4xf16>>

Question

Is there something I should be doing regarding cipher text dimension or flattening tensors before running this pipeline?

@j2kun
Collaborator

j2kun commented Nov 8, 2024

Right now the end-to-end pipelines require you to hard-code the ciphertext size (we haven't yet broached how to automatically pick ciphertext sizes, since that is intertwined with questions about packing), so it would be

--mlir-to-openfhe-ckks='entry-function=main ciphertext-degree=4'

I can run this and get openfhe dialect output. You still need to run it through //tools:heir-translate -- --emit-openfhe-pke --openfhe-scheme=ckks and then build and compile the resulting C++ code. You can see an example (though in Bazel) here:

openfhe_end_to_end_test(
name = "halevi_shoup_matmul_test",
generated_lib_header = "halevi_shoup_matmul_lib.h",
heir_opt_flags = ["--mlir-to-openfhe-ckks=entry-function=matmul ciphertext-degree=16"],
heir_translate_flags = ["--openfhe-scheme=ckks"],
mlir_src = "halevi_shoup_matmul.mlir",
tags = ["notap"],
test_src = "halevi_shoup_matmul_test.cpp",
)
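Outside Bazel, the equivalent flow would look roughly like the sketch below (file names are examples, and the exact compile/link flags for the final C++ step depend on your OpenFHE install):

```shell
# Hypothetical sketch of the non-Bazel flow; file names are examples.
heir-opt --mlir-to-openfhe-ckks='entry-function=main ciphertext-degree=4' \
  handwritten_linalg_matmul.mlir > matmul_openfhe.mlir
heir-translate --emit-openfhe-pke --openfhe-scheme=ckks \
  matmul_openfhe.mlir > matmul_lib.cpp
# then compile matmul_lib.cpp plus a hand-written driver against OpenFHE
```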

@j2kun
Collaborator

j2kun commented Nov 8, 2024

ICYMI, we do have a tutorial that shows most of this here: https://heir.dev/docs/getting_started/#using-heir

It uses openfhe with BGV, not CKKS, but the only differences should be the extra openfhe-scheme=ckks option and the changes in the main driver file.

@johnmatter
Author

Thank you Jeremy! The MLIR I originally posted successfully lowers to a .cpp and .h file with that pipeline.

Starting from a slightly different point, tensorflow converts a tf.keras.Sequential with a single Dense layer into the following:

module {
  func.func @main(%arg0: tensor<1x4xf32>) -> tensor<1x4xf32> attributes {allow_soft_placement = false} {
    %cst = stablehlo.constant dense<[[0.433081806, -0.279845297, 0.144051015, 0.631222903], [-0.708569884, 0.407131732, 0.0975818634, 0.0837838649], [-0.0765365362, -0.256991029, 0.505359352, 0.197199166], [-2.625500e-01, 0.420378864, -0.273303688, 0.682159721]]> : tensor<4x4xf32>
    %0 = stablehlo.dot %arg0, %cst, precision = [DEFAULT, DEFAULT] : (tensor<1x4xf32>, tensor<4x4xf32>) -> tensor<1x4xf32>
    return %0 : tensor<1x4xf32>
  }
}

From here, I am running the following passes:

  • stablehlo-opt --stablehlo-legalize-to-linalg
  • heir-opt --secretize='entry-function=main' --wrap-generic
  • heir-opt --linalg-to-tensor-ext

This yields (collapsed for brevity):

after `--linalg-to-tensor-ext`
module {
  func.func @main(%arg0: !secret.secret<tensor<1x4xf32>>) -> !secret.secret<tensor<1x4xf32>> attributes {allow_soft_placement = false} {
    %c1 = arith.constant 1 : index
    %cst = arith.constant dense<[[0.433081806, 0.407131732, 0.505359352, 0.682159721], [-0.708569884, -0.256991029, -0.273303688, 0.631222903], [-0.0765365362, 0.420378864, 0.144051015, 0.0837838649], [-2.625500e-01, -0.279845297, 0.0975818634, 0.197199166]]> : tensor<4x4xf32>
    %cst_0 = arith.constant 0.000000e+00 : f32
    %0 = secret.generic ins(%arg0 : !secret.secret<tensor<1x4xf32>>) {
    ^bb0(%arg1: tensor<1x4xf32>):
      %1 = tensor.empty() : tensor<1x4xf32>
      %2 = linalg.fill ins(%cst_0 : f32) outs(%1 : tensor<1x4xf32>) -> tensor<1x4xf32>
      %3:2 = affine.for %arg2 = 0 to 3 iter_args(%arg3 = %2, %arg4 = %arg1) -> (tensor<1x4xf32>, tensor<1x4xf32>) {
        %extracted_slice_1 = tensor.extract_slice %cst[%arg2, 0] [1, 4] [1, 1] : tensor<4x4xf32> to tensor<1x4xf32>
        %6 = arith.mulf %arg4, %extracted_slice_1 : tensor<1x4xf32>
        %7 = arith.addf %arg3, %6 : tensor<1x4xf32>
        %8 = tensor_ext.rotate %arg4, %c1 : tensor<1x4xf32>, index
        affine.yield %7, %8 : tensor<1x4xf32>, tensor<1x4xf32>
      }
      %extracted_slice = tensor.extract_slice %cst[3, 0] [1, 4] [1, 1] : tensor<4x4xf32> to tensor<1x4xf32>
      %4 = arith.mulf %3#1, %extracted_slice : tensor<1x4xf32>
      %5 = arith.addf %3#0, %4 : tensor<1x4xf32>
      secret.yield %5 : tensor<1x4xf32>
    } -> !secret.secret<tensor<1x4xf32>>
    return %0 : !secret.secret<tensor<1x4xf32>>
  }
}
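The rewritten constant is the original stablehlo matrix repacked by generalized diagonals (row i of the new constant holds M[(i + j) % 4][j]), and the loop is a rotate-multiply-accumulate matvec in the Halevi-Shoup style; the IR just peels the final iteration out of the `affine.for`. A Python model of the same computation, assuming `tensor_ext.rotate` is a left cyclic rotation (the matrix values below are examples, not the Keras weights):

```python
# Model of the rotate/mul/add loop produced by --linalg-to-tensor-ext,
# assuming tensor_ext.rotate is a left cyclic rotation.
def rotate(v, k):
    # left cyclic rotation: result[j] = v[(j + k) % n]
    n = len(v)
    return [v[(j + k) % n] for j in range(n)]

def diagonals(matrix):
    # generalized diagonals: d[i][j] = matrix[(i + j) % n][j]
    n = len(matrix)
    return [[matrix[(i + j) % n][j] for j in range(n)] for i in range(n)]

def diagonal_matvec(vec, matrix):
    n = len(vec)
    diags = diagonals(matrix)
    acc = [0.0] * n
    rot = vec
    for i in range(n):  # the affine.for body: mulf, addf, rotate
        acc = [a + r * d for a, r, d in zip(acc, rot, diags[i])]
        rot = rotate(rot, 1)
    return acc

def plain_matvec(vec, matrix):
    n = len(vec)
    return [sum(vec[k] * matrix[k][j] for k in range(n)) for j in range(n)]

M = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [9.0, 10.0, 11.0, 12.0],
     [13.0, 14.0, 15.0, 16.0]]
v = [1.0, 2.0, 3.0, 4.0]
assert diagonal_matvec(v, M) == plain_matvec(v, M)
```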

From here, heir-opt --mlir-to-openfhe-ckks='entry-function=main ciphertext-degree=4' gives me the following error:

03_linalg_to_tensor_ext.mlir:6:10: error: failed to legalize operation 'secret.generic' that was explicitly marked illegal
    %0 = secret.generic ins(%arg0 : !secret.secret<tensor<1x4xf32>>) {
         ^
03_linalg_to_tensor_ext.mlir:6:10: note: see current operation: 
%12 = "secret.generic"(%11) ({
^bb0(%arg16: tensor<1x4xf32>):
  %35 = "linalg.fill"(%9, %arg16) <{operandSegmentSizes = array<i32: 1, 1>}> ({
  ^bb0(%arg17: f32, %arg18: f32):
    "linalg.yield"(%arg17) : (f32) -> ()
  }) : (f32, tensor<1x4xf32>) -> tensor<1x4xf32>
  "secret.yield"(%35) : (tensor<1x4xf32>) -> ()
}) : (!secret.secret<tensor<1x4xf32>>) -> !secret.secret<tensor<1x4xf32>>

The only difference I notice between this and my first MLIR example is the linalg.fill and tensor.empty lines. Is the issue that I end up with a named linalg op inside the secret.generic?

@j2kun
Collaborator

j2kun commented Nov 14, 2024

As far as I am aware, the only special purpose linalg op we lower in that pass is matmul, and we don't yet support lowering linalg.generic that you could generalize fill to via [-linalg-generalize-named-ops](https://mlir.llvm.org/docs/Passes/#-linalg-generalize-named-ops. So far we rely on upstream passes that lower linalg to loops.

You could try -convert-linalg-to-affine-loops to get rid of the linalg.fill, since we should be able to support what that lowers to. You may need to bufferize to lower linalg to loops though (-eliminate-empty-tensors + -one-shot-bufferize might be enough, otherwise you can check out

the oneShotBufferize pipeline helper in HEIR and https://mlir.llvm.org/docs/Bufferization/).
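As an untested sketch (the pass names are real upstream MLIR passes, but this exact ordering has not been verified), the suggestion above would be something like:

```shell
# Untested sketch: bufferize first, then lower linalg to affine loops
heir-opt --eliminate-empty-tensors --one-shot-bufferize \
  --convert-linalg-to-affine-loops 03_linalg_to_tensor_ext.mlir
```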

As a side note, you should be able to run the mlir-to-openfhe-ckks pipeline (with appropriate options) right after the initial stablehlo command above (see https://github.com/google/heir/blob/main/lib/Pipelines/ArithmeticPipelineRegistration.cpp#L78-L86; it contains all the other commands you ran as passes). So if there's another error you ran into while trying the end-to-end pipeline, let me know.

@asraa what do you think about taking the output from stablehlo-opt --stablehlo-legalize-to-linalg above and adding that as an example test case? Then we can iron out the kinks and have a clearer story for the future?

@ZenithalHourlyRate
Collaborator

I tried --convert-elementwise-to-linalg -linalg-generalize-named-ops -eliminate-empty-tensors -one-shot-bufferize on the IR after --linalg-to-tensor-ext, and got the following error:

error: op was not bufferized
        %8 = tensor_ext.rotate %arg4, %c1 : tensor<1x4xf32>, index

Could you give the IR just after stablehlo-opt --stablehlo-legalize-to-linalg? Maybe we need to lower linalg earlier, before entering secretize.

@j2kun
Collaborator

j2kun commented Nov 14, 2024

Well, we can't lower linalg fully before running the Halevi-Shoup pass (linalg-to-tensor-ext), so maybe another option is adding the BufferizableOpInterface to our tensor_ext ops...
