Feature: Traditional GEMM / MatMul API #312

@ternaus

Description

Describe what you are looking for

Summary

NumKong has dot(a, b) for vector–vector and matrix–vector products (e.g. A (n, d) @ b (d,) → (n,)), and dots_pack / dots_packed for the GEMM-style case where one side is pre-packed and we compute A @ B.T. There is no direct general matrix multiplication for two 2D arrays, A (m, k) @ B (k, n) → (m, n), i.e. a single API equivalent to NumPy's np.matmul / @ or OpenCV's cv2.gemm.

Analogous API: OpenCV cv2.gemm

OpenCV provides cv2.gemm, a generalized matrix multiply that computes alpha * A * B + beta * C. With alpha=1, beta=0, and C=None, this is the standard matrix multiply A @ B. It accepts 2D matrices and returns a 2D matrix. We use it in an image-augmentation codebase both for small matrices and for operations that involve image-sized data (see below).
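To make the semantics concrete, here is a minimal NumPy sketch of the formula described above. The name `gemm_reference` and its argument order are assumptions chosen for illustration; this is neither OpenCV's implementation nor an existing NumKong function.

```python
import numpy as np


def gemm_reference(a, b, alpha=1.0, c=None, beta=0.0):
    """Illustrative reference: alpha * (a @ b) + beta * c, skipping c when it is None."""
    out = alpha * (a @ b)
    if c is not None:
        out = out + beta * c
    return out


a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])
c = np.ones((2, 2))

# alpha=1, beta=0, c=None reduces to a plain matrix multiply.
plain = gemm_reference(a, b)

# The general form scales the product and adds a scaled third matrix.
scaled = gemm_reference(a, b, alpha=2.0, c=c, beta=3.0)
```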

How we use it today

1. Small matrices (direct cv2.gemm)

  • Thin Plate Spline (TPS): pairwise distances use cv2.gemm(points1, points2.T, 1, None, 0), i.e. (N, 2) @ (2, M) → (N, M); the transform uses cv2.gemm(kernel_matrix, nonlinear_weights, 1, None, 0) and cv2.gemm(affine_terms, affine_weights, 1, None, 0), all on small matrices.
  • Stain normalization (Macenko): cv2.gemm(angle_to_vector, principal_eigenvectors_t, 1, None, 0), i.e. (2, 2) @ (2, 3) → (2, 3).

So we need a general matmul for small 2D × 2D matrices, with the same semantics as cv2.gemm(A, B, 1, None, 0).
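The shapes involved in the two cases above can be sketched with NumPy's @ standing in for the requested API. The random inputs are placeholders, and the way the (N, M) product is combined into squared distances is an assumption about a typical TPS implementation, not a copy of our code.

```python
import numpy as np

rng = np.random.default_rng(0)

# TPS: (N, 2) @ (2, M) -> (N, M) matrix of dot products between point sets.
points1 = rng.random((4, 2))
points2 = rng.random((3, 2))
cross = points1 @ points2.T

# Assumed usage: squared distances via |p|^2 + |q|^2 - 2 * p.q
sq_dist = (
    (points1**2).sum(axis=1)[:, None]
    + (points2**2).sum(axis=1)[None, :]
    - 2.0 * cross
)

# Macenko: (2, 2) @ (2, 3) -> (2, 3)
angle_to_vector = rng.random((2, 2))
principal_eigenvectors_t = rng.random((2, 3))
stain_vectors = angle_to_vector @ principal_eigenvectors_t
```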

2. "Image × small matrix" (NumPy @ today)

In stain normalization we have optical_density with shape (num_pixels, 3) (a flattened image) and small matrices such as stain_colors, whose transpose has shape (3, 2). We compute, for example, optical_density @ stain_colors.T → (num_pixels, 2). So the first operand can be image-sized and the second small; this is still a general matmul A @ B, not an element-wise image multiply.
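A small sketch of this large × small case follows, using NumPy's @ as a stand-in. It assumes stain_colors is stored as (2, 3) so that its transpose is the (3, 2) right-hand operand; the image size and all values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

height, width = 4, 5  # tiny stand-in for a real image
# Placeholder values; a real pipeline would derive these from pixel intensities.
optical_density = rng.random((height * width, 3))

# Assumed layout: one row per stain, one column per color channel.
stain_colors = np.array([[0.65, 0.70, 0.29],
                         [0.07, 0.99, 0.11]])

# (num_pixels, 3) @ (3, 2) -> (num_pixels, 2): one pair of coefficients per pixel.
projected = optical_density @ stain_colors.T
```

Each output row is just the dot product of one pixel with each stain row, which is why the operation is a matmul rather than an element-wise multiply.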

Request

Provide a general matrix multiplication API that: (1) accepts two 2D arrays A (m, k) and B (k, n), (2) returns C (m, n) = A @ B, and (3) works for both small×small and large×small operands. This could be either a dedicated matmul(A, B) (or gemm(A, B, alpha, C, beta)), or a generalization of dot so that, when both arguments are 2D with compatible shapes, it performs a full matmul instead of failing with "Vector dimensions don't match".
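The contract I have in mind could look roughly like the sketch below. The function is hypothetical and backed by NumPy purely to show the expected shape checks and result; the name `matmul` and the error messages are assumptions, not an existing NumKong signature.

```python
import numpy as np


def matmul(a, b):
    """Hypothetical contract: 2D (m, k) times 2D (k, n) gives 2D (m, n)."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.ndim != 2 or b.ndim != 2:
        raise ValueError(f"expected two 2D arrays, got {a.ndim}D and {b.ndim}D")
    if a.shape[1] != b.shape[0]:
        raise ValueError(f"inner dimensions differ: {a.shape} vs {b.shape}")
    return a @ b  # NumPy stands in for the requested kernel


small = matmul(np.ones((2, 2)), np.ones((2, 3)))       # small x small
large = matmul(np.ones((1000, 3)), np.ones((3, 2)))    # large x small
```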

Why not only dots_packed?

dots_packed(A, dots_pack(B)) gives A @ B.T. To get A @ B we have to pack B.T and call dots_packed(A, dots_pack(B.T)), which is awkward. A direct matmul(A, B) would match NumPy/OpenCV usage and our existing cv2.gemm and optical_density @ stain_matrix patterns.
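The transpose workaround can be illustrated with NumPy stand-ins. The two helper functions below only mimic the A @ B.T behavior described above; they are not the real dots_pack / dots_packed, whose signatures and packed layout I am not assuming here.

```python
import numpy as np


def pack_standin(b):
    """Stand-in for packing: just makes a contiguous copy."""
    return np.ascontiguousarray(b)


def packed_dots_standin(a, packed_b):
    """Stand-in for the packed kernel: computes a @ packed_b.T."""
    return a @ packed_b.T


def matmul_via_packed(a, b):
    """Get a @ b from an A @ B.T kernel by packing the transpose of b."""
    return packed_dots_standin(a, pack_standin(b.T))


a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)
result = matmul_via_packed(a, b)
```

The extra transpose is what a direct matmul(A, B) would hide from callers.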

Can you contribute to the implementation?

  • I can contribute

Is your feature request specific to a certain interface?

It applies to everything

Contact Details

No response

Is there an existing issue for this?

  • I have searched the existing issues

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels: enhancement (New feature or request)