[Feature Request] MLADecodeWithKVCacheOp implementation #113

@xZacky

Parent Issue

Part of #111

Task Type

  • L2: Op Implementation (Wrapper + Unit Tests + Benchmarks)

Description

Checklist

  • Implementation follows Google Python Style for code and docstrings.
  • (L2 Only) Unit tests match PyTorch reference (FP16/BF16).
  • (L2 Only) Benchmarks implemented (Latency/TFLOPS/Bandwidth).
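
For the benchmark item, one possible way to derive the three reported metrics from a single measured latency is sketched below. This is only an illustration: `BenchmarkResult` and `summarize` are hypothetical names, not part of this repository, and the FLOP and byte counts are assumed to be supplied by the caller, since the issue does not define how they should be counted for this op.

```python
"""Hypothetical helpers that turn a measured latency into benchmark metrics."""

import dataclasses


@dataclasses.dataclass(frozen=True)
class BenchmarkResult:
    """Derived metrics for one benchmark configuration.

    Attributes:
        latency_ms: Mean latency per call, in milliseconds.
        tflops: Achieved throughput, in TFLOPS (1e12 FLOP/s).
        bandwidth_gbps: Achieved memory bandwidth, in GB/s (1e9 bytes/s).
    """

    latency_ms: float
    tflops: float
    bandwidth_gbps: float


def summarize(latency_ms: float, total_flops: float,
              total_bytes: float) -> BenchmarkResult:
    """Converts a latency measurement into throughput and bandwidth.

    Args:
        latency_ms: Measured mean latency per call, in milliseconds.
        total_flops: Floating-point operations per call (caller's estimate).
        total_bytes: Bytes read and written per call (caller's estimate).

    Returns:
        A BenchmarkResult holding latency, TFLOPS, and GB/s.

    Raises:
        ValueError: If latency_ms is not positive.
    """
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    seconds = latency_ms / 1e3
    return BenchmarkResult(
        latency_ms=latency_ms,
        tflops=total_flops / seconds / 1e12,
        bandwidth_gbps=total_bytes / seconds / 1e9,
    )
```

As a usage example, `summarize(2.0, 4e9, 1e9)` corresponds to roughly 2 TFLOPS and 500 GB/s. Keeping the conversion separate from the timing loop would let the same helper be reused across FP16 and BF16 runs.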
