@NeoZhangJianyu (Collaborator) commented Dec 6, 2025

Fix issue: #17643

  • Support loading gpt-oss fully into VRAM by implementing these ops:
    • add_id
    • mul_mat for mxfp4
    • swiglu_oai
  • Fix warning
  • Raise the mul_mat unit test (UT) pass rate: 100% of the UT cases now pass.
  • Update ops.md
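
For readers unfamiliar with two of the ops listed above, here is a minimal CPU reference sketch of what `swiglu_oai` and `add_id` compute, as I understand them from the gpt-oss reference model and the ggml op descriptions. It is not the SYCL kernel code from this PR. The function names `swiglu_oai_ref` and `add_id_ref`, the flat row-major layout, and the 2D simplification of `add_id` are assumptions made for illustration only. The mxfp4 `mul_mat` path is not covered here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helpers for illustration; NOT the kernels from this PR.

// swiglu_oai (assumed semantics): clamp the gate input from above, clamp the
// linear input to [-limit, limit], then compute x * sigmoid(alpha * x) * (y + 1).
inline float swiglu_oai_ref(float x, float y, float alpha, float limit) {
    x = std::min(x, limit);
    y = std::max(-limit, std::min(y, limit));
    const float gated = x / (1.0f + std::exp(-alpha * x));
    return gated * (y + 1.0f);
}

// add_id (assumed semantics, simplified to 2D): for each row r of `a`, add the
// row of `bias` selected by ids[r]. Both matrices are row-major with n_cols columns.
inline std::vector<float> add_id_ref(const std::vector<float>   & a,
                                     const std::vector<float>   & bias,
                                     const std::vector<int32_t> & ids,
                                     std::size_t                  n_cols) {
    std::vector<float> out(a.size());
    for (std::size_t r = 0; r < ids.size(); ++r) {
        const std::size_t b = static_cast<std::size_t>(ids[r]);
        for (std::size_t c = 0; c < n_cols; ++c) {
            out[r * n_cols + c] = a[r * n_cols + c] + bias[b * n_cols + c];
        }
    }
    return out;
}
```

A reference like this can be handy for spot-checking a GPU kernel's output on small inputs. The `alpha` and `limit` values are op parameters supplied by the model graph, so they are passed in rather than hard-coded.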

Known issue:
The performance of gpt-oss-20B-Q8_0 is lower than when the three ops above are handled on the CPU.
The three ops still need to be optimized on the GPU.

@github-actions bot added the documentation, ggml, and SYCL labels Dec 6, 2025
@NeoZhangJianyu requested review from qnixsynapse and removed request for ggerganov December 10, 2025 03:12
@qnixsynapse (Collaborator)

@NeoZhangJianyu I don't have write access to this repo anymore, so my review would be useless. But I still gave an approval.

@ggerganov (Member)

> @NeoZhangJianyu I don't have write access to this repo anymore, so my review would be useless.

Reviews are still useful even without write access.
