Feature: MXFP4 support, NVFP4, and block-scaling #298

@ashvardanian

Description

Describe what you are looking for

The library already has emerging support for sub-byte types, such as 4-bit integers. Extending it to MXFP4 is fairly trivial, but it also increases the library's surface area. Moreover, most LLM use cases prefer NVFP4 and leverage “block scaling”. I believe block scaling is a fundamentally flawed design that introduces a destructive bias into the model for short-term gains. This issue is the right place to discuss related concerns.
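For readers new to the term, here is a minimal Python sketch of block scaling in the MXFP4 style: each block of 32 values shares one power-of-two scale, and each element is rounded to the nearest E2M1 (FP4) magnitude. The function names `quantize_block` and `dequantize_block` are hypothetical and not part of this library, the tie-breaking rule is a simplification, and NVFP4 differs in using 16-element blocks with an FP8 (E4M3) scale. The outlier example at the end shows one way a shared scale can distort a block; treating that as the bias discussed above is an assumption of this sketch.

```python
import math

# Magnitudes representable by an E2M1 element (1 sign, 2 exponent, 1 mantissa bit).
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1_MAX_EXPONENT = 2  # exponent of the largest E2M1 binade (4.0 and 6.0)
BLOCK_SIZE = 32        # MXFP4 block size; NVFP4 uses 16 (not modelled here)


def quantize_block(block):
    """Hypothetical helper: return (shared_exponent, quantized_values) for one block."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0.0] * len(block)
    # Shared power-of-two scale, derived from the largest magnitude in the block.
    shared_exponent = math.floor(math.log2(max_abs)) - E2M1_MAX_EXPONENT
    scale = 2.0 ** shared_exponent
    quantized = []
    for x in block:
        scaled = abs(x) / scale
        # Nearest representable magnitude; values above 6.0 saturate to 6.0.
        # Ties go to the smaller magnitude, which is a simplification.
        nearest = min(E2M1_MAGNITUDES, key=lambda m: abs(m - scaled))
        quantized.append(math.copysign(nearest, x))
    return shared_exponent, quantized


def dequantize_block(shared_exponent, quantized):
    """Hypothetical helper: undo the shared scale."""
    scale = 2.0 ** shared_exponent
    return [q * scale for q in quantized]


# One outlier sets the scale for the whole block, so the small values vanish.
block = [8.0] + [0.1] * (BLOCK_SIZE - 1)
exponent, codes = quantize_block(block)
restored = dequantize_block(exponent, codes)
print(exponent, restored[:3])
```

In this example the outlier `8.0` forces a scale of `2.0`, so every `0.1` rounds to zero after scaling, while the outlier itself is reproduced exactly.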

Can you contribute to the implementation?

  • I can contribute

Is your feature request specific to a certain interface?

It applies to everything

Contact Details

No response

Is there an existing issue for this?

  • I have searched the existing issues

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Assignees

No one assigned

Labels

enhancement: New feature or request
