ggml quantization in thunder v0 #754

Merged
merged 9 commits into from
Jul 19, 2024


t-vi (Collaborator) commented Jul 11, 2024

This adds a notebook demoing a ggml quantization transform. There are no specialized kernels yet, but the transform already helps with things like reducing memory consumption.

The top part is just the generation demo from litgpt itself.

Note that I needed to add an eval hack for the "meta" device in the same-tensor check in order to be able to trace through the model with meta weights.
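
For readers unfamiliar with why a ggml-style transform saves memory even without specialized kernels, here is a minimal sketch of blockwise 8-bit quantization. It is an illustration only: the description above does not say which ggml format the notebook uses, so the Q8_0-style layout (blocks of 32 weights, one scale per block) is an assumption, and the names `quantize_q8_0`, `dequantize_q8_0`, and `q8_0_nbytes` are hypothetical helpers, not functions from thunder or ggml.

```python
BLOCK_SIZE = 32  # assumption: Q8_0-style blocks of 32 weights


def quantize_q8_0(values):
    """Quantize a flat list of floats into (scale, int8 quants) blocks."""
    if len(values) % BLOCK_SIZE != 0:
        raise ValueError("length must be a multiple of BLOCK_SIZE")
    blocks = []
    for start in range(0, len(values), BLOCK_SIZE):
        block = values[start:start + BLOCK_SIZE]
        amax = max(abs(v) for v in block)
        # One scale per block maps the largest magnitude to +/-127.
        scale = amax / 127.0
        if scale == 0.0:
            quants = [0] * BLOCK_SIZE  # all-zero block
        else:
            quants = [round(v / scale) for v in block]
        blocks.append((scale, quants))
    return blocks


def dequantize_q8_0(blocks):
    """Reconstruct approximate floats from (scale, quants) blocks."""
    out = []
    for scale, quants in blocks:
        out.extend(scale * q for q in quants)
    return out


def q8_0_nbytes(num_weights):
    """Storage if each block holds a 2-byte fp16 scale plus 32 int8 values."""
    return (num_weights // BLOCK_SIZE) * (2 + BLOCK_SIZE)


if __name__ == "__main__":
    weights = [((i * 37) % 101 - 50) / 50 for i in range(64)]
    restored = dequantize_q8_0(quantize_q8_0(weights))
    print("max abs error:", max(abs(a - b) for a, b in zip(weights, restored)))
    print("fp32 bytes:", 4 * 1024, "quantized bytes:", q8_0_nbytes(1024))
```

Under these assumptions, 1024 weights take 4096 bytes in fp32 and 1088 bytes quantized. Unlike a real format, this sketch keeps the scale as a Python float rather than fp16, and without specialized kernels the weights would be dequantized back to floats before each matmul.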

@t-vi t-vi requested review from mruberry and lantiga as code owners July 11, 2024 05:14

lantiga (Collaborator) commented Jul 19, 2024

@t-vi tests pass now, should we? : )

@t-vi t-vi merged commit b1cf1bf into main Jul 19, 2024
36 checks passed
@t-vi t-vi deleted the tom/ggmlquant branch July 19, 2024 13:06