[CB]: allow int8 KV cache precision for CPU #1552

Open
wants to merge 1 commit into master

Conversation

ilya-lavrenov
Contributor

No description provided.

@ilya-lavrenov ilya-lavrenov added this to the 2025.0 milestone Jan 15, 2025
@github-actions github-actions bot added the category: continuous batching, category: LLM, and no-match-files labels Jan 15, 2025
@ilya-lavrenov ilya-lavrenov force-pushed the cb-by-default-int8-respect-ir branch from 9f5e415 to 6586dc3 January 15, 2025 10:28
@github-actions github-actions bot added the category: speculative decoding label Jan 15, 2025
@ilya-lavrenov ilya-lavrenov force-pushed the cb-by-default-int8-respect-ir branch from 6586dc3 to 05f79a8 January 15, 2025 10:53
@ilya-lavrenov ilya-lavrenov self-assigned this Jan 16, 2025
@andrei-kochin andrei-kochin marked this pull request as ready for review January 20, 2025 13:38
@ilya-lavrenov ilya-lavrenov force-pushed the cb-by-default-int8-respect-ir branch 2 times, most recently from ae7707a to 734c511 January 22, 2025 07:32
@ilya-lavrenov ilya-lavrenov force-pushed the cb-by-default-int8-respect-ir branch 4 times, most recently from 8654d1f to f4a02cb January 22, 2025 08:03
@andrei-kochin andrei-kochin modified the milestones: 2025.0, 2025.1 Jan 27, 2025
@ilya-lavrenov ilya-lavrenov force-pushed the cb-by-default-int8-respect-ir branch from 788de0f to 7d37f5e January 30, 2025 14:08
@ilya-lavrenov ilya-lavrenov force-pushed the cb-by-default-int8-respect-ir branch from 7b19c65 to 379f302 February 6, 2025 09:07
@ilya-lavrenov ilya-lavrenov changed the title from "Cb by default int8 respect ir" to "[CB]: allow int8 KV cache precision for CPU" Feb 6, 2025
@github-actions github-actions bot removed the category: LLM label Feb 6, 2025
2 participants