Roadmap
We are not content with merely publishing a paper; our goal is to drive this project into production. Here is a list of our pending tasks.
Integration
- SGLang support (@DerekHJH is working on it)
- LMCache support
Model Support
- Qwen (@DerekHJH is working on it)
- ...
User Interface
- Support chat-template-only input and tokenize inside COMB (@shijuzhao is working on it)
- ...
Feature
- Batching (@shijuzhao is working on it)
- Support chunk token and cross-attention mask
- ...
Hardware Support
- AMD ROCm
- ...
Distributed
- Encoder-decoder disaggregation
- Tensor parallelism (TP) and pipeline parallelism (PP)
Optimization
- Enable CUDA graph through custom operator (@shijuzhao is working on it)
- Separated PIC process
- ...