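The DeepSparkInference inference model library is a core project of the DeepSpark open-source community and was officially open-sourced in March 2024. The first release selected 48 inference model examples covering computer vision, natural language processing, speech recognition, and other fields; more AI domains will be added over time.
Models in DeepSparkInference come with inference examples and guidance documents for running on the domestically developed inference engines IGIE or ixRT, and some models include evaluation results on the domestically developed general-purpose GPU Zhikai 100 (智铠100).
IGIE (Iluvatar GPU Inference Engine) is a high-performance, highly general, end-to-end AI inference engine built on the TVM framework. It supports multi-framework model import, quantization, graph optimization, multiple operator libraries, multiple backends, and automatic operator tuning, giving inference workloads a complete solution that is easy to deploy, high in throughput, and low in latency.
ixRT (Iluvatar CoreX RunTime) is a high-performance inference engine developed in-house by Iluvatar CoreX (天数智芯). It focuses on getting the most performance out of Iluvatar CoreX general-purpose GPUs and on delivering high-performance inference for models across domains. ixRT supports dynamic-shape inference, plugins, and INT8/FP16 inference.
DeepSparkInference is updated quarterly. Later releases will gradually add more model categories and expand large-model inference.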
| Model | Engine | Supported | IXUCA SDK |
|---|---|---|---|
| Baichuan2-7B | vLLM | ✅ | 4.3.0 |
| ChatGLM-3-6B | vLLM | ✅ | 4.3.0 |
| ChatGLM-3-6B-32K | vLLM | ✅ | 4.3.0 |
| CosyVoice2-0.5B | PyTorch | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Llama-8B | vLLM | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Llama-70B | vLLM | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Qwen-1.5B | vLLM | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Qwen-7B | vLLM | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Qwen-14B | vLLM | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Qwen-32B | vLLM | ✅ | 4.3.0 |
| ERNIE-4.5-21B-A3B | FastDeploy | ✅ | 4.3.0 |
| ERNIE-4.5-300B-A47B | FastDeploy | ✅ | 4.3.0 |
| GLM-4V | vLLM | ✅ | 4.3.0 |
| InternLM3 | LMDeploy | ✅ | 4.3.0 |
| Llama2-7B | vLLM | ✅ | 4.3.0 |
| Llama2-7B | TRT-LLM | ✅ | 4.3.0 |
| Llama2-13B | TRT-LLM | ✅ | 4.3.0 |
| Llama2-70B | TRT-LLM | ✅ | 4.3.0 |
| Llama3-70B | vLLM | ✅ | 4.3.0 |
| E5-V | vLLM | ✅ | 4.3.0 |
| MiniCPM-o | vLLM | ✅ | 4.3.0 |
| MiniCPM-V | vLLM | ✅ | 4.3.0 |
| Qwen-7B | vLLM | ✅ | 4.3.0 |
| Qwen-VL | vLLM | ✅ | 4.3.0 |
| Qwen2-VL | vLLM | ✅ | 4.3.0 |
| Qwen2.5-VL | vLLM | ✅ | 4.3.0 |
| Qwen1.5-7B | vLLM | ✅ | 4.3.0 |
| Qwen1.5-7B | TGI | ✅ | 4.3.0 |
| Qwen1.5-14B | vLLM | ✅ | 4.3.0 |
| Qwen1.5-32B Chat | vLLM | ✅ | 4.3.0 |
| Qwen1.5-72B | vLLM | ✅ | 4.3.0 |
| Qwen2-7B Instruct | vLLM | ✅ | 4.3.0 |
| Qwen2-72B Instruct | vLLM | ✅ | 4.3.0 |
| StableLM2-1.6B | vLLM | ✅ | 4.3.0 |
| Whisper | vLLM | ✅ | 4.3.0 |
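
Some models in the table above are listed with more than one engine (for example, Llama2-7B with both vLLM and TRT-LLM, and Qwen1.5-7B with both vLLM and TGI). The following is a minimal sketch of how such a table could be grouped by model in a script. The helper name `engines_by_model` and the `TABLE_EXCERPT` string are illustrative assumptions and are not part of DeepSparkInference; the rows are copied from the table above.

```python
from collections import defaultdict

# A small excerpt copied from the table above (not the full list).
TABLE_EXCERPT = """\
| Model | Engine | Supported | IXUCA SDK |
|---|---|---|---|
| Llama2-7B | vLLM | ✅ | 4.3.0 |
| Llama2-7B | TRT-LLM | ✅ | 4.3.0 |
| Qwen1.5-7B | vLLM | ✅ | 4.3.0 |
| Qwen1.5-7B | TGI | ✅ | 4.3.0 |
| InternLM3 | LMDeploy | ✅ | 4.3.0 |
"""


def engines_by_model(markdown_table):
    """Map each model name to the engines marked as supported (hypothetical helper)."""
    result = defaultdict(list)
    lines = [ln.strip() for ln in markdown_table.splitlines() if ln.strip()]
    for line in lines[2:]:  # skip the header row and the |---| separator row
        cells = [c.strip() for c in line.strip("|").split("|")]
        model, engine, supported, _sdk = cells
        if supported == "✅":
            result[model].append(engine)
    return dict(result)


if __name__ == "__main__":
    for model, engines in engines_by_model(TABLE_EXCERPT).items():
        print(f"{model}: {', '.join(engines)}")
```

The sketch assumes every data row has exactly four cells, which holds for this table but not for the precision tables that follow.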
| Model | Prec. | IGIE | ixRT | IXUCA SDK |
|---|---|---|---|---|
| AlexNet | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| CLIP | FP16 | ✅ | ✅ | 4.3.0 |
| Conformer-B | FP16 | ✅ | 4.3.0 | |
| ConvNeXt-Base | FP16 | ✅ | ✅ | 4.3.0 |
| ConvNext-S | FP16 | ✅ | 4.3.0 | |
| ConvNeXt-Small | FP16 | ✅ | ✅ | 4.3.0 |
| ConvNeXt-Tiny | FP16 | ✅ | 4.3.0 | |
| CSPDarkNet53 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| CSPResNet50 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| CSPResNeXt50 | FP16 | ✅ | ✅ | 4.3.0 |
| DeiT-tiny | FP16 | ✅ | ✅ | 4.3.0 |
| DenseNet121 | FP16 | ✅ | ✅ | 4.3.0 |
| DenseNet161 | FP16 | ✅ | ✅ | 4.3.0 |
| DenseNet169 | FP16 | ✅ | ✅ | 4.3.0 |
| DenseNet201 | FP16 | ✅ | ✅ | 4.3.0 |
| EfficientNet-B0 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| EfficientNet-B1 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| EfficientNet-B2 | FP16 | ✅ | ✅ | 4.3.0 |
| EfficientNet-B3 | FP16 | ✅ | ✅ | 4.3.0 |
| EfficientNet-B4 | FP16 | ✅ | ✅ | 4.3.0 |
| EfficientNet-B5 | FP16 | ✅ | ✅ | 4.3.0 |
| EfficientNet-B6 | FP16 | ✅ | 4.3.0 | |
| EfficientNetV2 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| EfficientNetv2_rw_t | FP16 | ✅ | ✅ | 4.3.0 |
| EfficientNetv2_s | FP16 | ✅ | ✅ | 4.3.0 |
| GoogLeNet | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| HRNet-W18 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| InceptionV3 | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| Inception-ResNet-V2 | FP16 | ✅ | 4.3.0 | |
| INT8 | ✅ | 4.3.0 | ||
| Mixer_B | FP16 | ✅ | 4.3.0 | |
| MNASNet0_5 | FP16 | ✅ | 4.3.0 | |
| MNASNet0_75 | FP16 | ✅ | 4.3.0 | |
| MNASNet1_0 | FP16 | ✅ | 4.3.0 | |
| MNASNet1_3 | FP16 | ✅ | 4.3.0 | |
| MobileNetV2 | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| MobileNetV3_Large | FP16 | ✅ | 4.3.0 | |
| MobileNetV3_Small | FP16 | ✅ | ✅ | 4.3.0 |
| MViTv2_base | FP16 | ✅ | 4.2.0 | |
| RegNet_x_16gf | FP16 | ✅ | 4.3.0 | |
| RegNet_x_1_6gf | FP16 | ✅ | 4.3.0 | |
| RegNet_x_3_2gf | FP16 | ✅ | 4.3.0 | |
| RegNet_x_32gf | FP16 | ✅ | 4.3.0 | |
| RegNet_x_400mf | FP16 | ✅ | 4.3.0 | |
| RegNet_y_1_6gf | FP16 | ✅ | 4.3.0 | |
| RegNet_y_16gf | FP16 | ✅ | 4.3.0 | |
| RegNet_y_3_2gf | FP16 | ✅ | 4.3.0 | |
| RegNet_y_32gf | FP16 | ✅ | 4.3.0 | |
| RegNet_y_400mf | FP16 | ✅ | 4.3.0 | |
| RepVGG | FP16 | ✅ | ✅ | 4.3.0 |
| Res2Net50 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| ResNeSt50 | FP16 | ✅ | 4.3.0 | |
| ResNet101 | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| ResNet152 | FP16 | ✅ | 4.3.0 | |
| INT8 | ✅ | 4.3.0 | ||
| ResNet18 | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| ResNet34 | FP16 | ✅ | 4.3.0 | |
| INT8 | ✅ | 4.3.0 | ||
| ResNet50 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| ResNetV1D50 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| ResNeXt50_32x4d | FP16 | ✅ | ✅ | 4.3.0 |
| ResNeXt101_64x4d | FP16 | ✅ | ✅ | 4.3.0 |
| ResNeXt101_32x8d | FP16 | ✅ | ✅ | 4.3.0 |
| SEResNet50 | FP16 | ✅ | 4.3.0 | |
| ShuffleNetV1 | FP16 | ✅ | 4.3.0 | |
| ShuffleNetV2_x0_5 | FP16 | ✅ | ✅ | 4.3.0 |
| ShuffleNetV2_x1_0 | FP16 | ✅ | ✅ | 4.3.0 |
| ShuffleNetV2_x1_5 | FP16 | ✅ | ✅ | 4.3.0 |
| ShuffleNetV2_x2_0 | FP16 | ✅ | ✅ | 4.3.0 |
| SqueezeNet 1.0 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| SqueezeNet 1.1 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| SVT Base | FP16 | ✅ | 4.3.0 | |
| Swin Transformer | FP16 | ✅ | 4.3.0 | |
| Swin Transformer Large | FP16 | ✅ | 4.3.0 | |
| Twins_PCPVT | FP16 | ✅ | 4.3.0 | |
| VAN_B0 | FP16 | ✅ | 4.3.0 | |
| VGG11 | FP16 | ✅ | 4.3.0 | |
| VGG13 | FP16 | ✅ | 4.3.0 | |
| VGG13_BN | FP16 | ✅ | 4.3.0 | |
| VGG16 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| VGG19 | FP16 | ✅ | 4.3.0 | |
| VGG19_BN | FP16 | ✅ | 4.3.0 | |
| ViT | FP16 | ✅ | 4.3.0 | |
| Wide ResNet50 | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| Wide ResNet101 | FP16 | ✅ | 4.3.0 |
| Model | Prec. | IGIE | ixRT | IXUCA SDK |
|---|---|---|---|---|
| ATSS | FP16 | ✅ | ✅ | 4.3.0 |
| CenterNet | FP16 | ✅ | ✅ | 4.3.0 |
| DETR | FP16 | ✅ | 4.3.0 | |
| FCOS | FP16 | ✅ | ✅ | 4.3.0 |
| FoveaBox | FP16 | ✅ | ✅ | 4.3.0 |
| FSAF | FP16 | ✅ | ✅ | 4.3.0 |
| GFL | FP16 | ✅ | 4.3.0 | |
| HRNet | FP16 | ✅ | ✅ | 4.3.0 |
| PAA | FP16 | ✅ | ✅ | 4.3.0 |
| RetinaFace | FP16 | ✅ | ✅ | 4.3.0 |
| RetinaNet | FP16 | ✅ | ✅ | 4.3.0 |
| RTMDet | FP16 | ✅ | 4.3.0 | |
| SABL | FP16 | ✅ | 4.3.0 | |
| SSD | FP16 | ✅ | 4.3.0 | |
| YOLOF | FP16 | ✅ | 4.3.0 | |
| YOLOv3 | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| YOLOv4 | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| YOLOv5 | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| YOLOv5s | FP16 | ✅ | 4.3.0 | |
| INT8 | ✅ | 4.3.0 | ||
| YOLOv6 | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| YOLOv7 | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| YOLOv8 | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| YOLOv9 | FP16 | ✅ | ✅ | 4.3.0 |
| YOLOv10 | FP16 | ✅ | ✅ | 4.3.0 |
| YOLOv11 | FP16 | ✅ | ✅ | 4.3.0 |
| YOLOv12 | FP16 | ✅ | 4.3.0 | |
| YOLOv13 | FP16 | ✅ | 4.3.0 | |
| YOLOX | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| Model | Prec. | IGIE | ixRT | IXUCA SDK |
|---|---|---|---|---|
| FaceNet | FP16 | ✅ | 4.3.0 | |
| INT8 | ✅ | 4.3.0 |
| Model | Prec. | IGIE | IXUCA SDK |
|---|---|---|---|
| Kie_layoutXLM | FP16 | ✅ | 4.3.0 |
| SVTR | FP16 | ✅ | 4.3.0 |
| Model | Prec. | IGIE | ixRT | IXUCA SDK |
|---|---|---|---|---|
| HRNetPose | FP16 | ✅ | 4.3.0 | |
| Lightweight OpenPose | FP16 | ✅ | 4.3.0 | |
| RTMPose | FP16 | ✅ | ✅ | 4.3.0 |
| Model | Prec. | IGIE | ixRT | IXUCA SDK |
|---|---|---|---|---|
| Mask R-CNN | FP16 | ✅ | 4.2.0 | |
| SOLOv1 | FP16 | ✅ | 4.3.0 |
| Model | Prec. | IGIE | ixRT | IXUCA SDK |
|---|---|---|---|---|
| UNet | FP16 | ✅ | 4.3.0 |
| Model | Prec. | IGIE | ixRT | IXUCA SDK |
|---|---|---|---|---|
| FastReID | FP16 | ✅ | 4.3.0 | |
| DeepSort | FP16 | ✅ | 4.3.0 | |
| INT8 | ✅ | 4.3.0 | ||
| RepNet-Vehicle-ReID | FP16 | ✅ | 4.3.0 |
| Model | vLLM | IxFormer | IXUCA SDK |
|---|---|---|---|
| Aria | ✅ | 4.3.0 | |
| Chameleon-7B | ✅ | 4.3.0 | |
| CLIP | ✅ | 4.3.0 | |
| Fuyu-8B | ✅ | 4.3.0 | |
| H2OVL Mississippi | ✅ | 4.3.0 | |
| Idefics3 | ✅ | 4.3.0 | |
| InternVL2-4B | ✅ | 4.3.0 | |
| LLaVA | ✅ | 4.3.0 | |
| LLaVA-Next-Video-7B | ✅ | 4.3.0 | |
| Llama-3.2 | ✅ | 4.3.0 | |
| MiniCPM-V 2 | ✅ | 4.3.0 | |
| Pixtral | ✅ | 4.3.0 |
| Model | Prec. | IGIE | ixRT | IXUCA SDK |
|---|---|---|---|---|
| ALBERT | FP16 | ✅ | 4.3.0 | |
| BERT Base NER | INT8 | ✅ | 4.3.0 | |
| BERT Base SQuAD | FP16 | ✅ | ✅ | 4.3.0 |
| INT8 | ✅ | 4.3.0 | ||
| BERT Large SQuAD | FP16 | ✅ | ✅ | 4.3.0 |
| | INT8 | ✅ | ✅ | 4.3.0 |
| DeBERTa | FP16 | ✅ | 4.3.0 | |
| RoBERTa | FP16 | ✅ | 4.3.0 | |
| RoFormer | FP16 | ✅ | 4.3.0 | |
| VideoBERT | FP16 | ✅ | 4.2.0 |
| Model | Prec. | IGIE | ixRT | IXUCA SDK |
|---|---|---|---|---|
| Conformer | FP16 | ✅ | ✅ | 4.3.0 |
| Transformer ASR | FP16 | ✅ | 4.2.0 |
| Model | Prec. | IGIE | ixRT | IXUCA SDK |
|---|---|---|---|---|
| Wide & Deep | FP16 | ✅ | 4.3.0 |
| Docker Installer | IXUCA SDK | Introduction |
|---|---|---|
| corex-docker-installer-4.3.0-*-py3.10-x86_64.run | 4.3.0 | For small-model inference |
| corex-docker-installer-4.3.0-*-llm-py3.10-x86_64.run | 4.3.0 | For large-model inference |
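
The two installer name patterns above overlap: the `*` in the small-model pattern also matches a name that contains `-llm`, so a script that tells the installers apart has to test the large-model pattern first. The following is a minimal sketch of that check. The function name `classify_installer` and the build tag `10.2` in the usage lines are hypothetical; the document does not state what the `*` segment contains.

```python
from fnmatch import fnmatchcase

# Name patterns copied from the table above.
LLM_PATTERN = "corex-docker-installer-4.3.0-*-llm-py3.10-x86_64.run"
SMALL_PATTERN = "corex-docker-installer-4.3.0-*-py3.10-x86_64.run"


def classify_installer(filename):
    """Return "llm", "small", or None for an installer file name (hypothetical helper)."""
    # Check the LLM pattern first: the small-model pattern's "*" would
    # otherwise also match names that contain "-llm".
    if fnmatchcase(filename, LLM_PATTERN):
        return "llm"
    if fnmatchcase(filename, SMALL_PATTERN):
        return "small"
    return None


if __name__ == "__main__":
    # "10.2" is a made-up build tag used only for illustration.
    print(classify_installer("corex-docker-installer-4.3.0-10.2-llm-py3.10-x86_64.run"))
    print(classify_installer("corex-docker-installer-4.3.0-10.2-py3.10-x86_64.run"))
```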
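Please refer to the DeepSpark Code of Conduct on Gitee or on GitHub.
Please refer to the DeepSparkInference Contributing Guidelines.
DeepSparkInference only provides download and preprocessing scripts for public datasets. These datasets do not belong to DeepSparkInference, and DeepSparkInference is not responsible for their quality or maintenance. Please make sure you have permission to use these datasets; models trained on them may be used only for non-commercial research and education.
To dataset owners:
If you do not want your dataset to be published on DeepSparkInference, or you want to update a dataset in DeepSparkInference that belongs to you, please submit an issue on Gitee or GitHub, and we will remove or update it as requested. Thank you sincerely for your support of and contributions to our community.
This project is licensed under Apache-2.0.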