[Survey] Towards Efficient Large Language Model Serving: A Survey on System-Aware KV Cache Optimization
Updated Dec 2, 2025 - Python
ClearML - Model-Serving Orchestration and Repository Solution
SecretFlow-Serving is a serving system for privacy-preserving machine learning models.
MoDM is a cache-aware, hybrid serving system that accelerates image generation by dynamically combining small and large diffusion models for efficient, high-quality output.
An implementation of ML model serving with Flask; the model is an LGBM model trained on the Kaggle Titanic data.
An async ML service built with FastAPI, Celery, RabbitMQ, and Redis for efficient, scalable ML model serving
This repository contains practice exercises for FastAPI, covering core components such as the Uvicorn server, HTTP requests, and other related features. The primary focus is on using FastAPI for machine learning applications.
Predicting telemarketing outcomes
Dhruva is a full-fledged DPG platform for serving AI models at scale.
A simple web application developed with Streamlit for serving a machine learning model
Machine Learning (MLeap) Model Serving application for Scala