# FastMindAPI


An easy-to-use, high-performance backend for serving LLMs and other AI models, built on FastAPI.

## 🚀 Quick Start

### Install

```shell
pip install fastmindapi
```

### Use

Run the server:

```shell
# in Shell
fastmindapi-server --port 8000
```

```python
# in Python
import fastmindapi as FM

server = FM.Server()
server.run()
```
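For scripted setups (e.g. tests), the documented CLI can also be launched from Python. This is a minimal sketch using only the standard library and the `fastmindapi-server` command shown above; the port and the sleep-based wait are just examples:

```python
# Minimal sketch: start the documented CLI from Python, then shut it down.
# Assumes `fastmindapi-server` is on PATH after `pip install fastmindapi`.
import subprocess
import time

proc = subprocess.Popen(["fastmindapi-server", "--port", "8000"])
try:
    time.sleep(3)  # crude readiness wait; poll an endpoint in real code
    # ... interact with the server here (see the client / HTTP examples below)
finally:
    proc.terminate()
    proc.wait()
```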

Access via client / HTTP requests:

```shell
curl http://IP:PORT/docs#/
```

```python
import fastmindapi as FM

client = FM.Client(IP="x.x.x.x", PORT=xxx)  # defaults to 127.0.0.1:8000

client.add_model_info_list(model_info_list)
client.load_model(model_name)
client.generate(model_name, generation_request)
```
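The snippet above leaves `model_info_list`, `model_name`, and `generation_request` undefined; their exact schemas are defined by the server, so the values below are purely illustrative placeholders, not the library's actual field names. Check the server's `/docs` page for the real request schemas.

```python
# Hypothetical shapes for the placeholders above -- every field name here is
# illustrative only; consult the running server's /docs page for the schemas.
model_info_list = [
    {
        "model_name": "my-llm",                     # assumed identifier field
        "model_path": "/path/to/local/checkpoint",  # assumed local path field
        "backend": "transformers",                  # assumed backend selector
    }
]

model_name = "my-llm"

generation_request = {
    "input_text": "Hello, world!",  # assumed prompt field
    "max_new_tokens": 64,           # assumed sampling parameter
}
```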

🪧 We primarily maintain the backend server; the client is provided for reference only, and the intended usage is to send HTTP requests to the server directly. (We may release FM-GUI in the future.)
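Because the server is built on FastAPI, it exposes interactive docs at `/docs` and a machine-readable schema at `/openapi.json`, so the available routes can be discovered over plain HTTP. A minimal sketch, assuming the server is running on the default host and port:

```python
# Minimal sketch: list the server's routes via FastAPI's standard
# /openapi.json endpoint. Host and port are assumed defaults (127.0.0.1:8000).
import requests

schema = requests.get("http://127.0.0.1:8000/openapi.json").json()
for path, methods in schema["paths"].items():
    print(path, sorted(methods))
```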

## ✨ Features

**Model**: Supports models with various backends

**Modules**: More than just chatting with models

- Function Calling (extra tools in Python)
- Retrieval
- Agent
- ...

**Flexibility**: Easy to Use & Highly Customizable

- Load models at coding time or at runtime
- Add any APIs you want (see the sketch below)
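As an illustration of the last point, the sketch below assumes the server exposes its underlying FastAPI application; the `server.app` attribute is a hypothetical name, not a documented API. The route registration itself follows standard FastAPI usage:

```python
# Sketch only: `server.app` is a hypothetical attribute standing in for
# however FastMindAPI exposes its underlying FastAPI application.
import fastmindapi as FM

server = FM.Server()

@server.app.get("/healthz")  # standard FastAPI route decorator
def healthz():
    return {"status": "ok"}

server.run()
```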