
LLaSA WebUI

A simple web interface for LLaSA using ExLlamaV2, with an OpenAI-compatible FastAPI server.

Installation

Clone the repo:

git clone https://github.com/zuellni/llasa-webui
cd llasa-webui

Create a conda/mamba/python env:

conda create -n llasa-webui python=3.12
conda activate llasa-webui

Install the dependencies, ignoring any xcodec2 dependency errors:

pip install -r requirements.txt
pip install xcodec2 --no-deps

If you want to use torch+cu126, keep in mind that you'll need to compile exllamav2 and, optionally, flash-attn; for python=3.13 you may also need to compile sentencepiece.

Usage

python server.py --model <path or repo id>

You can use the HF models or EXL2 quants from here. Add --cache q4 --dtype bf16 to reduce VRAM usage.
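Since the server is OpenAI-compatible, any OpenAI-style client should work against it. As a minimal sketch, assuming the server listens on the usual FastAPI/uvicorn default of port 8000 and exposes the standard OpenAI /v1/audio/speech route (the port, route, and payload field names below are assumptions, not taken from this README), a request could look like:

```python
import json
import urllib.request


def build_speech_payload(text, voice="default", fmt="wav"):
    """Build an OpenAI-style speech request body.

    Field names follow the OpenAI audio API convention; which fields
    this particular server honors is an assumption.
    """
    return {
        "model": "llasa",  # hypothetical model name
        "input": text,
        "voice": voice,
        "response_format": fmt,
    }


def synthesize(text, base_url="http://127.0.0.1:8000"):
    """POST the payload to the (assumed) speech endpoint, return raw audio bytes."""
    req = urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(build_speech_payload(text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


if __name__ == "__main__":
    audio = synthesize("Hello from LLaSA!")
    with open("output.wav", "wb") as f:
        f.write(audio)
```

If the server follows the OpenAI convention, the response body is the raw audio in the requested format, so it can be written straight to a file as above.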

Preview
