Project to help you get started deploying Generative AI models locally using Llamafile and friends. Llamafile aims to make open-source LLMs more accessible to both developers and end users by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation.
What llamafile gives you is a fun web GUI chatbot, a turnkey OpenAI API compatible server, and a shell-scriptable CLI interface which together put you in control of artificial intelligence.
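As a sketch of the OpenAI-compatible server in action, the following builds a chat-completion request body and shows (commented out, since it needs a running server) how it could be sent with curl. The port 8080 default, endpoint path, and model name follow common llamafile usage but should be treated as assumptions here:

```shell
# Build a chat-completion request body for a llamafile server.
# Assumes a llamafile is already serving its OpenAI-compatible API,
# which by default listens on port 8080.
REQUEST_BODY='{
  "model": "LLaMA_CPP",
  "messages": [{"role": "user", "content": "Say hello in one sentence."}]
}'
echo "$REQUEST_BODY"

# Send it once the server is running (uncomment to try):
# curl -s http://localhost:8080/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d "$REQUEST_BODY"
```

The same endpoint shape means existing OpenAI client libraries can usually be pointed at the local server by overriding their base URL.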
In addition to Llamafile, this project will help you get started with two related projects.
- Whisperfile: Combines whisper.cpp, which provides high-performance inference of OpenAI's Whisper automatic speech recognition (ASR) model, with Cosmopolitan Libc into one framework that collapses all the complexity of ASR models down to a single-file executable (called a "whisperfile") that runs locally on most computers, with no installation.
- Sdfile: Combines stable-diffusion.cpp, which provides high-performance inference of Stable Diffusion and Flux in pure C/C++, with Cosmopolitan Libc into one framework that collapses all the complexity of image-generation models down to a single-file executable (called an "sdfile") that runs locally on most computers, with no installation.
If you haven't already done so, install Miniforge. Miniforge provides minimal installers for Conda and Mamba specific to conda-forge, with the following features pre-configured:

- Packages in the base environment are obtained from the `conda-forge` channel.
- The `conda-forge` channel is set as the default (and only) channel.
Conda/Mamba will be the primary package managers used to install the required Python dependencies. For convenience, a script is included that downloads and installs Miniforge (which bundles Conda and Mamba). You can run the script using the following command.
./bin/install-miniforge.sh
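After the script finishes (and after restarting your shell), you can sanity-check the installation. This snippet only probes for the commands on your PATH; the exact versions printed will vary:

```shell
# have_tool prints the tool's version if it is on PATH,
# otherwise a hint to re-run the installer.
have_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found: $("$1" --version 2>/dev/null | head -n 1)"
    return 0
  fi
  echo "$1 not found; re-run ./bin/install-miniforge.sh and restart your shell"
  return 1
}

have_tool conda || true
have_tool mamba || true
```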
After adding any dependencies that should be installed via `conda` to the `environment.yml` file, and any dependencies that should be installed via `pip` to the `requirements.txt` file, you can create the Conda environment in a sub-directory `./env` of your project directory by running the following shell script.
./bin/create-conda-env.sh
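For reference, minimal versions of the two dependency files might look like the following. The package names and version pins here are purely illustrative, not requirements of this project:

```shell
# Write illustrative dependency files into a scratch directory.
# (In the real project you would edit environment.yml and
# requirements.txt in the repository root instead.)
SCRATCH=$(mktemp -d)

cat > "$SCRATCH/environment.yml" <<'EOF'
name: null
channels:
  - conda-forge
dependencies:
  - python=3.11      # example pin; adjust to your needs
  - pip
  - pip:
    - -r requirements.txt
EOF

cat > "$SCRATCH/requirements.txt" <<'EOF'
# pip-only dependencies go here, for example:
requests
EOF

echo "wrote example files to $SCRATCH"
```

The `pip: - -r requirements.txt` entry is a standard Conda pattern for delegating pip-managed dependencies to a separate requirements file.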
If you have an NVIDIA GPU, then in order to support GPU acceleration you need to install `cuda-toolkit` from the `nvidia` Conda channel. This change is made in the `environment-nvidia-gpu.yml` file. Create the Conda environment in a sub-directory `./env` of your project directory by running the following shell script.
./bin/create-conda-env.sh environment-nvidia-gpu.yml
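If you are unsure which environment file applies to your machine, a simple check for the NVIDIA driver tooling can decide. Using `nvidia-smi` on PATH as a proxy for a working NVIDIA GPU setup is an approximation, not a guarantee:

```shell
# Choose the environment file based on whether NVIDIA tooling is present.
if command -v nvidia-smi >/dev/null 2>&1; then
  ENV_FILE="environment-nvidia-gpu.yml"
else
  ENV_FILE="environment.yml"
fi
echo "creating Conda environment from $ENV_FILE"

# Then create the environment (uncomment to actually run it):
# ./bin/create-conda-env.sh "$ENV_FILE"
```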
After creating the Conda environment you can install Llamafile (and Whisperfile and Sdfile) by running the following command.
conda run --prefix ./env --live-stream ./bin/install-llamafile.sh
This command does the following.

- Properly configures the Conda environment.
- Downloads a recent version of Llamafile.
- Installs the Llamafile binary into the `bin/` directory of the Conda environment.
By default, this script downloads a recent version of Llamafile. You can install a specific release by passing the version number as a command line argument to the script as follows.
conda run --prefix ./env --live-stream ./bin/install-llamafile.sh 0.8.13
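As a rough sketch, the version argument might feed into a release download URL along these lines. The URL pattern is an assumption based on the llamafile GitHub releases page, not taken from the install script itself:

```shell
# Construct a GitHub release download URL for a given llamafile version.
# LLAMAFILE_VERSION is a hypothetical override variable for this sketch.
VERSION="${LLAMAFILE_VERSION:-0.8.13}"
URL="https://github.com/Mozilla-Ocho/llamafile/releases/download/${VERSION}/llamafile-${VERSION}"
echo "would download: $URL"

# A real install step would then be roughly:
# curl -L -o ./env/bin/llamafile "$URL" && chmod +x ./env/bin/llamafile
```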
Alternatively, after creating the Conda environment you can build Llamafile (and Whisperfile, Sdfile, and Llamafiler) from source by running the following command.
conda run --prefix ./env --live-stream ./bin/build-llamafile.sh
Once the new environment has been created you can activate the environment with the following command.
conda activate ./env
Note that the `./env` directory is not under version control as it can always be re-created as necessary.
This project is supported by funding from King Abdullah University of Science and Technology (KAUST) - Center of Excellence for Generative AI, under award number 5940.