Awesome WebLLM

This page contains a curated list of examples, tutorials, and blog posts about WebLLM use cases. Please send a pull request if you find something that belongs here.

Example Projects

Note that all examples below run in-browser and use WebGPU as a backend.

Project List

  • get-started: a minimal getting-started example using chat completion (a sketch follows this list).

    Open in JSFiddle Open in Codepen

  • simple-chat-js: a minimal and complete chatbot app in vanilla JavaScript.

    Open in JSFiddle Open in Codepen

  • simple-chat-ts: a minimal and complete chatbot app in TypeScript.

  • get-started-web-worker: same as get-started, but runs the engine in a web worker (a sketch follows this list).

  • next-simple-chat: a minimal and complete chatbot app built with Next.js.

  • multi-round-chat: while the APIs are functional, we internally optimize so that multi-round chat reuses the KV cache

  • text-completion: demonstrates engine.completions.create(), pure text completion with no conversation, as opposed to engine.chat.completions.create() (a sketch follows this list)

  • embeddings: demonstrates engine.embeddings.create(), integration with Langchain.js's EmbeddingsInterface and MemoryVectorStore, and RAG with Langchain.js using WebLLM for both the LLM and the embeddings in a single engine (a sketch follows this list)

  • multi-models: demonstrates loading multiple models into a single engine concurrently (a sketch follows this list)
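
A minimal sketch of the get-started flow. The model ID is illustrative and must match an entry in WebLLM's prebuilt model list:

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Download (or load from cache) and initialize the model.
  const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
    initProgressCallback: (p) => console.log(p.text),
  });

  // OpenAI-style chat completion.
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Explain WebGPU in one sentence." },
    ],
  });
  console.log(reply.choices[0].message.content);
}

main();
```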
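
For get-started-web-worker, the engine lives in a web worker so inference does not block the UI thread. A rough sketch using WebLLM's worker handler and factory (file names and model ID are illustrative):

```ts
// worker.ts: hosts the engine inside the web worker.
import { WebWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

const handler = new WebWorkerMLCEngineHandler();
self.onmessage = (msg: MessageEvent) => handler.onmessage(msg);
```

```ts
// main.ts: talks to the worker through the same OpenAI-like API.
import { CreateWebWorkerMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateWebWorkerMLCEngine(
  new Worker(new URL("./worker.ts", import.meta.url), { type: "module" }),
  "Llama-3.1-8B-Instruct-q4f32_1-MLC", // illustrative model ID
);
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello from the main thread!" }],
});
console.log(reply.choices[0].message.content);
```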
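
The text-completion example hits the completion endpoint instead of the chat one; the sketch below assumes an OpenAI-style response with choices[0].text and an illustrative model ID:

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");

// Pure text completion: a raw prompt is continued with no chat template applied.
const completion = await engine.completions.create({
  prompt: "The three laws of robotics are",
  max_tokens: 64,
});
console.log(completion.choices[0].text);
```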
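
For embeddings, a sketch assuming an embedding model from the prebuilt list (the exact model ID below is a guess; check the example source) and an OpenAI-style response:

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// The model ID is illustrative; use an embedding model from WebLLM's prebuilt list.
const engine = await CreateMLCEngine("snowflake-arctic-embed-m-q0f32-MLC-b4");

const result = await engine.embeddings.create({
  input: ["What is the capital of Canada?", "Ottawa is the capital of Canada."],
});
// One embedding vector per input string.
console.log(result.data.map((d) => d.embedding.length));
```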
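
For multi-models, the sketch below assumes the engine accepts a list of model IDs at creation time and an OpenAI-style per-request model field for routing; check the example source for the exact API:

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Assumption: an array of model IDs loads both models into one engine.
const engine = await CreateMLCEngine([
  "Llama-3.1-8B-Instruct-q4f32_1-MLC",
  "Phi-3.5-mini-instruct-q4f16_1-MLC",
]);

// Assumption: the OpenAI-style `model` field selects which loaded model answers.
const reply = await engine.chat.completions.create({
  model: "Phi-3.5-mini-instruct-q4f16_1-MLC",
  messages: [{ role: "user", content: "Summarize WebLLM in one line." }],
});
console.log(reply.choices[0].message.content);
```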

Advanced OpenAI API Capabilities

These examples demonstrate various capabilities via WebLLM's OpenAI-like API.

  • streaming: returns output in real time, chunk by chunk, as an AsyncGenerator (a sketch follows this list)
  • json-mode: efficiently ensures the output is in JSON format; see the OpenAI Reference for more (covered in the JSON sketch after this list)
  • json-schema: besides guaranteeing JSON output, ensures the output adheres to a specific JSON schema specified by the user (covered in the JSON sketch after this list)
  • seed-to-reproduce: uses the seed field to make output reproducible (covered in the JSON sketch after this list)
  • function-calling (WIP): function calling with the tools and tool_choice fields (preliminary support; a sketch follows this list)
  • vision-model: processes requests with image input using a vision language model (e.g. Phi-3.5-vision) (a sketch follows this list)
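
A streaming sketch: with stream: true the call returns an AsyncGenerator of chunks to consume with for await (the model ID is illustrative):

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");

const chunks = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Write a haiku about GPUs." }],
  stream: true, // returns an AsyncGenerator of chunks instead of one response
});

let text = "";
for await (const chunk of chunks) {
  text += chunk.choices[0]?.delta?.content ?? "";
  console.log(text); // render partial output as it arrives
}
```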
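
A combined sketch for json-mode, json-schema, and seed-to-reproduce. The json_object response_format and the seed field follow the OpenAI reference; passing the user schema as a stringified JSON schema inside response_format is an assumption based on the json-schema example's description, so check its source for the exact shape:

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");

// json-mode: constrain the output to valid JSON.
const jsonReply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Give me a JSON object describing a cat." }],
  response_format: { type: "json_object" },
  seed: 42, // seed-to-reproduce: identical requests with the same seed repeat the same output
});
console.log(JSON.parse(jsonReply.choices[0].message.content ?? "{}"));

// json-schema (assumed shape): additionally constrain the output to a user-provided schema.
const schema = JSON.stringify({
  type: "object",
  properties: { name: { type: "string" }, age: { type: "number" } },
  required: ["name", "age"],
});
const schemaReply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Invent a person as JSON." }],
  response_format: { type: "json_object", schema },
});
console.log(schemaReply.choices[0].message.content);
```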
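
A function-calling sketch with the OpenAI-style tools and tool_choice fields. Support is preliminary (WIP), so treat this as an illustration of the request shape rather than a stable API; the model ID is illustrative and should be a tool-calling-capable model from the prebuilt list:

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("Hermes-2-Pro-Llama-3-8B-q4f16_1-MLC");

const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "What is the weather in Tokyo today?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get the current weather for a city",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
  ],
  tool_choice: "auto",
});

// When the model decides to call a tool, the call shows up in tool_calls.
console.log(reply.choices[0].message.tool_calls);
```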
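
A vision-model sketch assuming OpenAI-style multimodal message content (text plus image_url parts); the model ID and image URL are illustrative:

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Model ID is illustrative; pick a vision model from the prebuilt list.
const engine = await CreateMLCEngine("Phi-3.5-vision-instruct-q4f16_1-MLC");

const reply = await engine.chat.completions.create({
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What is in this picture?" },
        // An http(s) URL or a base64 data URL (assumed, following the OpenAI format).
        { type: "image_url", image_url: { url: "https://example.com/cat.png" } },
      ],
    },
  ],
});
console.log(reply.choices[0].message.content);
```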

Chrome Extension

Others

  • logit-processor: while logit_bias is supported, we additionally support stateful logit processing where users can specify their own rules; we also expose the low-level API forwardTokensAndSample() (a sketch of logit_bias follows this list)
  • cache-usage: demonstrates how WebLLM supports both the Cache API and IndexedDB as caches, selectable via appConfig.useIndexedDBCache. Also demonstrates various cache utilities, such as checking whether a model is cached, deleting a model's weights from the cache, deleting a model's library wasm from the cache, etc. (a sketch follows this list)
  • simple-chat-upload: demonstrates how to upload local models to WebLLM instead of downloading them from a URL
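
A sketch of the logit_bias part of logit-processor, following the OpenAI field: a map from token ID (as a string) to a bias in [-100, 100]. The token ID below is a placeholder, and the stateful LogitProcessor plus forwardTokensAndSample() pieces are tokenizer specific, so see the example source for those:

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");

const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Name a fruit." }],
  // Token IDs are tokenizer specific; "1234" is a placeholder.
  // -100 effectively bans the token, +100 effectively forces it when reachable.
  logit_bias: { "1234": -100 },
});
console.log(reply.choices[0].message.content);
```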
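
A cache-usage sketch: switch from the Cache API to IndexedDB via appConfig.useIndexedDBCache. The helper names hasModelInCache and deleteModelAllInfoInCache are assumptions based on the example's description; check its source for the exact utilities:

```ts
import * as webllm from "@mlc-ai/web-llm";

const modelId = "Llama-3.1-8B-Instruct-q4f32_1-MLC"; // illustrative

// Start from the prebuilt model list and switch the cache backend to IndexedDB.
const appConfig: webllm.AppConfig = {
  ...webllm.prebuiltAppConfig,
  useIndexedDBCache: true,
};

const engine = await webllm.CreateMLCEngine(modelId, { appConfig });

// Cache utilities (assumed names; see the example for the exact API):
// check whether the model is cached, then delete everything related to it.
console.log(await webllm.hasModelInCache(modelId, appConfig));
await webllm.deleteModelAllInfoInCache(modelId, appConfig);
```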

Demo Spaces

  • web-llm-embed: document chat prototype using react-llm with transformers.js embeddings
  • DeVinci: an AI chat app based on WebLLM and hosted on a decentralized cloud platform