Flutter MediaPipe Chat lets the Flutter community run and test AI models locally on Android and iOS, building on Google MediaPipe. All processing happens on the device, so optimized models run natively on the phone without server round trips. It also supports deploying custom and LoRA-tuned models, giving developers greater control and opening the door to innovative applications.
- Android: minSdkVersion 24 (required by MediaPipe).
- iOS: iOS 13.0 or later.
To add the package from the console, run:

```bash
flutter pub add flutter_mediapipe_chat
```

Or add it manually to your `pubspec.yaml`:

```yaml
dependencies:
  flutter:
    sdk: flutter
  flutter_mediapipe_chat: ^1.0.0
```

Then run:

```bash
flutter pub get
```
In your `android/app/build.gradle`, make sure to include:

```groovy
android {
    defaultConfig {
        minSdkVersion 24
    }
}
```
In `AndroidManifest.xml` (usually `android/app/src/main/AndroidManifest.xml`), after the `</activity>` tag, add:

```xml
<uses-native-library android:name="libOpenCL.so" android:required="false"/>
<uses-native-library android:name="libOpenCL-car.so" android:required="false"/>
<uses-native-library android:name="libOpenCL-pixel.so" android:required="false"/>
```
After installation and setup, you can start using `flutter_mediapipe_chat`:
- Loading the Model
```dart
import 'package:flutter_mediapipe_chat/flutter_mediapipe_chat.dart';

final chatPlugin = FlutterMediapipeChat();

final config = ModelConfig(
  path: "assets/models/gemma-2b-it-gpu-int8.bin",
  temperature: 0.7,
  maxTokens: 1024,
  topK: 50,
  randomSeed: 42,
  loraPath: null,
);

await chatPlugin.loadModel(config);
```
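If the model is bundled as a Flutter asset (as the example path suggests), it must also be declared under `flutter: assets:` in `pubspec.yaml`; a minimal sketch:

```yaml
flutter:
  assets:
    # Bundled model file referenced by ModelConfig.path above.
    - assets/models/gemma-2b-it-gpu-int8.bin
```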
- Generate Responses (Synchronous)
```dart
String? response = await chatPlugin.generateResponse("Hello, how are you?");
if (response != null) {
  print("Model Response: $response");
} else {
  print("No response from model.");
}
```
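Since the call crosses into native code, it can fail (for example, if no model has been loaded). A defensive sketch, assuming the plugin reports native failures through the standard `PlatformException` used by Flutter platform channels:

```dart
import 'package:flutter/services.dart' show PlatformException;
import 'package:flutter_mediapipe_chat/flutter_mediapipe_chat.dart';

Future<String> safeGenerate(FlutterMediapipeChat plugin, String prompt) async {
  try {
    final response = await plugin.generateResponse(prompt);
    return response ?? "No response from model.";
  } on PlatformException catch (e) {
    // Assumed error path: the exact error codes depend on the plugin's native side.
    return "Inference failed: ${e.message}";
  }
}
```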
- Generate Responses (Streaming)
```dart
chatPlugin
    .generateResponseAsync("Tell me a story about a brave knight.")
    .listen((token) {
  if (token == null) {
    print("Stream ended.");
  } else {
    print("Token: $token");
  }
});
```
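In a UI, the streamed tokens are typically accumulated and rendered as they arrive. A minimal sketch of that pattern, assuming the same `generateResponseAsync` stream shown above (the widget structure and names are illustrative):

```dart
import 'package:flutter/material.dart';
import 'package:flutter_mediapipe_chat/flutter_mediapipe_chat.dart';

class ChatReply extends StatefulWidget {
  const ChatReply({super.key, required this.chat, required this.prompt});

  final FlutterMediapipeChat chat; // plugin instance with a model already loaded
  final String prompt;

  @override
  State<ChatReply> createState() => _ChatReplyState();
}

class _ChatReplyState extends State<ChatReply> {
  final _reply = StringBuffer();

  @override
  void initState() {
    super.initState();
    widget.chat.generateResponseAsync(widget.prompt).listen((token) {
      if (token == null || !mounted) return; // a null token marks the end of the stream
      setState(() => _reply.write(token)); // append each token as it arrives
    });
  }

  @override
  Widget build(BuildContext context) => Text(_reply.toString());
}
```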
Inside the `example/` folder there is a demo project showing model loading, a chat interface, and both synchronous and asynchronous text generation. To run it:

```bash
cd example
flutter run
```
This plugin brings local, on-device LLM inference (via the MediaPipe framework) to Flutter on Android and iOS, removing the need for cloud services.
Note: The MediaPipe LLM Inference API is experimental and under active development. Use of this API is subject to the Generative AI Prohibited Use Policy.
- **Local Inference**: Avoids network dependencies by running models entirely on-device.
- **Cross-Platform**: Compatible with Android (API 24+) and iOS (13.0+).
- **Flexible Generation**: Choose between synchronous and asynchronous response modes.
- **Advanced Model Customization**: Adjust parameters like `temperature`, `maxTokens`, `topK`, `randomSeed`, and optional LoRA configurations.
- **GPU/CPU Variants**: Select between CPU- and GPU-optimized models (if the device supports them); see the sketch below.
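A hedged sketch of switching between CPU- and GPU-optimized model files via `ModelConfig.path` (the file names are illustrative, not shipped with the plugin):

```dart
import 'package:flutter_mediapipe_chat/flutter_mediapipe_chat.dart';

// Illustrative file names; use whichever variants you actually downloaded.
const cpuModel = "assets/models/gemma-2b-it-cpu-int8.bin";
const gpuModel = "assets/models/gemma-2b-it-gpu-int8.bin";

Future<void> loadFor(FlutterMediapipeChat plugin, {required bool useGpu}) {
  // Only `path` is required; the other ModelConfig fields fall back to their defaults.
  return plugin.loadModel(ModelConfig(path: useGpu ? gpuModel : cpuModel));
}
```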
- **Gemma-2 2B** (2 billion parameters): CPU/GPU int8 variants.
- **Gemma 2B** (2 billion parameters): CPU/GPU int4/int8 variants.
- **Gemma 7B** (7 billion parameters, Web only on high-end devices): GPU int8 variant.
Download these `.bin` models from Kaggle (Gemma) and load them with `FlutterMediapipeChat`.
- Falcon-1B
- StableLM-3B
- Phi-2
They require a script to convert them to `.bin` or `.tflite`. Check out the AI Edge Torch Generative Library for PyTorch conversions.
If you have a custom PyTorch model, convert it using AI Edge Torch Generative:
- Export your PyTorch model to `.tflite`.
- Combine the `.tflite` file with the tokenizer parameters into a single `.task` bundle.
- Provide the path in `ModelConfig.path`, as in the sketch below.
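Once bundled, the `.task` file is loaded the same way as a `.bin` model. A minimal sketch, assuming the bundle has been added to the app's assets (the file name is hypothetical):

```dart
final config = ModelConfig(
  path: "assets/models/my_custom_model.task", // .task bundle produced above
);
await chatPlugin.loadModel(config);
```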
LoRA (Low-Rank Adaptation) allows inexpensive fine-tuning of large models by training only small low-rank update matrices instead of the full weights. It’s available on GPU backends for:
- Gemma (2B, Gemma-2 2B)
- Phi-2
- Train LoRA weights for your base model.
- Convert them with the MediaPipe library specifying the LoRA checkpoint, rank, and GPU backend.
Then reference both files in `ModelConfig`:

```dart
final config = ModelConfig(
  path: "assets/models/base_model_gpu.bin",
  loraPath: "assets/models/lora_model_gpu.bin",
  temperature: 0.8,
  maxTokens: 1024,
  topK: 40,
);
```
Note: LoRA is only supported with GPU models (`.bin` or `.tflite`), not CPU.
| Field | Type | Default | Description |
|---|---|---|---|
| `path` | String | Required | Path to the base model file (`.bin` or `.task`). |
| `temperature` | double | 0.8 | Controls randomness/creativity. |
| `maxTokens` | int | 1024 | Maximum number of tokens (input + output). |
| `topK` | int | 40 | Limits predictions to the K most probable tokens. |
| `randomSeed` | int | 0 | Seed for random text generation. |
| `loraPath` | String? | null | Path to LoRA weights (GPU models only). |
| `supportedLoraRanks` | List? | null | Specific LoRA ranks (advanced usage). |
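Putting the table together, a hedged sketch of a fully specified configuration (the values are illustrative, and the constructor is assumed to accept exactly the named parameters listed above):

```dart
final config = ModelConfig(
  path: "assets/models/gemma-2b-it-gpu-int8.bin", // required
  temperature: 0.8,          // default 0.8
  maxTokens: 1024,           // shared input + output token budget
  topK: 40,                  // sample from the 40 most probable tokens
  randomSeed: 0,             // seed for random text generation
  loraPath: null,            // path to LoRA weights (GPU models only)
  supportedLoraRanks: null,  // advanced: specific LoRA ranks
);
```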
- Gemma-2 2B (8-bit, CPU/GPU)
- Gemma 2B (int4/int8, CPU/GPU)
- Gemma 7B (int8, GPU, Web Only)
Also consider Phi-2, StableLM-3B, and Falcon-1B after conversion.
Contributions are welcome! Send pull requests on GitHub. For questions or feature requests, open an issue in the repository’s tracker.
Licensed under MIT (see LICENSE). Third-party models (e.g., Falcon, StableLM, Phi-2) are not Google services. Make sure to comply with their licenses.
Use of MediaPipe LLM Inference is governed by the Generative AI Prohibited Use Policy.
Gemma is an open family of models derived from the same research as Gemini, subject to licensing terms on Kaggle.