GitHub - cactus-compute/cactus: Cross-platform framework for deploying LLM/VLM/TTS models locally on smartphones.

Cross-platform framework for deploying LLM/VLM/TTS models locally in your app.

- Available in Flutter and React Native for cross-platform developers.
- Supports any GGUF model available on Hugging Face, such as Qwen, Gemma, Llama, and DeepSeek.
- Runs LLMs, VLMs, embedding models, TTS models, and more.
- Supports precisions from FP32 down to 2-bit quantization, for efficiency and less device strain.
- MCP tool calls make the AI performant and helpful (setting reminders, searching the gallery, replying to messages, etc.).
- Falls back to massive cloud models for complex tasks and on device failures.
- Chat templates with Jinja2 support and token streaming.
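
As a rough illustration of why lower-bit quantization reduces device strain, the sketch below estimates the weight-only memory footprint of a model at different bit widths. This is back-of-the-envelope arithmetic, not part of the Cactus API: `estimateWeightGB` is a hypothetical helper, and the estimate ignores quantization metadata (scales and offsets), the KV cache, and runtime overhead, so real GGUF files will be somewhat larger.

```javascript
// Hypothetical helper: approximate weight-only size in GB (1 GB = 1e9 bytes).
// Ignores quantization scales/offsets, KV cache, and runtime overhead.
function estimateWeightGB(numParams, bitsPerWeight) {
  if (numParams < 0 || bitsPerWeight <= 0) {
    throw new Error('numParams must be >= 0 and bitsPerWeight must be > 0');
  }
  const bytes = (numParams * bitsPerWeight) / 8;
  return bytes / 1e9;
}

// Example: a 1B-parameter model at several precisions.
const oneBillion = 1e9;
for (const bits of [32, 16, 8, 4, 2]) {
  console.log(`${bits}-bit: ~${estimateWeightGB(oneBillion, bits)} GB`);
}
```

Under these assumptions, a 1B-parameter model shrinks from about 4 GB at FP32 to about 0.5 GB at 4 bits.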

Install: Execute the following command in your project terminal:
```
flutter pub add cactus
```

Flutter Text Completion

```dart
import 'package:cactus/cactus.dart';

// Load a GGUF model (replace the placeholder with a real Hugging Face GGUF link).
final lm = await CactusLM.init(
    modelUrl: 'huggingface/gguf/link',
    contextSize: 2048,
);

// Run a chat completion.
final messages = [ChatMessage(role: 'user', content: 'Hello!')];
final response = await lm.completion(messages, maxTokens: 100, temperature: 0.7);
```

Flutter Embedding

```dart
import 'package:cactus/cactus.dart';

// Load the model with embedding generation enabled.
final lm = await CactusLM.init(
    modelUrl: 'huggingface/gguf/link',
    contextSize: 2048,
    generateEmbeddings: true,
);

// Embed a piece of text.
final text = 'Your text to embed';
final result = await lm.embedding(text);
```

Flutter VLM Completion

```dart
import 'package:cactus/cactus.dart';

// A vision model needs both the model weights and its multimodal projector (mmproj).
final vlm = await CactusVLM.init(
    modelUrl: 'huggingface/gguf/link',
    mmprojUrl: 'huggingface/gguf/mmproj/link',
);

final messages = [ChatMessage(role: 'user', content: 'Describe this image')];

// Pass the image alongside the chat messages.
final response = await vlm.completion(
    messages,
    imagePaths: ['/absolute/path/to/image.jpg'],
    maxTokens: 200,
    temperature: 0.3,
);
```

Flutter Cloud Fallback

```dart
import 'package:cactus/cactus.dart';

final lm = await CactusLM.init(
    modelUrl: 'huggingface/gguf/link',
    contextSize: 2048,
    cactusToken: 'enterprise_token_here',
);

final messages = [ChatMessage(role: 'user', content: 'Hello!')];
final response = await lm.completion(messages, maxTokens: 100, temperature: 0.7);

// Inference modes:
// local (default): run strictly on-device
// localfirst: fall back to the cloud if on-device inference fails
// remotefirst: run in the cloud first; fall back to on-device if the API fails
// remote: run strictly in the cloud
final embedding = await lm.embedding('Your text', mode: 'localfirst');
```

Note: See the Flutter Docs for more.

Install the cactus-react-native package:

```
npm install cactus-react-native && npx pod-install
```

React-Native Text Completion

```js
import { CactusLM } from 'cactus-react-native';

// init resolves to both the model handle and an error; check `error` before using `lm`.
const { lm, error } = await CactusLM.init({
    model: '/path/to/model.gguf',
    n_ctx: 2048,
});

// Run a chat completion.
const messages = [{ role: 'user', content: 'Hello!' }];
const params = { n_predict: 100, temperature: 0.7 };
const response = await lm.completion(messages, params);
```

React-Native Embedding

```js
import { CactusLM } from 'cactus-react-native';

// Load the model with embedding generation enabled.
const { lm, error } = await CactusLM.init({
    model: '/path/to/model.gguf',
    n_ctx: 2048,
    embedding: true,
});

// Embed a piece of text.
const text = 'Your text to embed';
const params = { normalize: true };
const result = await lm.embedding(text, params);
```
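
A common next step with embeddings is comparing two texts by cosine similarity. The sketch below is a generic, self-contained helper and not part of the Cactus API. It assumes you have already extracted plain numeric arrays from two embedding results (the exact shape of the result object is not shown in this README, so that extraction step is left to you), and `cosineSimilarity` is a hypothetical name.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns a value in [-1, 1]; higher means the vectors point in more similar directions.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0 instead of dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

If the vectors are already normalized to unit length, the cosine similarity reduces to the dot product.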

React-Native VLM

```js
import { CactusVLM } from 'cactus-react-native';

// A vision model needs both the model weights and its multimodal projector (mmproj).
const { vlm, error } = await CactusVLM.init({
    model: '/path/to/vision-model.gguf',
    mmproj: '/path/to/mmproj.gguf',
});

const messages = [{ role: 'user', content: 'Describe this image' }];

// Pass the image alongside the generation parameters.
const params = {
    images: ['/absolute/path/to/image.jpg'],
    n_predict: 200,
    temperature: 0.3,
};

const response = await vlm.completion(messages, params);
```

React-Native Cloud Fallback

```js
import { CactusLM } from 'cactus-react-native';

// The enterprise token is passed as the third argument to init.
const { lm, error } = await CactusLM.init({
    model: '/path/to/model.gguf',
    n_ctx: 2048,
}, undefined, 'enterprise_token_here');

const messages = [{ role: 'user', content: 'Hello!' }];
const params = { n_predict: 100, temperature: 0.7 };
const response = await lm.completion(messages, params);

// Inference modes:
// local (default): run strictly on-device
// localfirst: fall back to the cloud if on-device inference fails
// remotefirst: run in the cloud first; fall back to on-device if the API fails
// remote: run strictly in the cloud
const embedding = await lm.embedding('Your text', undefined, 'localfirst');
```
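
To make the four modes concrete, here is a minimal sketch of the dispatch logic they describe. It illustrates the semantics only and is not how Cactus implements them: `runWithMode`, `runLocal`, and `runRemote` are hypothetical names, with the two runners standing in for on-device and cloud inference.

```javascript
// Hypothetical dispatcher illustrating the four modes.
// `runLocal` and `runRemote` are placeholder async functions, not Cactus APIs.
async function runWithMode(mode, runLocal, runRemote) {
  switch (mode) {
    case 'local':
      // Strictly on-device: a local failure propagates to the caller.
      return runLocal();
    case 'remote':
      // Strictly cloud: a remote failure propagates to the caller.
      return runRemote();
    case 'localfirst':
      try {
        return await runLocal();
      } catch (err) {
        return runRemote(); // device failed, fall back to the cloud
      }
    case 'remotefirst':
      try {
        return await runRemote();
      } catch (err) {
        return runLocal(); // API failed, fall back to the device
      }
    default:
      throw new Error(`Unknown mode: ${mode}`);
  }
}
```

Note the `return await` inside each `try`: without `await`, a rejected promise would escape the `try` block and the fallback would never run.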

Note: See the React Docs for more.

The Cactus backend is written in C/C++ and can run directly on phones, smart TVs, watches, speakers, cameras, laptops, and more. See the C++ Docs for more.

First, clone the repo with `git clone https://github.com/cactus-compute/cactus.git`, `cd` into it, and make all scripts executable with `chmod +x scripts/*.sh`.

Flutter
- Build the Android JNILibs with `scripts/build-flutter-android.sh`.
- Build the Flutter plugin with `scripts/build-flutter.sh` (this MUST run before using the example).
- Navigate to the example app with `cd flutter/example`.
- Open your simulator via Xcode or Android Studio (see the walkthrough if you have not done this before).
- Always start the app with this combo: `flutter clean && flutter pub get && flutter run`.
- Play with the app, and make changes to either the example app or the plugin as desired.
React Native
- Build the Android JNILibs with `scripts/build-react-android.sh`.
- Build the React Native package with `scripts/build-react.sh`.
- Navigate to the example app with `cd react/example`.
- Set up your simulator via Xcode or Android Studio (see the walkthrough if you have not done this before).
- Always start the app with this combo: `yarn && yarn ios` or `yarn && yarn android`.
- Play with the app, and make changes to either the example app or the package as desired.
- For now, if you change the package, manually copy the changed files/folders into `examples/react/node_modules/cactus-react-native`.
C/C++
- Navigate to the example app with `cd cactus/example`.
- There are multiple main files: `main_vlm`, `main_llm`, `main_embed`, and `main_tts`.
- Build both the libraries and the executables using `build.sh`.
- Run one of the executables: `./cactus_vlm`, `./cactus_llm`, `./cactus_embed`, or `./cactus_tts`.
- Try different models and make changes as desired.
Contributing
- To contribute a bug fix, make your changes, create a branch with `git checkout -b <branch-name>`, and submit a PR.
- To contribute a feature, please raise an issue first so it can be discussed, to avoid overlapping with someone else's work.
- Join our Discord.

| Device | Gemma3 1B Q4 (tokens/sec) | Qwen3 4B Q4 (tokens/sec) |
|---|---|---|
| iPhone 16 Pro Max | 54 | 18 |
| iPhone 16 Pro | 54 | 18 |
| iPhone 16 | 49 | 16 |
| iPhone 15 Pro Max | 45 | 15 |
| iPhone 15 Pro | 45 | 15 |
| iPhone 14 Pro Max | 44 | 14 |
| OnePlus 13 5G | 43 | 14 |
| Samsung Galaxy S24 Ultra | 42 | 14 |
| iPhone 15 | 42 | 14 |
| OnePlus Open | 38 | 13 |
| Samsung Galaxy S23 5G | 37 | 12 |
| Samsung Galaxy S24 | 36 | 12 |
| iPhone 13 Pro | 35 | 11 |
| OnePlus 12 | 35 | 11 |
| Galaxy S25 Ultra | 29 | 9 |
| OnePlus 11 | 26 | 8 |
| iPhone 13 mini | 25 | 8 |
| Redmi K70 Ultra | 24 | 8 |
| Xiaomi 13 | 24 | 8 |
| Samsung Galaxy S24+ | 22 | 7 |
| Samsung Galaxy Z Fold 4 | 22 | 7 |
| Xiaomi Poco F6 5G | 22 | 6 |
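
To translate these throughput figures into user-facing latency, the sketch below estimates how long a response of a given length takes to generate. It is simple arithmetic with a hypothetical helper name (`estimateGenerationSeconds`), and it assumes a steady generation rate. It ignores model load time and prompt processing, which the table does not report.

```javascript
// Hypothetical helper: seconds to generate `numTokens` at a steady rate.
// Ignores model loading and prompt processing time.
function estimateGenerationSeconds(numTokens, tokensPerSecond) {
  if (tokensPerSecond <= 0) {
    throw new Error('tokensPerSecond must be positive');
  }
  return numTokens / tokensPerSecond;
}

// Example using the iPhone 16 Pro Max row: a 100-token reply.
const gemmaSeconds = estimateGenerationSeconds(100, 54); // Gemma3 1B Q4
const qwenSeconds = estimateGenerationSeconds(100, 18);  // Qwen3 4B Q4
console.log(`Gemma3 1B: ~${gemmaSeconds.toFixed(1)} s, Qwen3 4B: ~${qwenSeconds.toFixed(1)} s`);
```

On that device, a 100-token reply works out to roughly 1.9 seconds for Gemma3 1B and roughly 5.6 seconds for Qwen3 4B under these assumptions.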

We provide a collection of recommended models on our Hugging Face page.
