diff --git a/README.md b/README.md
index aae7e407c..e0db1fa19 100644
--- a/README.md
+++ b/README.md
@@ -1,446 +1,172 @@
-# Cortex.cpp
+# Cortex
-
-
+
-
-
- Documentation - API Reference
- - Changelog - Bug reports - Discord
+ Docs •
+ API Reference •
+ Changelog •
+ Issues •
+ Community
-> **Cortex.cpp is currently in active development.**
-
-## Overview
-
-Cortex is a Local AI API Platform that is used to run and customize LLMs.
-
-Key Features:
-- Pull from Huggingface, or Cortex Built-in Models
-- Models stored in universal file formats (vs blobs)
-- Swappable Engines (default: [`llamacpp`](https://github.com/janhq/cortex.llamacpp), future: [`ONNXRuntime`](https://github.com/janhq/cortex.onnx), [`TensorRT-LLM`](https://github.com/janhq/cortex.tensorrt-llm))
-- Cortex can be deployed as a standalone API server, or integrated into apps like [Jan.ai](https://jan.ai/)
+> **Under Active Development** - Expect rapid improvements!
-Coming soon; now available on [cortex-nightly](#beta--nightly-versions):
-- Engines Management (install specific llama-cpp version and variants)
-- Nvidia Hardware detection & activation (current: Nvidia, future: AMD, Intel, Qualcomm)
-- Cortex's roadmap is to implement the full OpenAI API including Tools, Runs, Multi-modal and Realtime APIs.
-## Local Installation
+Cortex is the open-source brain for robots: vision, speech, language, tabular, and action -- the cloud is optional.
-Cortex has an Local Installer that packages all required dependencies, so that no internet connection is required during the installation process.
+## Installation
-Cortex also has a [Network Installer](#network-installer) which downloads the necessary dependencies from the internet during the installation.
+| Platform | Installer |
+|----------|-----------|
+| **Windows** | [cortex.exe](https://app.cortexcpp.com/download/latest/windows-amd64-local) |
+| **macOS** | [cortex.pkg](https://app.cortexcpp.com/download/latest/mac-universal-local) |
+| **Linux (Debian)** | [cortex.deb](https://app.cortexcpp.com/download/latest/linux-amd64-local) |
-
-
-
-
- MacOS (Silicon/Intel):
- cortex.pkg
-
-
-
-
-- For Linux: Download the installer and run the following command in terminal:
+All other Linux distributions:
```bash
- # Linux debian based distros
- curl -s https://raw.githubusercontent.com/janhq/cortex/main/engine/templates/linux/install.sh | sudo bash -s -- --deb_local
-
- # Other Linux distros
- curl -s https://raw.githubusercontent.com/janhq/cortex/main/engine/templates/linux/install.sh | sudo bash -s
+curl -s https://raw.githubusercontent.com/janhq/cortex/main/engine/templates/linux/install.sh | sudo bash
```
-- The binary will be installed in the `/usr/bin/` directory.
-
-## Usage
-
-### CLI
-
-After installation, you can run Cortex.cpp from the command line by typing `cortex --help`.
+### Start the Server
+```bash
+cortex start
```
-# Run a Model
-cortex pull llama3.2
-cortex pull bartowski/Meta-Llama-3.1-8B-Instruct-GGUF
-cortex run llama3.2
-
-# Resource Management
-cortex ps (view active models & RAM/VRAM used)
-cortex models stop llama3.2
-
-# Available on cortex-nightly:
-cortex engines install llama-cpp -m (lists versions and variants)
-cortex hardware list (hardware detection)
-cortex hardware activate
-
-cortex stop
+```
+Set log level to INFO
+Host: 127.0.0.1 Port: 39281
+Server started
+API Documentation available at: http://127.0.0.1:39281
```
-Refer to our [Quickstart](https://cortex.so/docs/quickstart/) and
-[CLI documentation](https://cortex.so/docs/cli) for more details.
-
-### API:
-Cortex.cpp includes a REST API accessible at `localhost:39281`.
-
-Refer to our [API documentation](https://cortex.so/api-reference) for more details.
-
-## Models
-
-Cortex.cpp allows users to pull models from multiple Model Hubs, offering flexibility and extensive model access:
-- [Hugging Face](https://huggingface.co): GGUF models eg `author/Model-GGUF`
-- Cortex Built-in Models
-
-Once downloaded, the model `.gguf` and `model.yml` files are stored in `~\cortexcpp\models`.
-
-> **Note**:
-> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
-
-### Cortex Built-in Models & Quantizations
+[Full API docs](https://cortex.so/api-reference).
-| Model /Engine | llama.cpp | Command |
-| -------------- | --------------------- | ----------------------------- |
-| phi-3.5 | ✅ | cortex run phi3.5 |
-| llama3.2 | ✅ | cortex run llama3.2 |
-| llama3.1 | ✅ | cortex run llama3.1 |
-| codestral | ✅ | cortex run codestral |
-| gemma2 | ✅ | cortex run gemma2 |
-| mistral | ✅ | cortex run mistral |
-| ministral | ✅ | cortex run ministral |
-| qwen2 | ✅ | cortex run qwen2.5 |
-| openhermes-2.5 | ✅ | cortex run openhermes-2.5 |
-| tinyllama | ✅ | cortex run tinyllama |
+### Download Models
-View all [Cortex Built-in Models](https://cortex.so/models).
+You can download models from the [Hugging Face](https://huggingface.co) model hub using the `cortex pull` command:
-Cortex supports multiple quantizations for each model.
+```bash
+cortex pull llama3.2
+```
```
-❯ cortex-nightly pull llama3.2
Downloaded models:
+ llama3.1:8b-gguf-q4-km
llama3.2:3b-gguf-q2-k
Available to download:
- 1. llama3.2:3b-gguf-q3-kl
- 2. llama3.2:3b-gguf-q3-km
- 3. llama3.2:3b-gguf-q3-ks
- 4. llama3.2:3b-gguf-q4-km (default)
- 5. llama3.2:3b-gguf-q4-ks
- 6. llama3.2:3b-gguf-q5-km
- 7. llama3.2:3b-gguf-q5-ks
- 8. llama3.2:3b-gguf-q6-k
- 9. llama3.2:3b-gguf-q8-0
-
-Select a model (1-9):
-```
-
-## Advanced Installation
-
-### Network Installer (Stable)
-
-Cortex.cpp is available with a Network Installer, which is a smaller installer but requires internet connection during installation to download the necessary dependencies.
-
-
-
-
-
-
-
-### Beta & Nightly Versions (Local Installer)
-
-Cortex releases Beta and Nightly versions for advanced users to try new features (we appreciate your feedback!)
-- Beta (early preview): CLI command: `cortex-beta`
-- Nightly (released every night): CLI Command: `cortex-nightly`
- - Nightly automatically pulls the latest changes from upstream [llama.cpp](https://github.com/ggerganov/llama.cpp/) repo, creates a PR and runs tests.
- - If all test pass, the PR is automatically merged into our repo, with the latest llama.cpp version.
-
-
-
-### Network Installer
-
-Cortex.cpp is available with a Network Installer, which is a smaller installer but requires internet connection during installation to download the necessary dependencies.
-
-
-
-### Build from Source
-
-Firstly, clone the Cortex.cpp repository [here](https://github.com/janhq/cortex.cpp) and initialize the submodules:
+ 1. llama3.2:3b-gguf-q3-kl
+ 2. llama3.2:3b-gguf-q3-km
+ 3. llama3.2:3b-gguf-q3-ks
+ 4. ...
-```bash
-git clone https://github.com/janhq/cortex.cpp
-cd cortex.cpp
-git submodule update --init --recursive
+Select a model (1-9):
```
-#### Windows
+### Run Models
-1. Navigate to the `engine` folder.
-2. Configure the vpkg:
-
-```bash
-cd vcpkg
-./bootstrap-vcpkg.bat
-vcpkg install
+```sh
+cortex run llama3.2
+```
+```
+In order to exit, type `exit()`
+>
```
-3. Build the Cortex.cpp inside the `engine/build` folder:
+You can also run a model in detached mode: it runs in the background, and you can
+use it via the API:
-```bash
-mkdir build
-cd build
-cmake .. -DBUILD_SHARED_LIBS=OFF -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder_in_cortex_repo/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows-static
-cmake --build . --config Release
+```sh
+cortex run -d llama3.2:3b-gguf-q2-k
```
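+
+With a model running in detached mode, you can query it through the OpenAI-compatible
+REST API. A minimal `curl` sketch (assuming the default host and port shown in the
+server logs above, and the quantization you pulled):
+
+```sh
+curl http://127.0.0.1:39281/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "llama3.2:3b-gguf-q2-k",
+    "messages": [{"role": "user", "content": "Hello!"}]
+  }'
+```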
-4. Verify that Cortex.cpp is installed correctly by getting help information.
+### Manage Resources
```sh
-cortex -h
+cortex ps # View active models
```
-#### MacOS
+```sh
+cortex stop # Shutdown server
+```
-1. Navigate to the `engine` folder.
-2. Configure the vpkg:
+## Why Cortex.cpp?
-```bash
-cd vcpkg
-./bootstrap-vcpkg.sh
-vcpkg install
-```
+Cortex.cpp is a platform for running AI models locally, with:
-3. Build the Cortex.cpp inside the `engine/build` folder:
+- **Multi-Engine Support** - Start with llama.cpp or add your own
+- **Hardware Optimized** - Automatic GPU detection (NVIDIA/AMD/Intel)
+- **OpenAI-Compatible API** - Tools, Runs, and Multi-modal coming soon
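+
+Because the API is OpenAI-compatible, existing OpenAI SDK code can be pointed at a local
+Cortex server. A minimal sketch (assuming the server is started and the model has been
+pulled as shown above):
+
+```python
+from openai import OpenAI
+
+client = OpenAI(
+    base_url="http://127.0.0.1:39281/v1",
+    api_key="not-needed",  # the local server does not require authentication
+)
+
+response = client.chat.completions.create(
+    model="llama3.2:3b-gguf-q2-k",
+    messages=[{"role": "user", "content": "Hello!"}],
+)
+print(response.choices[0].message.content)
+```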
-```bash
-mkdir build
-cd build
-cmake .. -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder_in_cortex_repo/vcpkg/scripts/buildsystems/vcpkg.cmake
-make -j4
-```
+## Featured Models
-4. Verify that Cortex.cpp is installed correctly by getting help information.
+| Model | Command | Min RAM |
+|----------------|---------------------------|---------|
+| Llama 3.1 8B   | `cortex run llama3.1`      | 8GB     |
+| Phi-4 | `cortex run phi-4` | 8GB |
+| Mistral | `cortex run mistral` | 4GB |
+| Gemma 2B | `cortex run gemma2` | 6GB |
-```sh
-./cortex -h
-```
+[View all supported models →](https://cortex.so/models)
-#### Linux
+## Advanced Features
-1. Navigate to the `engine` folder.
-2. Configure the vpkg:
+The features below require a nightly build; download links are in the [Development Builds](#development-builds) table under **For Contributors**.
```bash
-cd vcpkg
-./bootstrap-vcpkg.sh
-vcpkg install
+# Multiple quantizations
+cortex-nightly pull llama3.2 # Choose from several quantization options
```
-3. Build the Cortex.cpp inside the `engine/build` folder:
-
```bash
-mkdir build
-cd build
-cmake .. -DCMAKE_TOOLCHAIN_FILE=path_to_vcpkg_folder_in_cortex_repo/vcpkg/scripts/buildsystems/vcpkg.cmake
-make -j4
+# Engine management (nightly)
+cortex-nightly engines install llama-cpp -m
```
-4. Verify that Cortex.cpp is installed correctly by getting help information.
-
```sh
-./cortex -h
+# Hardware control
+cortex-nightly hardware list
+cortex-nightly hardware activate
```
-#### Devcontainer / Codespaces
-
-1. Open Cortex.cpp repository in Codespaces or local devcontainer
-
- [](https://codespaces.new/janhq/cortex.cpp?quickstart=1)
-
- ```sh
- devcontainer up --workspace-folder .
- ```
+## Need Help?
-2. Configure vpkg in `engine/vcpkg`:
+- Quick troubleshooting: `cortex --help`
+- [Documentation](https://cortex.so/docs)
+- [Community Discord](https://discord.gg/FTk2MvZwJH)
+- [Report Issues](https://github.com/janhq/cortex.cpp/issues)
-```bash {"tag": "devcontainer"}
-cd engine/vcpkg
-export VCPKG_FORCE_SYSTEM_BINARIES="$([[ $(uname -m) == 'arm64' ]] && echo '1' || echo '0')"
-./bootstrap-vcpkg.sh
-```
-
-3. Build the Cortex.cpp inside the `engine/build` folder:
+---
-```bash {"tag": "devcontainer"}
-cd engine
-mkdir -p build
-cd build
-cmake .. -DCMAKE_TOOLCHAIN_FILE=$(realpath ..)/vcpkg/scripts/buildsystems/vcpkg.cmake
-make -j$(grep -c ^processor /proc/cpuinfo)
-```
+## For Contributors
-4. Verify that Cortex.cpp is installed correctly by getting help information.
+### Development Builds
-```sh {"tag": "devcontainer"}
-cd engine/build
-./cortex -h
-```
+| Version | Windows | macOS | Linux |
+|-----------|---------|-------|-------|
+| **Stable** | [exe](https://app.cortexcpp.com/download/latest/windows-amd64-network) | [pkg](https://app.cortexcpp.com/download/latest/mac-universal-network) | [deb](https://app.cortexcpp.com/download/latest/linux-amd64-network) |
+| **Beta** | [exe](https://app.cortexcpp.com/download/beta/windows-amd64-network) | [pkg](https://app.cortexcpp.com/download/beta/mac-universal-network) | [deb](https://app.cortexcpp.com/download/beta/linux-amd64-network) |
+| **Nightly** | [exe](https://app.cortexcpp.com/download/nightly/windows-amd64-network) | [pkg](https://app.cortexcpp.com/download/nightly/mac-universal-network) | [deb](https://app.cortexcpp.com/download/nightly/linux-amd64-network) |
-5. Everytime a rebuild is needed, just run the commands above using oneliner
+### Build from Source
-```sh
-npx -y runme run --filename README.md -t devcontainer -y
+```bash
+git clone https://github.com/janhq/cortex.cpp && cd cortex.cpp
+git submodule update --init --recursive
+cd engine/vcpkg && ./bootstrap-vcpkg.sh && ./vcpkg install
+cd .. && mkdir -p build && cd build
+cmake .. -DCMAKE_TOOLCHAIN_FILE=$(realpath ..)/vcpkg/scripts/buildsystems/vcpkg.cmake && make -j4
```
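+
+Verify the build by printing the help information from the `engine/build` folder:
+
+```bash
+./cortex -h
+```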
-## Uninstallation
+## Uninstall Cortex
### Windows
@@ -448,24 +174,18 @@ npx -y runme run --filename README.md -t devcontainer -y
2. Navigate to `Add or Remove Programs`.
3. Search for `cortexcpp` and double click to uninstall. (for beta and nightly builds, search for `cortexcpp-beta` and `cortexcpp-nightly` respectively)
-### MacOs
+### macOS/Linux
Run the uninstaller script:
```bash
-sudo sh cortex-uninstall.sh
+sudo cortex-uninstall.sh
```
-For MacOS, there is a uninstaller script comes with the binary and added to the `/usr/local/bin/` directory. The script is named `cortex-uninstall.sh` for stable builds, `cortex-beta-uninstall.sh` for beta builds and `cortex-nightly-uninstall.sh` for nightly builds.
-
-### Linux
-
-```bash
-sudo apt remove cortexcpp
-```
+The uninstaller script ships with the binary and is installed to the `/usr/local/bin/` directory. It is named `cortex-uninstall.sh` for stable builds, `cortex-beta-uninstall.sh` for beta builds, and `cortex-nightly-uninstall.sh` for nightly builds.
## Contact Support
- For support, please file a [GitHub ticket](https://github.com/janhq/cortex.cpp/issues/new/choose).
- For questions, join our Discord [here](https://discord.gg/FTk2MvZwJH).
-- For long-form inquiries, please email [hello@jan.ai](mailto:hello@jan.ai).
+- For long-form inquiries, please email [hello@jan.ai](mailto:hello@jan.ai).
\ No newline at end of file
diff --git a/assets/cortex-banner.png b/assets/cortex-banner.png
index 95a7262a9..b01cfb89e 100644
Binary files a/assets/cortex-banner.png and b/assets/cortex-banner.png differ
diff --git a/docs/docs/architecture.mdx b/docs/docs/architecture.mdx
index eae6b1d2f..8e9520810 100644
--- a/docs/docs/architecture.mdx
+++ b/docs/docs/architecture.mdx
@@ -5,10 +5,6 @@ slug: "architecture"
draft: true
---
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
## Introduction
Cortex is a C++ AI engine designed to operate entirely on your local hardware infrastructure. This headless backend platform is also engineered to support TensorRT-LLM, ensuring high-performance machine-learning model execution. It is packaged with a Docker-inspired command-line interface and a Typescript client library.
diff --git a/docs/docs/architecture/cortex-db.mdx b/docs/docs/architecture/cortex-db.mdx
index e4ead1f0f..52123da4a 100644
--- a/docs/docs/architecture/cortex-db.mdx
+++ b/docs/docs/architecture/cortex-db.mdx
@@ -7,10 +7,6 @@ slug: "cortex-db"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended behavior
-of Cortex, which may not yet be fully implemented in the codebase.
-:::

**db view via [Harlequin](https://harlequin.sh/)**
diff --git a/docs/docs/architecture/cortexrc.mdx b/docs/docs/architecture/cortexrc.mdx
index d32039bca..a19c23afe 100644
--- a/docs/docs/architecture/cortexrc.mdx
+++ b/docs/docs/architecture/cortexrc.mdx
@@ -7,10 +7,6 @@ slug: "cortexrc"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended behavior of
-Cortex, which may not be fully implemented in the codebase yet.
-:::
Cortex supports using a config-based approach to configuring most of its functionality. During the
installation process, a `.cortexrc` will be generated with some sensible defaults in it. Using this
diff --git a/docs/docs/architecture/data-folder.mdx b/docs/docs/architecture/data-folder.mdx
index 9a78b57de..735b746a2 100644
--- a/docs/docs/architecture/data-folder.mdx
+++ b/docs/docs/architecture/data-folder.mdx
@@ -7,10 +7,6 @@ slug: "data-folder"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the
-intended behavior of Cortex and some functionality may not be fully implemented yet.
-:::
When you install Cortex.cpp, three types of files will be generated on your device:
diff --git a/docs/docs/architecture/updater.mdx b/docs/docs/architecture/updater.mdx
index 7ee280453..f454ecb62 100644
--- a/docs/docs/architecture/updater.mdx
+++ b/docs/docs/architecture/updater.mdx
@@ -7,10 +7,6 @@ slug: "updater"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended behavior of
-Cortex, which may not yet be fully implemented in the codebase.
-:::
This document outlines the architectural design for a C++ updater responsible for downloading and executing
the installers for the CLI and Server binaries.
diff --git a/docs/docs/chat-completions.mdx b/docs/docs/chat-completions.mdx
index c4f40f0d1..3c540f266 100644
--- a/docs/docs/chat-completions.mdx
+++ b/docs/docs/chat-completions.mdx
@@ -6,9 +6,6 @@ description: Chat Completions Feature
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
Cortex's Chat API is compatible with OpenAI’s [Chat Completions](https://platform.openai.com/docs/api-reference/chat) endpoint. It is a drop-in replacement for local inference.
@@ -23,8 +20,8 @@ Cortex routes requests to multiple APIs for remote inference while providing a s
## Usage
### CLI
-```bash
-# Streaming
+```bash
+# Streaming
cortex chat --model mistral
```
### API
diff --git a/docs/docs/configurations/cors.mdx b/docs/docs/configurations/cors.mdx
index 5c070ac49..a2454a23f 100644
--- a/docs/docs/configurations/cors.mdx
+++ b/docs/docs/configurations/cors.mdx
@@ -7,11 +7,6 @@ slug: "cors"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended behavior of
-Cortex, which may not yet be fully implemented in the codebase.
-:::
-
This document describes how to configure Cross-Origin Resource Sharing (CORS) settings for the API server
using the CLI commands and the HTTP API endpoints.
diff --git a/docs/docs/configurations/proxy.mdx b/docs/docs/configurations/proxy.mdx
index ad2b7d890..19b277d80 100644
--- a/docs/docs/configurations/proxy.mdx
+++ b/docs/docs/configurations/proxy.mdx
@@ -7,10 +7,6 @@ slug: "proxy"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended
-behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
# Proxy Configuration Guide
diff --git a/docs/docs/configurations/token.mdx b/docs/docs/configurations/token.mdx
index a4c48f7b7..df494f76a 100644
--- a/docs/docs/configurations/token.mdx
+++ b/docs/docs/configurations/token.mdx
@@ -7,10 +7,6 @@ slug: "token"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended behavior of
-Cortex, which may not yet be fully implemented in the codebase.
-:::
A lot of the models available today can be found on HuggingFace. This page describes how to configure
HuggingFace token settings for Cortex.
diff --git a/docs/docs/cortex-cpp.md b/docs/docs/cortex-cpp.md
index 9612164f1..00fffa313 100644
--- a/docs/docs/cortex-cpp.md
+++ b/docs/docs/cortex-cpp.md
@@ -4,10 +4,6 @@ description: Cortex.cpp Architecture
slug: "cortex-cpp"
---
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
Cortex.cpp is a Local AI engine that is used to run and customize LLMs. Cortex can be deployed as a standalone server, or integrated into apps like [Jan.ai](https://jan.ai/)
Cortex's roadmap is to eventually support full OpenAI API-equivalence.
diff --git a/docs/docs/cortex-llamacpp.mdx b/docs/docs/cortex-llamacpp.mdx
index 82e51a7a8..db2085eb0 100644
--- a/docs/docs/cortex-llamacpp.mdx
+++ b/docs/docs/cortex-llamacpp.mdx
@@ -7,9 +7,6 @@ slug: "cortex-llamacpp"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
:::info
`llamacpp` is formerly called "Nitro".
@@ -67,9 +64,9 @@ The command will check, download, and install these dependencies:
- Cuda 12.2:
- libcublas.so.12
- libcublasLt.so.12
- - libcudart.so.12
+ - libcudart.so.12
- Cuda 12.4:
- - libcublasLt.so.12
+ - libcublasLt.so.12
- libcublas.so.12
```
diff --git a/docs/docs/embeddings.mdx b/docs/docs/embeddings.mdx
index 5cc675ee2..fae8c8a2c 100644
--- a/docs/docs/embeddings.mdx
+++ b/docs/docs/embeddings.mdx
@@ -7,18 +7,15 @@ slug: "embeddings"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
An embedding is a vector that represents a piece of text. The distance between two embeddings indicates how similar the texts are: smaller distances mean more similar texts, larger distances mean less similar texts.
+
:::note
The Cortex Embeddings feature is fully compatible with OpenAI's [Embeddings API](https://platform.openai.com/docs/api-reference/embeddings) endpoints.
:::
## Usage
### CLI
-```bash
+```bash
# Without Flag
cortex embeddings "Hello World"
# With model_id Flag
@@ -84,4 +81,4 @@ For a complete list of models, please visit the [Cortex Hub](https://huggingface
Learn more about Embeddings capabilities:
- [Embeddings API Reference](/api-reference#tag/embeddings/post/embeddings)
- [Embeddings CLI command](/docs/cli/embeddings)
-:::
\ No newline at end of file
+:::
diff --git a/docs/docs/guides/function-calling.md b/docs/docs/guides/function-calling.md
index 8be439c77..387cf9b89 100644
--- a/docs/docs/guides/function-calling.md
+++ b/docs/docs/guides/function-calling.md
@@ -1,35 +1,42 @@
---
-title: Function Calling
+title: OpenAI-Compatible Function Calling
---
-# Function calling with OpenAI compatible
-This tutorial, I use the `mistral-nemo:12b-gguf-q4-km` for testing function calling with cortex.cpp. All steps are reproduced from original openai instruction https://platform.openai.com/docs/guides/function-calling
+# Function Calling with Cortex.cpp
-## Step by step with function calling
+This guide demonstrates how to use Cortex.cpp's function calling capabilities, which are compatible with the OpenAI API specification. We'll use the `llama3.1:8b-gguf-q4-km` model for these examples, following the patterns in the [OpenAI function calling documentation](https://platform.openai.com/docs/guides/function-calling).
-### 1. Start server and run model.
+## Implementation Guide
+
+### 1. Start the Server
+
+First, launch the Cortex server with your chosen model:
```sh
cortex run -d llama3.1:8b-gguf-q4-km
```
-### 2. Create a python script `function_calling.py` with this content:
+### 2. Initialize the Python Client
+
+Create a new Python script named `function_calling.py` and set up the OpenAI client:
```py
from datetime import datetime
from openai import OpenAI
from pydantic import BaseModel
-```
-```py
+import json
+
+MODEL = "llama3.1:8b-gguf-q4-km"
+
client = OpenAI(
base_url="http://localhost:39281/v1",
- api_key="not-needed"
+ api_key="not-needed" # Authentication is not required for local deployment
)
```
-This step creates OpenAI client in python
+### 3. Implement Function Calling
-### 3. Start create a chat completion with tool calling
+Define your function schema and create a chat completion:
```py
tools = [
@@ -53,14 +60,14 @@ tools = [
}
}
]
-```
-```py
+
completion_payload = {
"messages": [
{"role": "system", "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user."},
{"role": "user", "content": "Hi, can you tell me the delivery date for my order?"},
]
}
+
response = client.chat.completions.create(
top_p=0.9,
temperature=0.6,
@@ -68,224 +75,163 @@ response = client.chat.completions.create(
messages=completion_payload["messages"],
tools=tools,
)
-print(response)
```
-Because you didn't provide the `order_id`, the model will ask again
+Since no `order_id` was provided, the model will request it:
-```
+```
+# Example Response
ChatCompletion(
- id='1lblzWtLw9h5HG0GjYYi',
- choices=[
- Choice(
- finish_reason=None,
- index=0,
- logprobs=None,
- message=ChatCompletionMessage(
- content='Of course! Please provide your order ID so I can look it up.',
- refusal=None,
- role='assistant',
- audio=None,
- function_call=None,
- tool_calls=None
- )
- )
- ],
- created=1730204306,
- model='_',
- object='chat.completion',
- service_tier=None,
- system_fingerprint='_',
- usage=CompletionUsage(
- completion_tokens=15,
- prompt_tokens=449,
- total_tokens=464,
- completion_tokens_details=None,
- prompt_tokens_details=None
- )
+    id='1lblzWtLw9h5HG0GjYYi',
+    choices=[
+        Choice(
+            finish_reason=None,
+            index=0,
+            logprobs=None,
+            message=ChatCompletionMessage(
+                content='Of course! Please provide your order ID so I can look it up.',
+                refusal=None,
+                role='assistant',
+                audio=None,
+                function_call=None,
+                tool_calls=None
+            )
+        )
+    ],
+    created=1730204306,
+    model='_',
+    object='chat.completion',
+    service_tier=None,
+    system_fingerprint='_',
+    usage=CompletionUsage(
+        completion_tokens=15,
+        prompt_tokens=449,
+        total_tokens=464,
+        completion_tokens_details=None,
+        prompt_tokens_details=None
+    )
)
```
-### 4. Add new message user provide order id
+### 4. Handle User Input
-```
+Once the user provides their order ID:
+
+```python
completion_payload = {
"messages": [
{"role": "system", "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user."},
{"role": "user", "content": "Hi, can you tell me the delivery date for my order?"},
{"role": "assistant", "content": "Of course! Please provide your order ID so I can look it up."},
- {"role": "user", "content": "i think it is order_12345"},
+        {"role": "user", "content": "i think it is order_12345"},
]
}
response = client.chat.completions.create(
- top_p=0.9,
- temperature=0.6,
- model=MODEL,
+ model="llama3.1:8b-gguf-q4-km",
messages=completion_payload["messages"],
- tools=tools
-)
-```
-
-The response of the model will be
-
-```
-ChatCompletion(
- id='zUnHwEPCambJtrvWOAQy',
- choices=[
- Choice(
- finish_reason='tool_calls',
- index=0,
- logprobs=None,
- message=ChatCompletionMessage(
- content='',
- refusal=None,
- role='assistant',
- audio=None,
- function_call=None,
- tool_calls=[
- ChatCompletionMessageToolCall(
- id=None,
- function=Function(
- arguments='{"order_id": "order_12345"}',
- name='get_delivery_date'
- ),
- type='function'
- )
- ]
- )
- )
- ],
- created=1730204559,
- model='_',
- object='chat.completion',
- service_tier=None,
- system_fingerprint='_',
- usage=CompletionUsage(
- completion_tokens=23,
- prompt_tokens=483,
- total_tokens=506,
- completion_tokens_details=None,
- prompt_tokens_details=None
- )
+ tools=tools,
+ temperature=0.6,
+ top_p=0.9
)
```
-It can return correct function with arguments
+### 5. Process Function Results
-### 5. Push the response to the conversation and ask model to answer user
+Handle the function call response and generate the final answer:
-```
+```python
+# Simulate function execution
order_id = "order_12345"
delivery_date = datetime.now()
-# Simulate the tool call response
-response = {
- "choices": [
- {
- "message": {
- "role": "assistant",
- "tool_calls": [
- {
- "id": "call_62136354",
- "type": "function",
- "function": {
- "arguments": "{'order_id': 'order_12345'}",
- "name": "get_delivery_date"
- }
- }
- ]
- }
- }
- ]
-}
-
-# Create a message containing the result of the function call
function_call_result_message = {
"role": "tool",
"content": json.dumps({
"order_id": order_id,
"delivery_date": delivery_date.strftime('%Y-%m-%d %H:%M:%S')
}),
- "tool_call_id": response['choices'][0]['message']['tool_calls'][0]['id']
-}
-
-# Prepare the chat completion call payload
-completion_payload = {
- "messages": [
- {"role": "system", "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user."},
- {"role": "user", "content": "Hi, can you tell me the delivery date for my order?"},
- {"role": "assistant", "content": "Sure! Could you please provide your order ID so I can look up the delivery date for you?"},
- {"role": "user", "content": "i think it is order_12345"},
- response["choices"][0]["message"],
- function_call_result_message
- ]
+ "tool_call_id": "call_62136354"
}
-client = OpenAI(
- # This is the default and can be omitted
- base_url=ENDPOINT,
- api_key="not-needed"
-)
-
+final_messages = completion_payload["messages"] + [
+ {
+ "role": "assistant",
+ "tool_calls": [{
+ "id": "call_62136354",
+ "type": "function",
+ "function": {
+ "arguments": "{'order_id': 'order_12345'}",
+ "name": "get_delivery_date"
+ }
+ }]
+ },
+ function_call_result_message
+]
+```
+```py
response = client.chat.completions.create(
- top_p=0.9,
- temperature=0.6,
- model=MODEL,
- messages=completion_payload["messages"],
+ model="llama3.1:8b-gguf-q4-km",
+ messages=final_messages,
tools=tools,
+ temperature=0.6,
+ top_p=0.9
)
print(response)
```
-
-The response will include all the content that processed by the function, where the delivery date is produced by query db, ....
-
-```
+The model now answers using the tool result:
+
+```
ChatCompletion(
- id='l1xdCuKVMYBSC5tEDlAn',
- choices=[
- Choice(
- finish_reason=None,
- index=0,
- logprobs=None,
- message=ChatCompletionMessage(
- content="Your order with ID 'order_12345' is scheduled to be delivered on October 29, 2024. Is there anything else I can help you with?",
- refusal=None,
- role='assistant',
- audio=None,
- function_call=None,
- tool_calls=None
- )
- )
- ],
- created=1730205470,
- model='_',
- object='chat.completion',
- service_tier=None,
- system_fingerprint='_',
- usage=CompletionUsage(
- completion_tokens=40,
- prompt_tokens=568,
- total_tokens=608,
- completion_tokens_details=None,
- prompt_tokens_details=None
- )
+ id='UMIoW4aNrqKXW2DR1ksX',
+ choices=[
+ Choice(
+ finish_reason='stop',
+ index=0,
+ logprobs=None,
+ message=ChatCompletionMessage(
+ content='The delivery date for your order (order_12345) is February 3, 2025 at 11:53 AM.',
+ refusal=None,
+ role='assistant',
+ audio=None,
+ function_call=None,
+ tool_calls=None
+ )
+ )
+ ],
+ created=1738544037,
+ model='_',
+ object='chat.completion',
+ service_tier=None,
+ system_fingerprint='_',
+ usage=CompletionUsage(
+ completion_tokens=27,
+ prompt_tokens=535,
+ total_tokens=562,
+ completion_tokens_details=None,
+ prompt_tokens_details=None
+ )
)
```
-## Handling parallel function calling
+## Advanced Features
-Cortex cpp support parallel function calling by default
+### Parallel Function Calls
-```
+Cortex.cpp supports calling multiple functions simultaneously:
+
+```python
tools = [
{
"type": "function",
"function": {
"name": "get_delivery_date",
-
- "strict": True,
- "description": "Get the delivery date for a customer's order. Call this whenever you need to know the delivery date, for example when a customer asks 'Where is my package'",
+ "strict": True,
+ "description": "Get the delivery date for a customer's order.",
"parameters": {
"type": "object",
"properties": {
@@ -303,7 +249,7 @@ tools = [
"type": "function",
"function": {
"name": "get_current_conditions",
- "description": "Get the current weather conditions for a specific location",
+ "description": "Get the current weather conditions for a location",
"parameters": {
"type": "object",
"properties": {
@@ -313,8 +259,7 @@ tools = [
},
"unit": {
"type": "string",
- "enum": ["Celsius", "Fahrenheit"],
- "description": "The temperature unit to use. Infer this from the user's location."
+ "enum": ["Celsius", "Fahrenheit"]
}
},
"required": ["location", "unit"]
@@ -322,127 +267,56 @@ tools = [
}
}
]
-
-messages = [
- {"role": "user", "content": "Hi, can you tell me the delivery date for my order order_12345 and check the weather condition in LA?"}
-]
-response = client.chat.completions.create(
- top_p=0.9,
- temperature=0.6,
- model=MODEL,
- messages= messages,
- tools=tools
-)
-print(response)
```
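+
+With both tools supplied, a single user request can trigger multiple tool calls in one
+response (reusing the `client` and `MODEL` set up earlier in this guide):
+
+```python
+messages = [
+    {"role": "user", "content": "Hi, can you tell me the delivery date for my order order_12345 and check the weather condition in LA?"}
+]
+
+response = client.chat.completions.create(
+    model=MODEL,
+    messages=messages,
+    tools=tools,
+    temperature=0.6,
+    top_p=0.9
+)
+
+# The assistant message contains one tool_calls entry per function the model invoked.
+print(response.choices[0].message.tool_calls)
+```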
-It will call 2 functions in parallel
+### Controlling Function Execution
-```
-ChatCompletion(
- id='5ot3qux399DojubnBFrG',
- choices=[
- Choice(
- finish_reason='tool_calls',
- index=0,
- logprobs=None,
- message=ChatCompletionMessage(
- content='',
- refusal=None,
- role='assistant',
- audio=None,
- function_call=None,
- tool_calls=[
- ChatCompletionMessageToolCall(
- id=None,
- function=Function(
- arguments='{"order_id": "order_12345"}',
- name='get_delivery_date'
- ),
- type='function'
- ),
- ChatCompletionMessageToolCall(
- id=None,
- function=Function(
- arguments='{"location": "LA", "unit": "Fahrenheit"}',
- name='get_current_conditions'
- ),
- type='function'
- )
- ]
- )
- )
- ],
- created=1730205975,
- model='_',
- object='chat.completion',
- service_tier=None,
- system_fingerprint='_',
- usage=CompletionUsage(
- completion_tokens=47,
- prompt_tokens=568,
- total_tokens=615,
- completion_tokens_details=None,
- prompt_tokens_details=None
- )
-)
-```
-
-## Configuring function calling behavior using the tool_choice parameter
-
-User can set `tool_choice=none` to disable function calling even if the tools are provided
+You can control function calling behavior using the `tool_choice` parameter:
-```
+```python
+# Disable function calling
response = client.chat.completions.create(
- top_p=0.9,
- temperature=0.6,
model=MODEL,
- messages= messages, #completion_payload["messages"],
+ messages=messages,
tools=tools,
tool_choice="none"
)
-```
-
-User can also force model to call a tool by specify the tool name, in this example it's the `get_current_conditions`
-```
+# Force specific function
response = client.chat.completions.create(
- top_p=0.9,
- temperature=0.6,
model=MODEL,
- messages= [{"role": "user", "content": "Hi, can you tell me the delivery date for my order order_12345 and check the weather condition in LA?"}],
+ messages=messages,
tools=tools,
- tool_choice= {"type": "function", "function": {"name": "get_current_conditions"}})
-
+ tool_choice={"type": "function", "function": {"name": "get_current_conditions"}}
+)
```
-User can also specify the function with enum field to the tool definition to make model generate more accurate.
+### Enhanced Function Definitions
-```
+Use enums to improve function accuracy:
+
+```json
{
"name": "pick_tshirt_size",
- "description": "Call this if the user specifies which size t-shirt they want",
+ "description": "Handle t-shirt size selection",
"parameters": {
"type": "object",
"properties": {
"size": {
"type": "string",
"enum": ["s", "m", "l"],
- "description": "The size of the t-shirt that the user would like to order"
+ "description": "T-shirt size selection"
}
},
- "required": ["size"],
- "additionalProperties": false
+ "required": ["size"]
}
}
```
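+
+For example, attaching this definition as a tool and forcing it with `tool_choice`
+constrains the model's arguments to the enum values (a sketch, reusing the `client`
+and `MODEL` from the earlier steps):
+
+```python
+response = client.chat.completions.create(
+    model=MODEL,
+    messages=[{"role": "user", "content": "I'd like a medium t-shirt."}],
+    tools=[{"type": "function", "function": {
+        "name": "pick_tshirt_size",
+        "description": "Handle t-shirt size selection",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "size": {"type": "string", "enum": ["s", "m", "l"]}
+            },
+            "required": ["size"]
+        }
+    }}],
+    tool_choice={"type": "function", "function": {"name": "pick_tshirt_size"}},
+)
+# The model should map "medium" onto the enum value "m".
+print(response.choices[0].message.tool_calls)
+```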
-(*) Note that the accuracy of function calling heavily depends on the quality of the model. For small models like 8B or 12B, we should only use function calling with simple cases.
-
- The function calling feature from cortex.cpp is primarily an application of prompt engineering. When tools are specified, we inject a system prompt into the conversation to facilitate this functionality.
-
- Compatibility: This feature works best with models like llama3.1 and its derivatives, such as mistral-nemo or qwen.
-
- Customization: Users have the option to manually update the system prompt to fine-tune it for specific problems or use cases. The detail implementation is in this [PR](https://github.com/janhq/cortex.cpp/pull/1472/files).
+## Important Notes
- The full steps to mimic the function calling feature in Python using openai lib can be found [here](https://github.com/janhq/models/issues/16#issuecomment-2381129322).
+- Function calling accuracy depends on model quality. Smaller models (8B-12B) work best with simple use cases.
+- Cortex.cpp implements function calling through prompt engineering, injecting system prompts when tools are specified.
+- Compatibility is best with llama3.1 and similar models such as mistral-nemo and qwen.
+- System prompts can be customized for specific use cases (see the [implementation details](https://github.com/janhq/cortex.cpp/pull/1472/files)).
+- For complete implementation examples, refer to our [detailed guide](https://github.com/janhq/models/issues/16#issuecomment-2381129322).
diff --git a/docs/docs/installation.mdx b/docs/docs/installation.mdx
index 68de8e0f7..acee4d5d0 100644
--- a/docs/docs/installation.mdx
+++ b/docs/docs/installation.mdx
@@ -8,13 +8,8 @@ import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Admonition from '@theme/Admonition';
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in our codebase.
-:::
-
-## Cortex.cpp Installation
-
### Cortex.cpp offers four installer types
+
- **Network Installers** download a minimal script and require an internet connection to fetch packages during installation.
- **Local Installers** include all necessary packages, enabling offline installation without internet access.
- **Dockerfile** Installers are used to build a Docker image with Cortex ready to go.
@@ -46,7 +41,7 @@ For other versions, please look at [cortex.cpp repo](https://github.com/janhq/co
### OS
- MacOS 12 or later
- Windows 10 or later
-- Linux: Ubuntu 20.04 or later, Debian 11 or later, and any of the latest versions of Arch (for other distributions,
+- Linux: Ubuntu 20.04 or later, Debian 11 or later, and any of the latest versions of Arch (for other distributions,
please use the Dockerfile installer or binary files, we have not tested on other distributions yet.)
### Hardware
diff --git a/docs/docs/installation/docker.mdx b/docs/docs/installation/docker.mdx
index 821298300..ffc485962 100644
--- a/docs/docs/installation/docker.mdx
+++ b/docs/docs/installation/docker.mdx
@@ -7,10 +7,6 @@ import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Admonition from '@theme/Admonition';
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended
-behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
## Getting Started with Cortex on Docker
diff --git a/docs/docs/installation/linux.mdx b/docs/docs/installation/linux.mdx
index 8c4afc076..a45c9cefe 100644
--- a/docs/docs/installation/linux.mdx
+++ b/docs/docs/installation/linux.mdx
@@ -8,18 +8,14 @@ import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Admonition from '@theme/Admonition';
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended
-behavior of Cortex, which may not yet be fully implemented in the codebase yet.
-:::
-## Cortex.cpp Installation
:::info
Before installation, make sure that you have met the [minimum requirements](/docs/installation#minimum-requirements) to run Cortex.
This instruction is for stable releases. For beta and nightly releases, please replace `cortex` with `cortex-beta` and `cortex-nightly`, respectively.
:::
### Prerequisites
+
- OpenMPI
- curl
- jq
@@ -52,7 +48,9 @@ This instruction is for stable releases. For beta and nightly releases, please r
```
### Data Folder
+
By default, Cortex.cpp is installed in the following directory:
+
```
# Binary Location
/usr/bin/cortex
@@ -66,6 +64,7 @@ By default, Cortex.cpp is installed in the following directory:
```
## Uninstall Cortex.cpp
+
```bash
# Stable version
sudo /usr/bin/cortex-uninstall.sh
diff --git a/docs/docs/installation/mac.mdx b/docs/docs/installation/mac.mdx
index 9f3dfef82..b1e8b5e2b 100644
--- a/docs/docs/installation/mac.mdx
+++ b/docs/docs/installation/mac.mdx
@@ -7,11 +7,6 @@ slug: 'mac'
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
-## Cortex.cpp Installation
:::info
Before installation, make sure that you have met the [minimum requirements](/docs/installation#minimum-requirements) to run Cortex.
The instructions below are for stable releases only. For beta and nightly releases, please replace `cortex` with `cortex-beta` and `cortex-nightly`, respectively.
@@ -117,7 +112,7 @@ The script requires sudo permission.
## Update Cortex
-Cortex can be updated in-place without any additional scripts. In addition, cortex will let you know if there is a new version of itself the next
+Cortex can be updated in-place without any additional scripts. In addition, cortex will let you know if there is a new version of itself the next
time you start a server.
:::info
@@ -126,4 +121,4 @@ The script requires sudo permission.
```bash
sudo cortex update
-```
\ No newline at end of file
+```
diff --git a/docs/docs/installation/windows.mdx b/docs/docs/installation/windows.mdx
index a5c2c2d86..f49fe2c78 100644
--- a/docs/docs/installation/windows.mdx
+++ b/docs/docs/installation/windows.mdx
@@ -8,20 +8,15 @@ import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Admonition from '@theme/Admonition';
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended behavior of
-Cortex, which may not yet be fully implemented in the codebase.
-:::
-## Overview
-For Windows, Cortex.cpp can be installed in two ways, by downloading the [windows](#windows) installer or
+For Windows, Cortex.cpp can be installed in two ways, by downloading the [windows](#windows) installer or
via the [Windows Subsystem for Linux (WSL)](#windows-subsystem-linux).
## Windows
### Install Cortex.cpp
:::info
Before installation, make sure that you have met the [minimum requirements](/docs/installation#minimum-requirements) to run Cortex.
-The instructions below are for stable releases only. For beta and nightly releases, please replace `cortex` with `cortex-beta`
+The instructions below are for stable releases only. For beta and nightly releases, please replace `cortex` with `cortex-beta`
and `cortex-nightly`, respectively.
:::
@@ -63,7 +58,7 @@ To uninstall Cortex.cpp:
## Windows Subsystem Linux
:::info
-Windows Subsystem Linux allows running Linux tools and workflows seamlessly alongside Windows applications. For more
+Windows Subsystem Linux allows running Linux tools and workflows seamlessly alongside Windows applications. For more
information, please see this [article](https://learn.microsoft.com/en-us/windows/wsl/faq).
:::
@@ -104,4 +99,4 @@ Follow the [linux installation steps](linux) to install Cortex.cpp on the WSL.
## Update cortex to latest version
```bash
cortex.exe update
-```
\ No newline at end of file
+```
diff --git a/docs/docs/overview.mdx b/docs/docs/overview.mdx
index 0dcabe41f..4a00b55ba 100644
--- a/docs/docs/overview.mdx
+++ b/docs/docs/overview.mdx
@@ -8,18 +8,9 @@ import OAICoverage from "@site/src/components/OAICoverage"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-# Cortex
-
-:::info
-**Real-world Use**: Cortex.cpp powers [Jan](https://jan.ai), our on-device ChatGPT-alternative.
-
-Cortex.cpp is in active development. If you have any questions, please reach out to us on [GitHub](https://github.com/janhq/cortex.cpp/issues/new/choose)
-or [Discord](https://discord.com/invite/FTk2MvZwJH)
-:::
-

-Cortex is a Local AI API Platform that is used to run and customize LLMs.
+Cortex is the open-source brain for robots: vision, speech, language, tabular, and action -- the cloud is optional.
Key Features:
- Straightforward CLI (inspired by Ollama)
diff --git a/docs/docs/quickstart.mdx b/docs/docs/quickstart.mdx
index c8e325a44..c965a342a 100644
--- a/docs/docs/quickstart.mdx
+++ b/docs/docs/quickstart.mdx
@@ -8,12 +8,6 @@ import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::info
-Cortex.cpp is in active development. If you have any questions, please reach out to us:
-- [GitHub](https://github.com/janhq/cortex.cpp/issues/new/choose)
-- [Discord](https://discord.com/invite/FTk2MvZwJH)
-:::
-
## Local Installation
Cortex has a **Local Installer** with all of the required dependencies, so that once you've downloaded it, no internet connection is required during the installation process.
diff --git a/docs/docs/requirements.mdx b/docs/docs/requirements.mdx
index 7c13ab772..fef3915ff 100644
--- a/docs/docs/requirements.mdx
+++ b/docs/docs/requirements.mdx
@@ -7,10 +7,6 @@ import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import Admonition from '@theme/Admonition';
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
To run LLMs on-device or on-premise, Cortex has the following requirements:
## Hardware Requirements
@@ -42,7 +38,7 @@ To run LLMs on-device or on-premise, Cortex has the following requirements:
- 8GB for running up to 3B models (int4).
- 16GB for running up to 7B models (int4).
- 32GB for running up to 13B models (int4).
-
+
We support DDR2 RAM as the minimum requirement but recommend using newer generations of RAM for improved performance.
@@ -50,13 +46,13 @@ To run LLMs on-device or on-premise, Cortex has the following requirements:
- 6GB can load the 3B model (int4) with `ngl` at 120 ~ full speed on CPU/ GPU.
- 8GB can load the 7B model (int4) with `ngl` at 120 ~ full speed on CPU/ GPU.
- 12GB can load the 13B model (int4) with `ngl` at 120 ~ full speed on CPU/ GPU.
-
+
Having at least 6GB VRAM when using NVIDIA, AMD, or Intel Arc GPUs is recommended.
- Having at least 10GB is recommended.
-
+
The app is 1.02 MB, but models are usually 4GB+.
@@ -116,7 +112,7 @@ To run LLMs on-device or on-premise, Cortex has the following requirements:
- [NVIDIA driver](https://www.nvidia.com/Download/index.aspx) version 470.63.01 or higher.
- [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit) version 12.3 or higher.
-
+
CUDA Toolkit dependencies will be installed when you install Cortex.
@@ -137,4 +133,4 @@ To run LLMs on-device or on-premise, Cortex has the following requirements:
-
\ No newline at end of file
+
diff --git a/docs/docs/telemetry.mdx b/docs/docs/telemetry.mdx
index 602449978..33c8abef8 100644
--- a/docs/docs/telemetry.mdx
+++ b/docs/docs/telemetry.mdx
@@ -7,10 +7,6 @@ slug: "telemetry"
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
Cortex collects telemetry data to enhance our product. This data provides detailed insights into your usage, including crash reports for your Cortex or Jan applications. By analyzing this information, we can identify and fix bugs, optimize performance, and improve overall stability and user experience.
:::info
We do not collect any sensitive or personal information.
@@ -22,7 +18,7 @@ cortex telemetry crash
## Dataflow
To understand how our telemetry system operates and how data flows from your hardware into our system, please refer to the [Telemetry architecture](/docs/telemetry-architecture).
## Telemetry Metrics
-The collected telemetry metrics for Cortex are divided into two main categories:
+The collected telemetry metrics for Cortex are divided into two main categories:
- `CrashReportResource`
- `CrashReportPayload`
@@ -115,4 +111,4 @@ This category focuses on metrics related to specific operations within Cortex. I
:::info
Learn more about Telemetry:
- [Telemetry CLI command](/docs/cli/telemetry).
-:::
\ No newline at end of file
+:::
diff --git a/docs/static/img/social-card-old.jpg b/docs/static/img/social-card-old.jpg
new file mode 100644
index 000000000..cad56cc6f
Binary files /dev/null and b/docs/static/img/social-card-old.jpg differ
diff --git a/docs/static/img/social-card.jpg b/docs/static/img/social-card.jpg
index cad56cc6f..d2114111d 100644
Binary files a/docs/static/img/social-card.jpg and b/docs/static/img/social-card.jpg differ