Know exactly which LLM providers are up, which are fastest, and which are degrading — before your users notice.
Modern AI architectures use dozens of LLM providers across services — OpenAI, Anthropic, Bedrock, Vertex, local Ollama, custom endpoints — each with different availability, latency, and throughput characteristics. When providers fail or slow down, you find out from support tickets, not monitoring dashboards. Existing tools are either SaaS-only (expensive, locked-in), infrastructure-focused (can't probe LLM APIs), or require complex instrumentation (changes your code).
| Aspect | Datadog / Langfuse | Prometheus | LLM Overwatch | ArgusLM |
|---|---|---|---|---|
| Deployment | SaaS-only | Self-hosted | SaaS-only | Self-hosted |
| Local Models | ❌ No | ❌ No | ❌ No | ✅ Ollama, LM Studio, local APIs |
| Probing vs Tracing | Tracing only | Infrastructure only | Probing only | Synthetic probing |
| Metrics | Request-level | Node-level | Response time | TTFT, TPS, latency, uptime |
| Pricing | $$$$ | Free | $$$ | ✅ Free & Open-Source |
| Extensible | Limited | Limited | No | ✅ Full Python SDK + HTTP API |
What makes ArgusLM unique: The only open-source tool that actively probes any LLM provider (including local Ollama/LM Studio) for real uptime, Time to First Token (TTFT), Tokens per Second (TPS), and latency — with a unified Python SDK for custom automation.
ArgusLM is for you if:
- You're building production AI systems — Monitor uptime and performance of multiple LLM providers in real-time, detect degradations before users do.
- You run self-hosted LLM deployments — Track local Ollama/LM Studio availability and response metrics alongside cloud providers in one dashboard.
- You provide LLM-based services — Know exactly which provider to route traffic to based on real performance data, not assumptions or marketing claims.
- You need automated benchmarking — Run scheduled comparisons between models (GPT-4 vs Claude vs local Llama) to optimize costs and quality.
- You must keep costs private — Self-hosted, no SaaS lock-in, full control over your observability data.
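
As a rough illustration of the routing use case in the list above, the sketch below picks a model from recent probe results. The field names (`model_name`, `status`, `ttft_ms`) follow the ones printed in the SDK example; the `"up"`/`"down"` status strings, the sample data, and the selection rule are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import median
from typing import Optional

# Hypothetical probe records: field names follow the SDK example,
# but the values and the "up"/"down" status strings are made up.
checks = [
    {"model_name": "gpt-4", "status": "up", "ttft_ms": 420.0},
    {"model_name": "gpt-4", "status": "up", "ttft_ms": 380.0},
    {"model_name": "claude", "status": "up", "ttft_ms": 310.0},
    {"model_name": "claude", "status": "down", "ttft_ms": None},
    {"model_name": "local-llama", "status": "up", "ttft_ms": 150.0},
    {"model_name": "local-llama", "status": "up", "ttft_ms": 170.0},
]


def pick_model(records, min_uptime: float = 0.9) -> Optional[str]:
    """Return the model with the lowest median TTFT among those meeting min_uptime."""
    by_model = defaultdict(list)
    for r in records:
        by_model[r["model_name"]].append(r)

    best_name, best_ttft = None, float("inf")
    for name, rows in by_model.items():
        ok = [r["ttft_ms"] for r in rows if r["status"] == "up"]
        uptime = len(ok) / len(rows)
        if uptime < min_uptime or not ok:
            continue  # too unreliable to route traffic to
        ttft = median(ok)
        if ttft < best_ttft:
            best_name, best_ttft = name, ttft
    return best_name


print(pick_model(checks))
```

A real router would likely weigh cost and output quality as well; this only shows how uptime and TTFT data can drive the decision.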
Deploy ArgusLM in under a minute:
git clone https://github.com/bluet/arguslm.git && cd arguslm
cp .env.example .env
# Generate secrets (requires cryptography package, or use the Docker one-liner in .env.example)
python3 scripts/generate-secrets.py >> .env
docker compose up -d

Dashboard: http://localhost:3000
API Documentation: http://localhost:8000/docs
| Category | Capabilities |
|---|---|
| Monitoring | Automated uptime checks, real-time status tracking, and configurable availability intervals. |
| Benchmarking | Parallel multi-model testing with deep metrics for TTFT, TPS, and total latency. |
| Visualization | Live performance charts, historical trends, and side-by-side model comparisons. |
| Alerting | Proactive downtime detection and performance degradation notifications. |
| Integration | 90+ providers via LiteLLM (16 tested, all others auto-discovered from LiteLLM catalog). |
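
One simple way to think about performance-degradation alerting is to compare the most recent probes against a longer baseline. The following is a minimal sketch under that assumption; the window size, the 1.5x threshold, and the function name are hypothetical and do not describe ArgusLM's actual alert engine.

```python
from statistics import median


def is_degraded(latencies_ms, recent: int = 5, factor: float = 1.5) -> bool:
    """Flag degradation when the recent median exceeds the baseline median by `factor`."""
    if len(latencies_ms) < 2 * recent:
        return False  # not enough history to judge
    baseline = median(latencies_ms[:-recent])   # everything before the recent window
    current = median(latencies_ms[-recent:])    # the last `recent` probes
    return current > factor * baseline


# Illustrative latency history in milliseconds (made-up values).
history = [200, 210, 190, 205, 195, 200, 400, 420, 390, 410, 405]
print(is_degraded(history))
```

Medians are used instead of means so that a single slow probe does not trigger an alert on its own.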
ArgusLM is built for scale and reliability, leveraging a modern asynchronous stack.
┌─────────────────────────────────────────────────────────────────┐
│                             ArgusLM                             │
├─────────────────────────────────────────────────────────────────┤
│  Frontend (React + Vite)           Backend (FastAPI)            │
│  ┌─────────────────────┐           ┌──────────────────────┐     │
│  │ Dashboard           │◄─────────►│ REST API + WebSocket │     │
│  │ Benchmarks          │           │ Background Scheduler │     │
│  │ Monitoring          │           │ Alert Engine         │     │
│  │ Providers           │           └──────────┬───────────┘     │
│  └─────────────────────┘                      │                 │
│                                               ▼                 │
│                                ┌─────────────────────────────┐  │
│                                │  LiteLLM Abstraction Layer  │  │
│                                └──────────────┬──────────────┘  │
│                                               │                 │
└───────────────────────────────────────────────┼─────────────────┘
                                                ▼
        ┌──────────────────────────────────────────────────┐
        │                  LLM Providers                   │
        │  OpenAI │ Anthropic │ Bedrock │ Vertex │ Azure   │
        │  Ollama │ LM Studio │ xAI │ DeepSeek │ 90+       │
        └──────────────────────────────────────────────────┘
# Trigger a manual monitoring run
curl -X POST http://localhost:8000/api/v1/monitoring/run
# Get current monitoring configuration
curl http://localhost:8000/api/v1/monitoring/config
# Get uptime history for all providers (last 100 checks)
curl "http://localhost:8000/api/v1/monitoring/uptime?limit=100"

# Start benchmark for specific models
curl -X POST http://localhost:8000/api/v1/benchmarks \
-H "Content-Type: application/json" \
-d '{
"model_ids": ["uuid-1", "uuid-2"],
"prompt_pack": "health_check",
"max_tokens": 100,
"num_runs": 5
}'
# List all benchmarks
curl http://localhost:8000/api/v1/benchmarks
# Get results for specific benchmark run
curl http://localhost:8000/api/v1/benchmarks/{run_id}/results

pip install arguslm

from arguslm import ArgusLMClient
from arguslm.schemas import BenchmarkCreate

with ArgusLMClient(base_url="http://localhost:8000") as client:
    # Check provider uptime
    uptime = client.get_uptime_history(limit=10)
    for check in uptime.items:
        print(f"{check.model_name}: {check.status} ({check.ttft_ms}ms TTFT)")

    # Run a benchmark
    benchmark = client.start_benchmark(BenchmarkCreate(
        model_ids=["uuid-1", "uuid-2"],
        prompt_pack="shakespeare",
        num_runs=3,
    ))
    print(f"Benchmark started: {benchmark.id}")

Async support:
import asyncio

from arguslm import AsyncArgusLMClient


async def main():
    # `async with` is only valid inside a coroutine
    async with AsyncArgusLMClient() as client:
        config = await client.get_monitoring_config()
        providers = await client.list_providers()


asyncio.run(main())

ArgusLM tracks the metrics that define real-world LLM performance:
- Time to First Token (TTFT): Measure user-perceived responsiveness and cold-start latency.
- Tokens per Second (TPS): Evaluate sustained streaming throughput independent of initial latency.
- End-to-End Latency: Track total request duration for non-streaming workloads.
- Availability: Monitor uptime and reliability trends with granular failure analysis.
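
To make these definitions concrete, here is a minimal sketch of how the three timing metrics could be derived from the timestamps of one streamed response. The `StreamTiming` record, the `summarize` function, and the choice to measure TPS only over the window after the first token are illustrative assumptions, not ArgusLM's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class StreamTiming:
    """Timestamps (in seconds) recorded while consuming one streamed response."""
    request_sent: float   # when the request left the client
    first_token: float    # when the first token arrived
    last_token: float     # when the final token arrived
    output_tokens: int    # number of tokens received


def summarize(t: StreamTiming) -> dict:
    """Derive TTFT, TPS and end-to-end latency from raw timestamps."""
    ttft_ms = (t.first_token - t.request_sent) * 1000.0
    latency_ms = (t.last_token - t.request_sent) * 1000.0
    # Assumption: throughput is measured over the generation window only,
    # so a slow first token does not drag the TPS figure down.
    generation_s = t.last_token - t.first_token
    if generation_s > 0 and t.output_tokens > 1:
        tps = (t.output_tokens - 1) / generation_s
    else:
        tps = 0.0  # a single token (or zero duration) has no meaningful rate
    return {"ttft_ms": ttft_ms, "tps": tps, "latency_ms": latency_ms}


# Made-up timings: first token after 250 ms, 100 further tokens over 2 s.
timing = StreamTiming(request_sent=0.0, first_token=0.25, last_token=2.25, output_tokens=101)
metrics = summarize(timing)
print(metrics)
```

Separating the first-token wait from the generation window is what lets TTFT and TPS be read independently, as the list above describes.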
Dashboard Screenshots
Real-time tracking of latency and throughput trends across all configured providers.
Side-by-side performance comparison to identify the most efficient models for your workload.
Configure granular monitoring intervals and thresholds for each provider.
Execute standardized benchmark suites to validate provider performance under load.
| Variable | Description | Default |
|---|---|---|
| DATABASE_URL | PostgreSQL connection string | postgresql+asyncpg://... |
| SECRET_KEY | Session encryption key | required |
| ENCRYPTION_KEY | Credential encryption (Fernet) | required |
Detailed setup instructions are available in the Configuration Guide.
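
If you would rather not install the `cryptography` package just to create keys, the two required values can be produced with the Python standard library. This is a sketch, not the repository's `scripts/generate-secrets.py`: it relies on the fact that a Fernet key is 32 random bytes in URL-safe base64, and it assumes that any sufficiently long random string is acceptable as `SECRET_KEY`.

```python
import base64
import os
import secrets


def make_fernet_key() -> str:
    """Return a Fernet-compatible key: 32 random bytes, URL-safe base64 encoded."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


def make_secret_key(nbytes: int = 32) -> str:
    """Return a random URL-safe string (assumed suitable for SECRET_KEY)."""
    return secrets.token_urlsafe(nbytes)


# Lines in the KEY=value form used by .env files.
env_lines = [
    f"SECRET_KEY={make_secret_key()}",
    f"ENCRYPTION_KEY={make_fernet_key()}",
]
print("\n".join(env_lines))
```

The printed lines can be appended to `.env` in the same way as the output of the bundled script.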
pip install -e ".[server]"
alembic upgrade head
uvicorn arguslm.server.main:app --reload

cd frontend
npm install
npm run dev

| Layer | Technology |
|---|---|
| Backend | FastAPI, Python 3.11+, SQLAlchemy, Alembic |
| Frontend | React 18, TypeScript, Vite, Tailwind CSS, Recharts |
| Database | PostgreSQL (Production) / SQLite (Development) |
| Abstraction | LiteLLM |
# SDK only (lightweight — for querying an ArgusLM instance)
pip install arguslm
# Full server (for self-hosted deployment without Docker)
pip install arguslm[server]

Breaking change for docker compose users. v0.3.1 changed the
postgres volume mount path from /var/lib/postgresql/data to
/var/lib/postgresql to match Postgres 18's new layout. If you have an
existing postgres_data volume populated by v0.3.0, postgres 18 will
either refuse to start or initialize a fresh empty cluster — silent data
loss is possible.
Option A — preserve data (recommended):
# 1. With v0.3.0 still running, dump everything
docker compose exec db pg_dumpall -U arguslm > arguslm-backup.sql
# 2. Stop and remove the old volume
docker compose down -v
# 3. Pull v0.3.1+ and start fresh
git pull && docker compose pull && docker compose up -d
# 4. Wait for db healthcheck, then restore
docker compose exec -T db psql -U arguslm < arguslm-backup.sql

Option B — fresh install (loses all data):
docker compose down -v
git pull && docker compose pull && docker compose up -d

Option C — stay on Postgres 17 (simplest if you don't need v18):
Edit docker-compose.yml and pin image: postgres:17-alpine and revert
the volume to postgres_data:/var/lib/postgresql/data. No data migration
needed.
- Architecture Overview
- Python SDK Guide
- REST API Reference
- Configuration Guide
- Troubleshooting
- Comparison with Alternatives
- Interactive API Docs (Swagger UI, available when server is running)
We welcome contributions from the community. Please review our Contributing Guidelines before submitting a Pull Request.
Matthew (BlueT) Lien
ArgusLM is released under the Apache License 2.0.
Named after Argus Panoptes, the all-seeing giant of Greek mythology.
