Immich ML Proxy

A proxy service for Immich ML with support for multi-backend routing, type-based task distribution, health monitoring, and comprehensive debugging capabilities.

Features

Multi-backend Support: Configure multiple Immich ML backend servers
Type-based Routing: Automatically route requests to different backends based on type (e.g., clip, facial_recognition, ocr)
Round-robin Load Balancing: Distribute requests across multiple healthy backends for the same type
Health Monitoring: Continuous health checking with automatic failover
Concurrent Processing: Process multiple types in parallel for improved performance
Web Configuration UI: Simple web interface for managing backends and routing with real-time health status
Debug Mode: Comprehensive request/response logging and debugging tools
Request Recording: Capture and inspect incoming and outgoing HTTP requests/responses

API Endpoints

GET /

Returns a simple web page with links to the configuration and debug interfaces.

GET /ping

Checks the health status of all configured backends and verifies that each type has at least one healthy backend.

Behavior:

Checks health of all backends in parallel by calling their /ping endpoint
Updates health status for each backend based on response
Verifies that the default backend is healthy (handles all non-routed types)
Verifies that each type in taskRouting has at least one healthy backend

Response:

Returns "pong" with HTTP 200 if:
- Default backend is healthy
- Every type in taskRouting has at least one healthy backend
Returns HTTP 503 (Service Unavailable) if:
- No backends are configured
- Default backend is unhealthy
- Any type in taskRouting lacks healthy backends

POST /predict

Routes inference requests to appropriate backends based on type. Groups entries by type and processes them concurrently with health-aware round-robin load balancing.

Request Parameters:

entries: JSON string containing task configuration with nested structure
- Format: {"taskName": {"type": config, ...}}
image: Image file (optional, multipart form data)
text: Text content (optional, multipart form data)

Behavior:

Parses entries and groups them by type (not task)
For each type:
- Gets healthy backends for that type from taskRouting
- If no type-specific routing configured, uses default backend
- Uses round-robin to select backend from healthy backends
- If no healthy backends available, falls back to all backends
- Forwards request to selected backend
- Updates backend health status based on response (200 = healthy, other = unhealthy)
Processes all types concurrently for better performance
Merges results from all types and returns them in the original order

Health Status Updates:

Backend marked as healthy: Returns HTTP 200
Backend marked as unhealthy: Returns non-200 status or connection error

Response: JSON object with results from all types

GET /config

Returns the web configuration interface.

GET /api/config

Returns current configuration in JSON format.

GET /api/health

Returns health status of all backends in real-time.

Response:

{
  "backend1": {
    "status": "healthy",
    "lastCheck": 1735278000,
    "error": ""
  },
  "backend2": {
    "status": "unhealthy",
    "lastCheck": 1735278010,
    "error": "connection refused"
  }
}

Status Values:

healthy: Backend is responding correctly
unhealthy: Backend is not responding or returning errors
unknown: Health status not yet checked

POST /api/config

Saves configuration.

Request Body:

{
  "defaultBackend": "backend1",
  "backends": [
    {
      "name": "backend1",
      "url": "http://localhost:3003"
    },
    {
      "name": "backend2",
      "url": "http://localhost:3004"
    }
  ],
  "taskRouting": {
    "facial_recognition": "backend1",
    "search": "backend2"
  }
}

GET /debug

Returns the debug monitoring interface.

GET /api/debug/status

Returns current debug status.

Response:

{
  "enabled": true,
  "maxRecords": 100,
  "recordCount": 42
}

POST /api/debug/toggle

Enables or disables debug mode.

Request Body:

{
  "enabled": true
}

POST /api/debug/max-records

Sets the maximum number of debug records to keep (1-10000).

Request Body:

{
  "maxRecords": 500
}

GET /api/debug/records

Returns all debug records (incoming and outgoing HTTP requests/responses).

DELETE /api/debug/records

Clears all debug records.

Configuration

Configuration is saved in config.json:

{
  "defaultBackend": "backend1",
  "backends": [
    {
      "name": "backend1",
      "url": "http://localhost:3003"
    },
    {
      "name": "backend2",
      "url": "http://localhost:3004"
    }
  ],
  "taskRouting": {
    "clip": "backend2",
    "facial_recognition": "backend1"
  }
}

Configuration Fields:

defaultBackend: Name of the backend that handles all types not in taskRouting
backends: List of backend servers with name and URL
taskRouting: Maps type names to backend names (e.g., clip → backend2)

Type Routing:

Types defined in taskRouting are routed to their specific backends
All other types are routed to the defaultBackend
Health checks verify that both default backend and routed types have healthy backends

Running

# Install dependencies
go mod download

# Run the service (production mode)
go run main.go

# Run the service with debug mode enabled
go run main.go --debug

The service listens on port :3004 by default.

Usage Example

Basic Setup

Start the service:
```
go run main.go
```
Visit http://localhost:3004/config to configure backends
Add backend servers and configure task routing
Save configuration

Making Predictions

Send a POST request to http://localhost:3004/predict with multipart form data:

# Request for facial_recognition type (routed to backend1)
curl -X POST http://localhost:3004/predict \
  -F "entries={\"facial_recognition\": {\"image\": {}}}" \
  -F "[email protected]"

# Request for clip type (routed to backend2)
curl -X POST http://localhost:3004/predict \
  -F "entries={\"clip\": {\"image\": {}}}" \
  -F "[email protected]"

# Request for unknown type (routed to defaultBackend)
curl -X POST http://localhost:3004/predict \
  -F "entries={\"ocr\": {\"image\": {}}}" \
  -F "[email protected]"

Health Monitoring

Check overall health:
```
curl http://localhost:3004/ping
```
- Returns "pong" if all types have healthy backends
- Returns 503 if any backend is unhealthy
View individual backend health status:
```
curl http://localhost:3004/api/health
```
- Returns health status for all backends with timestamps and error details
Monitor health in real-time:
- Visit http://localhost:3004/config to see live health status for each backend
- Health status refreshes every 5 seconds automatically

Debugging

Enable debug mode:
- Visit http://localhost:3004/debug and click "Enable Debug"
- Or use the API: POST /api/debug/toggle with {"enabled": true}
Make some requests
View recorded requests/responses at http://localhost:3004/debug
Clear records when needed: DELETE /api/debug/records

Project Structure

immich_ml_proxy/
├── main.go              # Main entry point
├── config/
│   └── config.go        # Configuration management (singleton pattern)
├── proxy/
│   └── proxy.go         # Proxy logic and request forwarding
├── handlers/
│   ├── handlers.go      # Main HTTP handlers
│   └── debug.go         # Debug-related handlers
├── debug/
│   └── debug.go         # Debug manager for request/response recording
└── static/
    ├── config.html      # Web configuration interface
    └── debug.html       # Debug monitoring interface

Architecture

Configuration: Thread-safe singleton configuration manager with file persistence and health status tracking
Proxy: Handles request parsing, type-based grouping, round-robin load balancing, and concurrent forwarding to backends
Health Monitoring: Continuous health checking with automatic status updates and failover logic
Handlers: HTTP endpoint handlers for configuration, prediction, health monitoring, and debugging
Debug: Comprehensive request/response recording with configurable retention
Middleware: Debug middleware that captures all HTTP traffic when enabled

Routing Logic:

Parse request entries and group by type
For each type, look up routing in taskRouting
If no routing found, use defaultBackend
Select backend using round-robin from healthy backends
If no healthy backends, fall back to all backends
Forward request and update health status based on response

Health Check Logic:

Check all backends in parallel via /ping endpoint
Verify defaultBackend is healthy (required for non-routed types)
Verify each type in taskRouting has at least one healthy backend
Return healthy only if all conditions are met

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Immich ML Proxy

Features

API Endpoints

GET /

GET /ping

POST /predict

GET /config

GET /api/config

GET /api/health

POST /api/config

GET /debug

GET /api/debug/status

POST /api/debug/toggle

POST /api/debug/max-records

GET /api/debug/records

DELETE /api/debug/records

Configuration

Running

Usage Example

Basic Setup

Making Predictions

Health Monitoring

Debugging

Project Structure

Architecture

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
config		config
debug		debug
handlers		handlers
proxy		proxy
static		static
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
go.mod		go.mod
go.sum		go.sum
main.go		main.go

yuhuan417/immich_ml_proxy

Folders and files

Latest commit

History

Repository files navigation

Immich ML Proxy

Features

API Endpoints

GET /

GET /ping

POST /predict

GET /config

GET /api/config

GET /api/health

POST /api/config

GET /debug

GET /api/debug/status

POST /api/debug/toggle

POST /api/debug/max-records

GET /api/debug/records

DELETE /api/debug/records

Configuration

Running

Usage Example

Basic Setup

Making Predictions

Health Monitoring

Debugging

Project Structure

Architecture

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages