Immich ML Proxy: A Practical Tool for Immich ML Task Routing & Debugging #24866

Replies: 1 comment
This looks like a great tool for anyone who wants to split logic by task type! I've been working on a similar challenge but took a different route, using standard high-performance networking via Nginx to build a "hybrid-cloud" NPU farm. My setup offloads Immich ML tasks from an Immich server running in a Kubernetes cluster on Hetzner Cloud to a local cluster of Rock Pi 5 SBCs, using their 6-TOPS NPUs (RKNN). The goal was a "self-healing" cluster built on standard L7 load balancing and strict SSL verification.

Here is the Nginx configuration I use to handle the distribution, including retries of non-idempotent requests (crucial so that a POST request/image doesn't fail when one SBC goes down):

```nginx
user nginx;
worker_processes auto;

# Send errors to stderr
error_log /dev/stderr warn;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    # ML performance tuning
    client_max_body_size 1G;        # Allow large video frames/images
    proxy_request_buffering on;     # Keep a copy of the image to retry if a node fails
    client_body_buffer_size 64M;    # Keep images up to 64 MB in RAM for speed

    # Custom log format to track which SBC handles the request
    log_format upstream_debug '$remote_addr - $remote_user [$time_local] '
                              '"$request" $status $body_bytes_sent '
                              'backend: $upstream_addr '      # SBC IP/hostname
                              'status: $upstream_status '     # HTTP status
                              'rt: $upstream_response_time';  # Processing time

    # Direct access logs to stdout
    access_log /dev/stdout upstream_debug;

    upstream immich_ml_backends {
        # Send tasks to the node with the fewest active ML jobs
        least_conn;

        # Failover: aggressive removal (1 error = 60s cooldown) to keep the pipeline moving
        server immich-ml.rock1.internal:443 max_fails=1 fail_timeout=60s;
        server immich-ml.rock2.internal:443 max_fails=1 fail_timeout=60s;
        server immich-ml.rock3.internal:443 max_fails=1 fail_timeout=60s;
        server immich-ml.rock4.internal:443 max_fails=1 fail_timeout=60s;
    }

    server {
        listen 3003;

        location / {
            proxy_pass https://immich_ml_backends;

            # TLS and SSL settings
            proxy_ssl_verify on;
            proxy_ssl_trusted_certificate /etc/nginx/certs/ca.crt;
            proxy_ssl_verify_depth 2;

            # Validate all nodes against one shared "cluster name"
            proxy_ssl_name "immich-ml.internal";
            proxy_ssl_server_name on;

            # Ensure the backend Nginx/app recognizes the request
            proxy_set_header Host "immich-ml.internal";
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Failover logic: retry POSTs (non_idempotent) on the next node
            proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504 non_idempotent;
            proxy_next_upstream_tries 3;

            # Timeouts
            proxy_connect_timeout 5s;
            proxy_read_timeout 60s;
        }
    }
}
```

Key benefits of this approach:
It's cool to see different ways to solve the scaling problem, whether through custom proxies or hardened infrastructure!
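For anyone unfamiliar with how the `max_fails=1`/`fail_timeout=60s` and `least_conn` settings in the config above interact, here is a minimal Python sketch of those semantics. This is an illustrative model, not Nginx's actual implementation: a failed node goes on cooldown, and among available nodes the one with the fewest in-flight jobs is picked.

```python
class Backend:
    def __init__(self, name, max_fails=1, fail_timeout=60.0):
        self.name = name
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.fails = 0
        self.down_until = 0.0  # timestamp until which the node is skipped
        self.active = 0        # in-flight requests, used by least_conn

    def available(self, now):
        return now >= self.down_until

    def mark_failure(self, now):
        self.fails += 1
        if self.fails >= self.max_fails:
            # Put the node on cooldown, like fail_timeout=60s
            self.down_until = now + self.fail_timeout
            self.fails = 0

def pick_least_conn(backends, now):
    """Return the available backend with the fewest active jobs, or None."""
    up = [b for b in backends if b.available(now)]
    return min(up, key=lambda b: b.active) if up else None
```

With `max_fails=1`, a single error is enough to sideline a node for the full cooldown window, which is what keeps a dead SBC from repeatedly eating POSTed images.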
I'd like to share a small tool I built while debugging Immich's machine learning features: Immich ML Proxy. https://github.com/yuhuan417/immich_ml_proxy

It's a lightweight proxy service tailored for Immich ML, ideal for scenarios like distributing different model tasks to separate backends (reducing single-server memory load) and round-robining across multiple backends to boost throughput. Its core capability is routing specific task types (e.g., clip, facial_recognition, ocr) to configured backend servers. It also provides comprehensive debugging features, capturing detailed HTTP request and response data for inspection to simplify workflow troubleshooting.
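For readers curious how task-based routing with per-pool round-robin works conceptually, here is a hypothetical Python sketch. The `TASK_ROUTES` table, hostnames, and `pick_backend` function are illustrative only and not the tool's actual configuration or API; see the repository's README for the real setup.

```python
import itertools

# Hypothetical routing table: task type -> pool of backend URLs.
TASK_ROUTES = {
    "clip": ["http://ml-gpu-1:3003", "http://ml-gpu-2:3003"],
    "facial_recognition": ["http://ml-face:3003"],
    "ocr": ["http://ml-ocr:3003"],
}

# One round-robin cycle per task type, so load spreads within each pool.
_cycles = {task: itertools.cycle(pool) for task, pool in TASK_ROUTES.items()}

def pick_backend(task: str) -> str:
    """Return the next backend URL for the given task type."""
    if task not in _cycles:
        raise KeyError(f"no route configured for task {task!r}")
    return next(_cycles[task])
```

The nice property of splitting by task type is that each backend only ever loads the models for its tasks, which is exactly the memory-load reduction described above.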