feat: auto downloading model given in modelfile #16

madclaws · 2025-12-18T23:36:14Z

No description provided.

coderabbitai · 2025-12-18T23:36:22Z

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Walkthrough

This pull request introduces a model download feature across the Python API server and Rust client. It adds a /download endpoint to the API server that accepts model names, implements a new hf_downloader module to orchestrate HuggingFace model downloads, and introduces a separate worker process (throttled_download_worker.py) that performs bandwidth-throttled downloads with adaptive rate limiting. The Rust client's load_model function is modified to detect missing models (404 responses) and trigger downloads via the new endpoint. The huggingface-hub library dependency is added to pyproject.toml.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant API Server
    participant hf_downloader
    participant Worker Process
    participant HuggingFace Hub
    
    Client->>API Server: POST /download {model}
    activate API Server
    API Server->>hf_downloader: pull_model(model_spec)
    activate hf_downloader
    
    alt Model in cache
        hf_downloader->>hf_downloader: Return cached path
    else Model not cached
        hf_downloader->>hf_downloader: Validate model spec & prompt if needed
        hf_downloader->>Worker Process: Spawn with throttle config (via subprocess)
        activate Worker Process
        
        Worker Process->>Worker Process: Monkeypatch requests.get/post
        Worker Process->>HuggingFace Hub: snapshot_download (with throttled requests)
        rect rgba(100, 200, 150, 0.3)
            Note over Worker Process,HuggingFace Hub: Adaptive throttling per file type & size
        end
        
        alt Worker succeeds (exit 0)
            Worker Process->>hf_downloader: Return success
        else Worker fails
            rect rgba(200, 100, 100, 0.3)
                Worker Process->>hf_downloader: Non-zero exit code
            end
            hf_downloader->>HuggingFace Hub: Fallback: Direct throttled snapshot_download
            HuggingFace Hub->>hf_downloader: Model cache
        end
        deactivate Worker Process
    end
    
    hf_downloader->>API Server: Return model path / raise HTTP error
    deactivate hf_downloader
    API Server->>Client: 200 OK {success message} or error
    deactivate API Server

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring extra attention:

server/throttled_download_worker.py — Complex adaptive throttling logic with signal handling, monkeypatching of requests.get/post, and multiple error paths (HTTP 401/403/404, connection errors, OS-level errno handling). Verify correctness of delay calculations, signal handler safety, and cleanup semantics.
server/hf_downloader.py — Subprocess lifecycle management, fallback strategy logic, and extensive error handling chain. Verify that temporary JSON config file cleanup is guaranteed even on exceptions, subprocess niceness settings work as intended, and the fallback to direct download preserves error context correctly.
Monkeypatching implications — The global patching of requests library methods in the worker process could have side effects if other code relies on standard request behavior. Verify isolation and test interoperability.
src/runner/mlx.rs — Integration point between Rust client and Python API. Verify the POST to /download is resilient to network failures and that 404 detection does not mask other HTTP errors.
Environment variable configuration — Both hf_downloader.py and throttled_download_worker.py read environment variables for download parameters (threads, chunk size, transfer mode). Verify consistent defaults and expected ranges.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/model-auto-download

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fab93ce and 0a3f624.

⛔ Files ignored due to path filters (1)

server/uv.lock is excluded by !**/*.lock

📒 Files selected for processing (5)

server/api.py (4 hunks)
server/hf_downloader.py (1 hunks)
server/pyproject.toml (1 hunks)
server/throttled_download_worker.py (1 hunks)
src/runner/mlx.rs (2 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

feat: auto downloading model given in modelfile

0a3f624

madclaws merged commit 2e913ad into main Dec 18, 2025
0 of 2 checks passed

madclaws deleted the feat/model-auto-download branch December 18, 2025 23:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: auto downloading model given in modelfile #16

feat: auto downloading model given in modelfile #16

Uh oh!

madclaws commented Dec 18, 2025

Uh oh!

coderabbitai bot commented Dec 18, 2025 •

edited

Loading

Review failed

Walkthrough

Sequence Diagram(s)

Estimated code review effort

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat: auto downloading model given in modelfile #16

feat: auto downloading model given in modelfile #16

Uh oh!

Conversation

madclaws commented Dec 18, 2025

Uh oh!

coderabbitai bot commented Dec 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Sequence Diagram(s)

Estimated code review effort

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai bot commented Dec 18, 2025 •

edited

Loading