@@ -5,7 +5,7 @@ This document provides step-by-step instructions for setting up the RAG-enhanced
 ## Prerequisites
 
 - Docker and Docker Compose
-- Docker's Model Runner with llama3.2 model
+- Docker model pull permissions for llama3.2 model
 
 ## Setup Steps
 
@@ -17,36 +17,49 @@ This document provides step-by-step instructions for setting up the RAG-enhanced
    docker model pull ai/llama3.2:1B-Q8_0
    ```
 
-2. **Start the Model Runner**
+2. **Start all services including model-runner**
 
-   Run the model using Docker's Model Runner functionality:
+   Start the application with one command:
 
    ```bash
-   docker model run ai/llama3.2:1B-Q8_0 --port 12434 --server
+   docker compose -f docker-compose.rag.yml up -d --build
    ```
 
-   This exposes the model on port 12434.
-
-3. **Start the RAG application**
+3. **Run the model separately**
 
-   In a separate terminal, start the RAG application using:
+   After the containers are up, start the model manually; the model-runner container does not run the model by default:
 
    ```bash
-   docker compose -f docker-compose.rag.yml up -d --build
+   # First, find the container ID
+   docker ps | grep model-runner
+
+   # Then exec into the container and run the model
+   docker exec -it CONTAINER_ID /bin/bash
+
+   # Once inside the container, run:
+   model run --server --port 12434
    ```
 
+   Leave this terminal open to keep the model running.
+
 4. **Access the application**
 
-   Once all containers are running, access the application at:
+   Once all containers are running and the model is active, access the application at:
 
    - Frontend: [http://localhost:3000](http://localhost:3000)
    - Grafana: [http://localhost:3001](http://localhost:3001) (admin/admin)
    - Jaeger: [http://localhost:16686](http://localhost:16686)
 
 ## Troubleshooting
 
-- **Connection issues to Model Runner**: Make sure the model is running with `docker model ls` and that it's accessible on port 12434
+- **Model runner errors**: If the model fails to start inside the container, try running it on your host system instead:
+  ```bash
+  docker model run ai/llama3.2:1B-Q8_0 --port 12434 --server
+  ```
+  Then update `BASE_URL` in `docker-compose.rag.yml` to `http://host.docker.internal:12434/engines/llama.cpp/v1/`
+
 - **Backend connectivity issues**: Check logs with `docker compose -f docker-compose.rag.yml logs backend`
+
 - **Vector database errors**: Verify ChromaDB is running with `docker compose -f docker-compose.rag.yml logs vectordb`
 
 ## Architecture
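The troubleshooting entry in this patch switches `BASE_URL` depending on where the model runs. The choice can be sketched as below; this is a minimal sketch, and the in-network `model-runner` hostname and the `MODEL_ON_HOST` variable are illustrative assumptions — only the `host.docker.internal` URL appears in the patch itself:

```shell
# Pick the OpenAI-compatible base URL for the llama.cpp engine.
# MODEL_ON_HOST=1 -> model runs on the host (troubleshooting fallback);
# otherwise assume the backend reaches the model-runner service by name.
PORT=12434
if [ "${MODEL_ON_HOST:-1}" = "1" ]; then
  HOST="host.docker.internal"
else
  HOST="model-runner"
fi
BASE_URL="http://${HOST}:${PORT}/engines/llama.cpp/v1/"
echo "$BASE_URL"
# prints http://host.docker.internal:12434/engines/llama.cpp/v1/
```

Containers cannot reach a host-run model via `localhost`, which is why the fallback URL uses Docker's special `host.docker.internal` hostname.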