Minimum System Requirements for NVIDIA RAG Blueprint

This document lists the minimum system requirements for the NVIDIA RAG Blueprint.

:::{important} You can deploy the RAG Blueprint with Docker, Helm, or NIM Operator, and target dedicated hardware or a Kubernetes cluster. Some requirements differ depending on your target system and deployment method. :::

Disk Space Requirements

:::{important} Ensure that you have at least 200GB of available disk space before you deploy the RAG Blueprint. This space is required for the following:

  • NIM model downloads and caching (largest component, ~100-150GB)
  • Container images (~20-30GB)
  • Vector database data and indices
  • Application logs and temporary files

Insufficient disk space causes deployment failures during model downloads or runtime operations. :::
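
As a pre-deployment check, the 200GB threshold above can be verified with a short script. This is a minimal sketch, not part of the Blueprint: the function names are hypothetical, and the path to check is an assumption, so point it at the filesystem that actually holds your model cache and container images.

```python
import shutil

REQUIRED_GB = 200  # minimum free space stated in this section


def free_space_gb(path: str = "/") -> float:
    """Return free space on the filesystem containing `path`, in GB (10^9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(free_gb: float, required_gb: float = REQUIRED_GB) -> bool:
    """True when the free space meets or exceeds the requirement."""
    return free_gb >= required_gb


if __name__ == "__main__":
    # "/" is an assumption; use the mount point that stores models and images.
    free = free_space_gb("/")
    status = "OK" if has_enough_space(free) else "INSUFFICIENT"
    print(f"{free:.1f} GB free, {REQUIRED_GB} GB required: {status}")
```

If models, container images, and the vector database live on different filesystems, run the check once per mount point rather than relying on a single total.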

Operating System

The RAG Blueprint requires the following operating system:

  • Ubuntu 22.04
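
The operating system requirement can be checked by reading `/etc/os-release`, the standard identification file on Linux distributions. The following is a sketch under that assumption; the helper names are hypothetical and not part of the Blueprint.

```python
from pathlib import Path
from typing import Dict


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines from os-release content, stripping optional quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


def is_supported_os(fields: Dict[str, str]) -> bool:
    """True only for Ubuntu 22.04, the release listed above."""
    return fields.get("ID") == "ubuntu" and fields.get("VERSION_ID") == "22.04"


if __name__ == "__main__":
    os_release = Path("/etc/os-release")
    if os_release.exists():
        info = parse_os_release(os_release.read_text())
        print("supported" if is_supported_os(info) else "unsupported",
              info.get("PRETTY_NAME", "unknown OS"))
    else:
        print("/etc/os-release not found; cannot determine the OS")
```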

Driver Versions

The RAG Blueprint requires the following driver and CUDA versions:

  • GPU driver: 560 or later
  • CUDA: 12.9 or later

For details, see NVIDIA NIM for LLMs Software.
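
The version minimums above can be compared against what `nvidia-smi` reports. The sketch below assumes `nvidia-smi` is on the `PATH` and that its default output contains a `CUDA Version:` field, which is the highest CUDA version the installed driver supports. The helper names are hypothetical.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

MIN_DRIVER = (560,)   # "560 or later"
MIN_CUDA = (12, 9)    # "12.9 or later"


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '560.35.03' into (560, 35, 3)."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(found: str, minimum: Tuple[int, ...]) -> bool:
    """Compare versions component by component using tuple ordering."""
    return parse_version(found) >= minimum


def extract_cuda_version(smi_output: str) -> Optional[str]:
    """Pull the 'CUDA Version: X.Y' value out of plain nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*([0-9]+(?:\.[0-9]+)*)", smi_output)
    return match.group(1) if match else None


def query_driver_version() -> str:
    """Ask nvidia-smi for the driver version of the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()[0].strip()


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found; is the NVIDIA driver installed?")
    else:
        driver = query_driver_version()
        banner = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
        cuda = extract_cuda_version(banner)
        print("driver", driver, "OK" if meets_minimum(driver, MIN_DRIVER) else "too old")
        if cuda is not None:
            print("CUDA", cuda, "OK" if meets_minimum(cuda, MIN_CUDA) else "too old")
```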

Hardware Requirements (Docker)

By default, the RAG Blueprint deploys the NIM microservices locally (self-hosted). You need one of the following GPU configurations:

  • 2 x H100
  • 2 x B200
  • 3 x A100 SXM
  • 2 x RTX PRO 6000

:::{tip} You can also modify the RAG Blueprint to use NVIDIA-hosted NIM microservices. :::

:::{tip} No GPU available? Try the Containerless Deployment (Lite Mode), which requires no GPU hardware and uses NVIDIA cloud APIs for all processing. :::

Hardware Requirements (Kubernetes)

To install the RAG Blueprint on Kubernetes, you need one of the following:

  • 8 x H100-80GB
  • 8 x B200
  • 9 x A100-80GB SXM
  • 8 x RTX PRO 6000
  • 3 x H100 (with Multi-Instance GPU)
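
The GPU counts for the Docker and Kubernetes targets can be encoded as a small lookup table and compared against the GPU names on a host, for example those printed by `nvidia-smi --query-gpu=name --format=csv,noheader`. This is a rough sketch with hypothetical helper names. It matches GPU families by substring only, so it does not verify memory size (80GB), form factor (SXM), or the Multi-Instance GPU option, all of which still need a manual check.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Minimum GPU counts copied from the Docker and Kubernetes lists above.
# The "3 x H100 (with Multi-Instance GPU)" option is deliberately left out.
REQUIREMENTS: Dict[str, Dict[str, int]] = {
    "docker": {"H100": 2, "B200": 2, "A100": 3, "RTX PRO 6000": 2},
    "kubernetes": {"H100": 8, "B200": 8, "A100": 9, "RTX PRO 6000": 8},
}


def count_by_family(gpu_names: Iterable[str], families: Iterable[str]) -> Counter:
    """Count GPUs per family using a case-insensitive substring match."""
    families = list(families)
    counts: Counter = Counter()
    for name in gpu_names:
        for family in families:
            if family.lower() in name.lower():
                counts[family] += 1
                break
    return counts


def satisfied_options(gpu_names: Iterable[str], target: str) -> List[str]:
    """Return the GPU families whose minimum count is met for `target`."""
    required = REQUIREMENTS[target]
    counts = count_by_family(gpu_names, required)
    return [family for family, needed in required.items() if counts[family] >= needed]


if __name__ == "__main__":
    # Illustrative host with two H100 GPUs; replace with real nvidia-smi output.
    host = ["NVIDIA H100 80GB HBM3", "NVIDIA H100 80GB HBM3"]
    for target in REQUIREMENTS:
        print(target, satisfied_options(host, target) or "requirements not met")
```

An empty result means none of the listed configurations is satisfied by count alone; a non-empty result is a necessary condition, not a guarantee.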

Hardware Requirements for Self-Hosting All NVIDIA NIM Microservices

The following are requirements and recommendations for the individual components of the RAG Blueprint:

  • Pipeline operation: 1 x L40 GPU or similar is recommended. The Milvus vector database needs this GPU because GPU acceleration is enabled by default.
  • LLM NIM (llama-3.3-nemotron-super-49b-v1.5): Refer to the Support Matrix.
  • Embedding NIM (Llama-3.2-NV-EmbedQA-1B-v2): Refer to the Support Matrix.
  • Reranking NIM (llama-3_2-nv-rerankqa-1b-v2): Refer to the Support Matrix.
  • NeMo Retriever OCR (default): Refer to the Support Matrix.
  • NVIDIA NIMs for Object Detection:

:::{tip} NeMo Retriever OCR is now the default OCR service. To use the legacy Paddle OCR instead, see the OCR Configuration Guide. :::

Related Topics