
Get Started with NVIDIA AI Workbench

The AI-Q NVIDIA Research Assistant Blueprint lets you build a deep research assistant that runs on-premise, generating detailed research reports from your on-premise data and web search.

Note: This Blueprint runs in NVIDIA AI Workbench, a free, lightweight developer platform that you can run on your own systems to get complex AI applications and workloads up and running quickly.

You may want to fork this repository into your own account before proceeding. Otherwise you won't be able to push any changes you make, because this NVIDIA-owned repository is read-only.

Navigating the README: Project Overview | Get Started | Customize | License

Other Resources: ⬇️ Download AI Workbench | 📖 User Guide | 📂 Other Projects | 🚨 Developer Forum

Project Overview

The main research agent is written in LangGraph and managed using NVIDIA NeMo Agent Toolkit. The research agent provides a unique deep research capability with these features:

  • Deep Research: Given a report topic and desired report structure, an agent will do the following:

    1. Create a report plan
    2. Search data sources for answers
    3. Write a report
    4. Reflect on gaps in the report for further queries
    5. Complete a report with a list of sources
  • Parallel Search: During the research phase, multiple research questions are searched in parallel. For each query, the RAG service is consulted first and an LLM-as-a-judge checks the relevancy of the results; if more information is needed, a fallback web search is performed. This approach gives internal documents preference over generic web results while maintaining accuracy, and running queries in parallel lets many data sources be consulted efficiently.

  • Human-in-the-loop: Human feedback on the report plan, interactive report edits, and Q&A with the final report.

  • Data Sources: Integration with the NVIDIA RAG blueprint to search multimodal documents with text, charts, and tables. Optional web search through Tavily.

  • Demo Web Application: Frontend web application showcasing end-to-end use of the AI-Q Research Assistant.
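The parallel search with LLM-as-a-judge fallback described above can be sketched as follows. This is a minimal illustration, not the Blueprint's actual implementation: `rag_search`, `judge_relevance`, and `web_search` are hypothetical stand-ins for the RAG service, the judge LLM, and the Tavily web search.

```python
import asyncio

# Hypothetical stand-ins for the real RAG, judge-LLM, and Tavily calls.
async def rag_search(query: str) -> str:
    return f"RAG results for: {query}"

async def judge_relevance(query: str, results: str) -> bool:
    # An LLM-as-a-judge would score relevancy here; this stub accepts
    # any non-empty result set.
    return bool(results)

async def web_search(query: str) -> str:
    return f"Web results for: {query}"

async def answer_query(query: str) -> str:
    # Consult internal documents first so they take precedence over the web.
    results = await rag_search(query)
    if await judge_relevance(query, results):
        return results
    # Fall back to web search only when RAG results are judged insufficient.
    return await web_search(query)

async def research(queries: list[str]) -> list[str]:
    # Issue all research questions concurrently.
    return await asyncio.gather(*(answer_query(q) for q in queries))

if __name__ == "__main__":
    for answer in asyncio.run(research(["topic A", "topic B"])):
        print(answer)
```

The key design point is that the fallback decision is made per query, so one question can be answered from internal documents while another falls through to the web, all within the same parallel batch.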

Read More

Start Using the Deep Research Agent With NVIDIA AI Workbench

Ensure you meet the prerequisites for this Blueprint (details).

Before you begin, run the RAG Blueprint. This deep researcher will wrap around your RAG pipeline.

  1. Open NVIDIA AI Workbench. Select a Location to work in.

  2. Clone the project using the repository URL: https://github.com/NVIDIA-AI-Blueprints/aiq-research-assistant.

  3. On the Project Dashboard, resolve the yellow unconfigured secrets warning:

    • NVIDIA_API_KEY: NVIDIA API Key generated from build.nvidia.com or NGC (starts with "nvapi-")
    • TAVILY_API_KEY: Tavily API Key for web search tool calling (starts with "tvly-")
    • RAG_INGEST_URL: Accessible location of the running RAG Blueprint, e.g. 10.123.45.678
  4. On the Project Dashboard, select the aira-no-gpu compose profile from the dropdown under the Compose section.

    • Alternatively, select the aira-gpu option to start an additional LLM NIM for report generation; 2x GPUs required.
    • (Optional) Select load-default-files to upload the default documents into the database.
  5. On the Project Dashboard, select Start under the Compose section. The compose services may take several minutes to pull and build.

  6. When the compose services are ready, you can access the frontend at the host's IP address on port 3000, e.g. http://<ip_addr>:3000.

    • If you see ERR_CONNECTION_REFUSED, try restarting the frontend container: Environment > Compose > aira-frontend > restart.
  7. You can now interact with the deep research agent through its browser interface.
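If the frontend does not come up, a quick reachability check can narrow down whether the service is listening at all. The helper below is an illustrative sketch, not part of the Blueprint; port 3000 comes from the step above, and any HTTP response (even an error status) counts as reachable.

```python
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

def frontend_url(ip_addr: str, port: int = 3000) -> str:
    # Compose the frontend URL from the host IP, as in the step above.
    return f"http://{ip_addr}:{port}"

def is_reachable(url: str, timeout: float = 5.0) -> bool:
    # Return True if anything answers HTTP on that address and port.
    try:
        with urlopen(url, timeout=timeout):
            return True
    except HTTPError:
        # The server answered with an error status, so it is reachable.
        return True
    except URLError:
        # Connection refused, timeout, DNS failure, etc.
        return False

if __name__ == "__main__":
    url = frontend_url("<ip_addr>")
    state = "reachable" if is_reachable(url) else "not reachable"
    print(f"{url}: {state}")
```

If the check reports "not reachable", restarting the aira-frontend container as described in step 6 is a reasonable first step.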

Customization

There are many ways you can customize the blueprint, including the following:

  • Use on-prem NIMs vs. cloud-hosted models
  • Swap model names or endpoints
  • Adjust default report organization prompts

To customize this blueprint, adjust the config files under configs/ in a code editor and save your changes. Note that config.yml refers to local deployment configurations and hosted-config.yml refers to deployments using NVIDIA-hosted model endpoints.

Configuration changes take effect when you restart the compose services.
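As an illustration of what a model swap might look like, consider a fragment like the one below. The key names here are hypothetical, not taken from the actual files; open configs/config.yml and use its existing entries as the authoritative guide.

```yaml
# Illustrative fragment only; real key names in configs/config.yml may differ.
llm:
  model_name: meta/llama-3.3-70b-instruct  # swap in a different model here
  base_url: http://localhost:8000/v1       # hypothetical on-prem NIM endpoint
```

After saving a change like this, restart the compose services so the new values are picked up.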

License

This project will download and install additional third-party open source software projects. Review the license terms of these open source projects, found in License-3rd-party.txt, before use.

GOVERNING TERMS: The software and materials are governed by NVIDIA Software License Agreement and Product Specific Terms for AI Product; except as follows: (a) the models, other than the llama-3_3-nemotron-super-49b-v1_5 model, are governed by the NVIDIA Community Model License; (b) the llama-3_3-nemotron-super-49b-v1_5 model is governed by the NVIDIA Open Model License Agreement, and (c) the NeMo Retriever extraction is released under the Apache-2.0 license.

ADDITIONAL INFORMATION: For NVIDIA Retrieval QA Llama 3.2 1B Reranking v2 model, NeMo Retriever Graphic Elements v1 model, and NVIDIA Retrieval QA Llama 3.2 1B Embedding v2: Llama 3.2 Community License Agreement, Built with Llama. For Llama-3.3-70b-Instruct model, Llama 3.3 Community License Agreement, Built with Llama.