DugBot is an AI-powered research assistant that helps researchers discover and understand biomedical studies through natural language queries. Built on the LangGraph and LangChain frameworks, it searches NHLBI BioData Catalyst research studies using both vector-based similarity search and knowledge graph traversal.
DugBot uses Retrieval-Augmented Generation (RAG) to give accurate, grounded responses about biomedical studies, variables, and their relationships. The system processes 200+ studies from the BioData Catalyst database and lets researchers find relevant information through conversational queries.
- Intent Agent: Analyzes user queries and extracts intent/preferences
- Supervisor Agent: Routes queries to appropriate lookup mechanisms
- KG Lookup Agent: Retrieves information through knowledge graph traversal
- QV Lookup Agent: Performs vector-based similarity search
- Vector-Based Retrieval: Semantic similarity search over pre-generated questions
- Knowledge Graph Retrieval: Concept-based graph traversal for entity relationships
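The division of labor between the agents listed above can be sketched in a few lines of plain Python. This is only an illustration: the `Intent` class, the keyword heuristic, and every function name here are hypothetical stand-ins, and the actual agents in `src/agents/` are built on LangGraph and presumably rely on an LLM rather than keyword matching.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Intent:
    """Hypothetical result of intent analysis (not the project's real model)."""
    query: str
    wants_relationships: bool


def intent_agent(query: str) -> Intent:
    # Stand-in heuristic: flag queries that ask how entities relate to each other.
    relationship_terms = ("related to", "relationship", "associated with", "linked to")
    lowered = query.lower()
    return Intent(query, any(term in lowered for term in relationship_terms))


def supervisor(intent: Intent) -> str:
    # Route relationship questions to the knowledge graph, everything else to vector search.
    return "kg" if intent.wants_relationships else "qv"


def kg_lookup(intent: Intent) -> str:
    # Placeholder for knowledge graph traversal.
    return f"[KG] graph traversal for: {intent.query}"


def qv_lookup(intent: Intent) -> str:
    # Placeholder for vector-based similarity search.
    return f"[QV] vector search for: {intent.query}"


LOOKUPS: Dict[str, Callable[[Intent], str]] = {"kg": kg_lookup, "qv": qv_lookup}


def answer(query: str) -> str:
    intent = intent_agent(query)
    return LOOKUPS[supervisor(intent)](intent)
```

Calling `answer("Which variables are associated with asthma?")` takes the KG path in this sketch, while `answer("Find studies about sleep apnea")` takes the QV path.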

- Python 3.8+
- Docker (optional, for containerized deployment)
- Ollama (for local LLM inference)
- Qdrant (vector database)
- Redis (knowledge graph storage)
The system provides multiple server endpoints for different use cases:
- Agent Server (Multi-agent routing)
python src/agent_server.py
# Serves on port 8099
- Knowledge Graph Server (KG-only queries)
python src/kg_app_server.py
# Serves on port 8094
- Combined QVKG Server (Vector + KG)
python src/qvkg_app_server.py
# Serves on port 8005
- POST /agent/invoke: Full agent routing with intent analysis
- POST /agent/stream: Streaming responses
- GET /agent/score/{trace_id}/{score}: Feedback scoring
- POST /kg-app/invoke: Knowledge graph-based queries
- POST /kg-app/stream: Streaming KG responses
- POST /qvkg-app/invoke: Combined vector + KG queries
- POST /qvkg-app/stream: Streaming combined responses
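As a rough sketch of how a client might call the agent endpoint, the snippet below builds a `POST /agent/invoke` request with the standard library. The host (`localhost`) and the JSON body shape (`{"input": ...}`) are assumptions, not confirmed by this README; check the server code for the schema it actually expects.

```python
import json
import urllib.request

# Port 8099 is the Agent Server port given earlier; the host is an assumption.
BASE_URL = "http://localhost:8099"


def build_invoke_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /agent/invoke."""
    payload = {"input": question}  # ASSUMED body shape; verify against the server
    return urllib.request.Request(
        url=f"{base_url}/agent/invoke",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke_agent(question: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    request = build_invoke_request(question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the Agent Server running, `invoke_agent("Which studies measure blood pressure?")` would send the query. Pointing `base_url` at port 8094 or 8005 and changing the path to `/kg-app/invoke` or `/qvkg-app/invoke` should target the other servers in the same way.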
The system includes tools for generating training questions from study abstracts:
# Generate questions from study abstracts
python src/core.py
# Create embeddings for question-answer pairs
python db_builder/create_embeddings.py
# Load data into Qdrant
python db_builder/qdrant_loader.py
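To illustrate what the pipeline above prepares for, here is a minimal sketch of similarity search over pre-generated questions. It is an assumption-laden toy: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `INDEX` stands in for Qdrant, and the questions and study IDs are invented for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    counts = Counter(text.lower().replace("?", " ").split())
    return {token: float(count) for token, count in counts.items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(token, 0.0) for token, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical pre-generated questions, each tagged with the study it came from.
QUESTIONS: List[Tuple[str, str]] = [
    ("Which studies measure blood pressure?", "study-A"),
    ("Which studies collect sleep data?", "study-B"),
]

# In-memory stand-in for the vector database: (vector, question, study_id).
INDEX = [(embed(question), question, study_id) for question, study_id in QUESTIONS]


def search(query: str, top_k: int = 1) -> List[Tuple[str, str, float]]:
    """Return the top_k stored questions most similar to the query."""
    query_vector = embed(query)
    scored = sorted(
        ((cosine(query_vector, vector), question, study_id)
         for vector, question, study_id in INDEX),
        reverse=True,
    )
    return [(question, study_id, score) for score, question, study_id in scored[:top_k]]
```

A user query is matched against the stored questions rather than against raw abstracts, so the closest question points back to the study it was generated from.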
Koios-develop/
├── src/
│ ├── agents/ # Multi-agent implementations
│ │ ├── intent_agent_graph.py
│ │ ├── route_agentic_graph.py
│ │ ├── combined_context_graph.py
│ │ └── supervisor.py
│ ├── chains/ # Processing chains
│ │ ├── kg_chain.py
│ │ ├── question_lookup_chain.py
│ │ ├── qvkg_chain.py
│ │ └── user_intent_chain.py
│ ├── models/ # Data models
│ │ ├── agent_state.py
│ │ └── user_question.py
│ ├── databases/ # Database connectors
│ │ ├── qdrant.py
│ │ └── redis_graph.py
│ ├── guardrails/ # Input validation
│ │ └── input_guard.py
│ └── util/ # Utility functions
├── db_builder/ # Database setup tools
├── prompts/ # Prompt templates
├── ragas_benchmark/ # Evaluation framework
└── requirements.txt
This project is licensed under the MIT License.