ConvoAI is a modern conversational AI platform featuring local LLM inference (Ollama or Transformers), Retrieval Augmented Generation (RAG), and a premium React chat interface with project-based provider selection.
- Multiple LLM Providers: Choose between local Ollama models or Transformers for inference
- Project-Based Configuration: Per-project settings for providers and models
- Retrieval Augmented Generation (RAG): Automatically retrieves relevant context from your local knowledge base for more accurate and context-aware answers
- Conversation History: Maintains session-based conversation history for contextual interactions
- Real-time Streaming: Smooth, ChatGPT-like typing experience with token-by-token streaming
- Premium Chat UI: Modern, responsive design with project/provider selection dropdowns
- Frontend: React with real-time streaming and project selector
- Backend: FastAPI with RAG service, local LLM integration, and Ollama provider
- LLM: Local Transformers model or Ollama (qwen2.5:3b, etc.)
- Embeddings: Sentence transformers for local embeddings
- POST /api/chat: Main RAG chat endpoint (see the example request after this list)
  - Body: { sessionId: string, message: string, project_id?: string, provider?: string, model?: string }
  - Response: { reply: string, sources?: [{ id: string, preview: string }] }
- POST /api/chat/stream: Streaming endpoint for Ollama responses (Server-Sent Events)
- GET /api/health: Checks API status
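As a minimal sketch of calling the non-streaming endpoint, the snippet below posts a message and prints the reply along with any retrieved sources. It assumes the backend is running locally on port 8000 and follows the request/response shapes listed above; the sessionId, provider, and model values are purely illustrative.

```python
# Minimal sketch: call POST /api/chat and print the reply plus any RAG sources.
# Assumes a local backend on port 8000; field names follow the shapes above.
import requests

resp = requests.post(
    "http://localhost:8000/api/chat",
    json={
        "sessionId": "demo-session",      # illustrative session id
        "message": "What does ConvoAI use for embeddings?",
        "provider": "ollama",             # assumed provider name, optional
        "model": "qwen2.5:3b",            # optional per-request model override
    },
    timeout=60,
)
resp.raise_for_status()
data = resp.json()

print(data["reply"])
for source in data.get("sources", []):
    print(f'[{source["id"]}] {source["preview"]}')
```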
- Node.js (v18+)
- Python (v3.9+)
- Ollama (optional, for Ollama provider)
```bash
git clone <repository-url>
cd convoai
```

Create a .env file in the root directory:
```
# Ollama Configuration
OLLAMA_BASE_URL=http://127.0.0.1:11434
OLLAMA_MODEL=qwen2.5:3b
OLLAMA_TIMEOUT=60

# RAG Configuration
ENABLE_RAG=0  # Set to 1 to enable RAG functionality

# Kafka Configuration (disabled by default)
ENABLE_KAFKA=false

# Frontend Configuration
REACT_APP_BACKEND_URL=http://localhost:8000
```

Place any markdown (.md) or text (.txt) files in the /knowledge directory. They will be automatically indexed on backend startup when RAG is enabled.
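To illustrate what that indexing step does conceptually, here is a minimal sketch of embedding the /knowledge files with sentence-transformers and retrieving the documents closest to a query. The model name (all-MiniLM-L6-v2), whole-file chunking, and top-k value are illustrative assumptions, not the backend's actual implementation.

```python
# Illustrative RAG sketch, not the backend's actual code: embed the files in
# /knowledge and retrieve the documents most similar to a question.
from pathlib import Path

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

# Index: read every .md/.txt file and embed each file whole
# (the real service may chunk documents more finely).
paths = list(Path("knowledge").glob("*.md")) + list(Path("knowledge").glob("*.txt"))
docs = [p.read_text(encoding="utf-8") for p in paths]
doc_embeddings = model.encode(docs, convert_to_tensor=True)

# Retrieve: embed the question and take the two most similar documents,
# which would then be prepended to the LLM prompt as context.
query_embedding = model.encode("How do I enable RAG?", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, doc_embeddings, top_k=2)[0]
for hit in hits:
    print(f'{hit["score"]:.3f}', paths[hit["corpus_id"]].name)
```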
- Install Ollama from ollama.ai
- Pull a model:
  ```bash
  ollama pull qwen2.5:3b
  ```
- Start the Ollama server (a quick connectivity check follows this list):
  ```bash
  ollama serve
  ```
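Before starting the backend, you can confirm that Ollama is reachable and the model has been pulled. The snippet below queries Ollama's /api/tags endpoint at the OLLAMA_BASE_URL configured above.

```python
# Check that the local Ollama server is up and qwen2.5:3b is available.
import requests

resp = requests.get("http://127.0.0.1:11434/api/tags", timeout=5)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]

print("Available models:", models)
if not any(name.startswith("qwen2.5:3b") for name in models):
    print("qwen2.5:3b not found - run: ollama pull qwen2.5:3b")
```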
- Install backend dependencies:
  ```bash
  cd backend
  pip install -r requirements.txt
  ```
- Start the backend:
  ```bash
  uvicorn app.main:app --reload --port 8000
  ```
- In a new terminal, install and start the frontend:
  ```bash
  cd frontend
  npm install
  npm start
  ```
The application will be available at http://localhost:3000.
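Once both servers are up, a quick way to verify the end-to-end wiring is to hit the health endpoint and then stream a reply. The SSE handling below is a hedged sketch: it assumes the stream endpoint accepts the same body as /api/chat and emits data: lines carrying text chunks, which may differ from the backend's actual event format.

```python
# Smoke test: check GET /api/health, then stream a reply from /api/chat/stream.
# The stream request body and SSE payload format are assumptions, not confirmed API details.
import requests

BACKEND = "http://localhost:8000"

health = requests.get(f"{BACKEND}/api/health", timeout=5)
print("health:", health.status_code, health.text)

with requests.post(
    f"{BACKEND}/api/chat/stream",
    json={"sessionId": "smoke-test", "message": "Say hello in one sentence."},
    stream=True,
    timeout=60,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if line and line.startswith("data: "):
            print(line[len("data: "):], end="", flush=True)
print()
```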
The UI includes dropdowns to select:
- Project: Choose between different project configurations
- Provider: Switch between "Local Ollama" and "Local LLM"
- Model: Specify the model name
ConvoAI can be deployed to:
- Frontend: Vercel or Netlify for the React build
- Backend: Render or Fly.io for the FastAPI application
- Containerized: Docker, using the provided configurations
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
See CHANGELOG.md for a full list of changes and updates.
- Built with React and FastAPI
- Powered by Ollama for local LLM inference
- Uses Transformers for local model support