A multilingual (Hindi + English) AI assistant designed for rural development governance, enabling Gram Panchayats to access scheme information, analyze village deficits, retrieve government documents, and generate actionable development insights using FastAPI, Qdrant Vector DB, Neo4j Knowledge Graph, LLMs, and a modern React + Tailwind frontend.
Demo Video: https://youtu.be/SvaaQusU9nU?si=S9WfEV2PBO07HTdO
- Overview
- Features
- Folder Structure
- How to Run Locally
- Architecture & Design Decisions
- Data Sources & Preprocessing
- RAG + Knowledge Graph Pipeline
- LLM Reasoning Flow
- Challenges & Trade-Offs
Panchayat-Sahayika is an AI-powered digital assistant designed specifically for rural governance in India, helping Gram Panchayat officials, government workers, and citizens to:
- Understand development deficits
- Discover government schemes
- Ask questions about rural indicators
- Receive data-driven recommendations
- Access extended insights using RAG + LLM reasoning
The system integrates:
- FastAPI backend
- Qdrant vector embeddings for semantic search
- LLMs (Groq / OpenAI) for query interpretation
- CSV datasets of Uttarakhand village-level deficits
- React + Tailwind frontend for clean interaction
Real-world application: We tested the prototype in Pawo Malla village (Uttarakhand) with the Gram Pradhan, gathering real on-ground feedback about water issues, connectivity problems, and village priorities.
Ask in Hindi or English → system automatically interprets meaning.
System analyzes infrastructure deficits (roads, water, health, education).
Users can ask:
“Hamare gaon ke liye paani yojana kaun si hai?”
Retrieves relevant paragraphs, government documents, schemes, and datasets.
Graph nodes:
- Development Themes
- Government Schemes
- Panchayat Needs
- Infrastructure deficits
Relationships enrich LLM outputs.
LLM synthesizes scheme suggestions, cluster insights, and action strategies.
Built with React + Tailwind for lightweight rural-friendly UX.
Panchayat-Sahayika/
│
├── backend/
│ ├── data/ # Scheme texts, rural indicators, reference docs
│ ├── qdrant_data/ # Preprocessed embeddings or vector payloads
│ ├── services/ # RAG, embeddings, KG, query handlers
│ ├── utils/ # Helper functions, preprocessors
│ ├── FinderScreen.py # Infra deficit inference logic
│ ├── gram.py # Panchayat-specific retrieval logic
│ ├── app.py # FastAPI entry
│ ├── main.py # API routing + server startup
│ ├── requirements.txt
│ └── uttarakhand_infra_deficits.csv
│
├── public/
│
├── src/
│ ├── components/ # Chat UI, cards, loaders
│ ├── pages/ # Main dashboard, chat page
│ ├── styles/ # Tailwind configs
│ └── utils/ # Frontend helpers
│
├── index.html
├── package.json
├── tailwind.config.js
└── README.md
cd backend
pip install -r requirements.txt
Create .env:
# --- Qdrant Vector DB ---
QDRANT_URL=your_qdrant_url
QDRANT_API_KEY=your_qdrant_key
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
# --- LLM Keys ---
GROQ_API_KEY=your_groq_api
OPENAI_API_KEY=your_openai_api
# --- (Optional) Model Settings ---
MODEL_NAME=llama-3-8b
TEMPERATURE=0.2
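As a rough illustration of how the backend might consume these variables, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, not taken from the repository, and the sketch assumes the values are already present in the process environment (for example, exported in the shell or loaded by a dotenv helper).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in .env."""
    qdrant_url: str
    qdrant_api_key: str
    embedding_model: str
    model_name: str
    temperature: float


def load_settings(env=os.environ) -> Settings:
    """Read required keys strictly and fall back to defaults for optional ones."""
    return Settings(
        qdrant_url=env["QDRANT_URL"],          # required: raises KeyError if missing
        qdrant_api_key=env["QDRANT_API_KEY"],  # required: raises KeyError if missing
        embedding_model=env.get(
            "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
        ),
        model_name=env.get("MODEL_NAME", "llama-3-8b"),
        temperature=float(env.get("TEMPERATURE", "0.2")),
    )
```

Failing fast on a missing Qdrant key tends to be easier to debug than a connection error later in a request.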
uvicorn main:app --reload
cd frontend
npm install
npm run dev
Set .env:
VITE_API_URL=http://127.0.0.1:8000
App opens at:
http://localhost:5173
FastAPI backend: lightweight, modular, async-ready.
Qdrant stores embeddings for:
- schemes
- documents
- development indicators
The Neo4j knowledge graph captures structured relationships between:
- Themes
- Schemes
- Infrastructure deficits
- Gram Panchayat needs
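To make the idea of relationship expansion concrete, here is a small in-memory sketch. It does not use Neo4j; the edge list, relationship names, and the `expand` helper are all hypothetical stand-ins for whatever the real graph contains.

```python
from collections import defaultdict

# Hypothetical edges: (source node, relationship, target node).
EDGES = [
    ("water_supply_deficit", "BELONGS_TO", "Drinking Water"),
    ("Drinking Water", "ADDRESSED_BY", "Jal Jeevan Mission"),
    ("road_access_deficit", "BELONGS_TO", "Rural Connectivity"),
]


def build_graph(edges):
    """Index outgoing edges by source node."""
    graph = defaultdict(list)
    for src, rel, dst in edges:
        graph[src].append((rel, dst))
    return graph


def expand(graph, start, max_hops=2):
    """Breadth-first walk from `start`; return nodes reached within max_hops."""
    seen, frontier = [], [start]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for _rel, dst in graph.get(node, []):
                if dst not in seen and dst != start:
                    seen.append(dst)
                    next_frontier.append(dst)
        frontier = next_frontier
    return seen


print(expand(build_graph(EDGES), "water_supply_deficit"))
# ['Drinking Water', 'Jal Jeevan Mission']
```

Starting from a deficit node, two hops are enough here to reach a theme and then a scheme, which is the kind of context that can be handed to the LLM.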
Uses uttarakhand_infra_deficits.csv to compute village-level gaps.
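A minimal sketch of what computing village-level gaps could look like is shown below. The column names, the 0/1 encoding, and the sample rows are assumptions made for illustration; the real CSV may be structured differently.

```python
import csv
import io

# Hypothetical sample; the real uttarakhand_infra_deficits.csv may use other columns.
SAMPLE_CSV = """village,water_supply,road_access,health_centre
Village A,0,1,0
Village B,1,1,1
"""


def village_deficits(csv_text: str, village: str) -> list[str]:
    """Return the indicator columns whose value is 0 (missing) for one village."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = village.strip().lower()
    for row in reader:
        if row["village"].strip().lower() == wanted:
            return [
                column
                for column, value in row.items()
                if column != "village" and value.strip() == "0"
            ]
    return []  # village not found


print(village_deficits(SAMPLE_CSV, "village a"))
# ['water_supply', 'health_centre']
```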
- Groq → fast inference
- OpenAI → fallback + improved quality
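The primary-then-fallback arrangement can be sketched without committing to either SDK. In the example below the providers are plain callables; `generate_with_fallback` and the two stub functions are hypothetical and only stand in for real Groq and OpenAI client calls.

```python
def generate_with_fallback(prompt, providers):
    """Try each (name, callable) in order and return the first successful reply."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider failure triggers the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All LLM providers failed: " + "; ".join(errors))


# Stubs standing in for real client calls (assumptions for illustration only).
def primary_stub(prompt):
    raise TimeoutError("primary provider timed out")


def fallback_stub(prompt):
    return f"answer to: {prompt}"


print(generate_with_fallback("paani yojana?", [("groq", primary_stub), ("openai", fallback_stub)]))
# ('openai', 'answer to: paani yojana?')
```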
flowchart TD
A[User Query<br>Hindi/English] --> B[Frontend Chat UI]
B --> C[FastAPI Backend]
C --> D[Preprocessing<br>Language Detection, Normalization]
D --> E[Qdrant Semantic Search]
D --> F[Neo4j Knowledge Graph]
D --> G[Uttarakhand Infra Deficit CSV]
E --> H[RAG Context Builder]
F --> H
G --> H
H --> I[LLM Reasoning Layer<br>Groq / OpenAI]
I --> B
File: uttarakhand_infra_deficits.csv
Contains metrics like:
- Water supply status
- Road access
- Healthcare centers
- Digital connectivity
- Education infrastructure
Stored in /backend/data/.
- Text cleaning
- Stopword handling
- Semantic chunking
- Embedding generation
- KG node + edge creation
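The cleaning and chunking steps above can be approximated as follows. This is a simple sentence-boundary chunker with a word budget, not true semantic chunking, and the function names and the 40-word default are assumptions rather than values from the project.

```python
import re


def clean_text(text: str) -> str:
    """Collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def chunk_sentences(text: str, max_words: int = 40) -> list[str]:
    """Group whole sentences into chunks of at most max_words words.

    A single sentence longer than max_words becomes its own chunk.
    """
    # Split after ., ?, ! or the Devanagari danda (।) followed by whitespace.
    sentences = re.split(r"(?<=[.?!।])\s+", clean_text(text))
    chunks, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n_words = len(sentence.split())
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


print(chunk_sentences("A b c.  D e f.\nG h i.", max_words=6))
# ['A b c. D e f.', 'G h i.']
```

Each resulting chunk would then be passed to the embedding model and stored with its source metadata.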
- Detect language (Hindi/English)
- Identify keywords (water, roads, health)
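A rule-based version of these two steps might look like the sketch below. The marker-word set, the keyword map, and the `hi-Latn` label for Hindi typed in Latin script are all assumptions for illustration; the project's actual rules are not shown in this README.

```python
import re

DEVANAGARI = re.compile(r"[\u0900-\u097F]")

# Hypothetical marker words for Hindi typed in Latin script.
ROMAN_HINDI_MARKERS = {"hai", "ke", "ki", "ka", "gaon", "hamare", "kya", "kaun", "liye", "me"}

# Hypothetical keyword map: query word -> deficit theme.
THEME_KEYWORDS = {
    "paani": "water", "pani": "water", "water": "water",
    "sadak": "roads", "road": "roads", "roads": "roads",
    "swasthya": "health", "health": "health",
}


def tokenize(query: str) -> list[str]:
    """Lowercase the query and keep Latin-letter tokens only."""
    return re.findall(r"[a-z]+", query.lower())


def detect_language(query: str) -> str:
    """Return 'hi' for Devanagari, 'hi-Latn' for romanized Hindi, else 'en'."""
    if DEVANAGARI.search(query):
        return "hi"
    hits = sum(token in ROMAN_HINDI_MARKERS for token in tokenize(query))
    return "hi-Latn" if hits >= 2 else "en"


def extract_themes(query: str) -> list[str]:
    """Map known keywords to themes, preserving first-seen order."""
    themes = []
    for token in tokenize(query):
        theme = THEME_KEYWORDS.get(token)
        if theme and theme not in themes:
            themes.append(theme)
    return themes


query = "Hamare gaon ke liye paani yojana kaun si hai?"
print(detect_language(query), extract_themes(query))
# hi-Latn ['water']
```

Note that this keyword map only covers Latin-script tokens; Devanagari queries would need their own entries or rely on the embedding search.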
Qdrant semantic search returns the top N relevant chunks.
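Conceptually, top-N retrieval ranks stored chunks by vector similarity to the query embedding. The sketch below shows that ranking in plain Python with toy 2-dimensional vectors; it is a stand-in for what the vector database does internally, not the project's retrieval code.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n(query_vec, chunks, n=3):
    """chunks is a list of (text, vector); return the n most similar texts."""
    ranked = sorted(chunks, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _vec in ranked[:n]]


# Toy vectors for illustration; real embeddings have hundreds of dimensions.
CHUNKS = [
    ("water scheme", [1.0, 0.0]),
    ("road scheme", [0.0, 1.0]),
    ("mixed", [1.0, 1.0]),
]
print(top_n([1.0, 0.1], CHUNKS, n=2))
# ['water scheme', 'mixed']
```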
Village deficits are inferred via FinderScreen.py.
LLM merges:
- semantic context
- graph knowledge
- village deficit data
To generate a final actionable recommendation.
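One plausible way to merge the three sources is to assemble them into a single prompt. The `build_prompt` function, its section titles, and the instruction wording below are hypothetical; they only illustrate the merging step described above.

```python
def build_prompt(question, chunks, graph_facts, deficits):
    """Combine retrieved text, graph facts, and deficit data into one prompt."""
    sections = [
        ("Retrieved passages", chunks),
        ("Knowledge graph facts", graph_facts),
        ("Village deficits", deficits),
    ]
    lines = ["Answer using only the context below. Reply in the user's language.", ""]
    for title, items in sections:
        if not items:
            continue  # omit empty sections so the prompt stays compact
        lines.append(f"{title}:")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


print(build_prompt(
    "Hamare gaon me paani ki dikkat hai. Kya sujhav hai?",
    chunks=["Jal Jeevan Mission aims to provide household tap connections."],
    graph_facts=["Drinking Water ADDRESSED_BY Jal Jeevan Mission"],
    deficits=["water_supply"],
))
```

Keeping each source under its own heading makes it easier to trace which piece of context a recommendation came from.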
User Query → Parse Intent → Retrieve Relevant Schemes → Fetch Village Deficits
→ Expand using Knowledge Graph → LLM synthesis → Final Recommendation
Example:
“Hamare gaon me paani ki dikkat hai. Kya sujhav hai?”
LLM Output includes:
- identified deficits from CSV
- related schemes like Jal Jeevan Mission
- local insights
- actionable steps
Solution: Hand-curated CSV for Uttarakhand.
Trade-off: Store compact MiniLM embeddings.
Trade-off: Skip KG rebuild on every startup to save memory.
Trade-off: Simple rules + embeddings instead of a dedicated NLU model.
Solution: fuzzy matching + manual correction list.
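As an illustration of combining a manual correction list with fuzzy matching, here is a sketch built on Python's standard `difflib`. It assumes the matching is applied to village names, which this README does not state explicitly; the correction entry, the name list, and the 0.8 cutoff are likewise hypothetical.

```python
import difflib

# Hypothetical manual corrections: normalized misspelling -> canonical name.
CORRECTIONS = {"pavo malla": "Pawo Malla"}

# Hypothetical list of canonical names.
KNOWN_NAMES = ["Pawo Malla", "Sample Gaon"]


def resolve_name(raw, known=KNOWN_NAMES, corrections=CORRECTIONS, cutoff=0.8):
    """Return the canonical name for `raw`, or None if nothing is close enough."""
    key = " ".join(raw.lower().split())  # lowercase and collapse whitespace
    if key in corrections:               # manual list takes priority
        return corrections[key]
    lowered = {name.lower(): name for name in known}
    matches = difflib.get_close_matches(key, list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


print(resolve_name("Pawo Mala"))    # Pawo Malla
print(resolve_name("pavo  malla"))  # Pawo Malla
print(resolve_name("xyz"))          # None
```

Checking the manual list first lets known misspellings bypass the similarity threshold entirely.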