RentRadar

RentRadar is an AI-assisted rental inspection and decision-support system. It combines live scan guidance, image analysis, location intelligence, pre-lease advice, multi-property comparison, approximate 3D room views, exportable reports, and an autonomous server-ops agent in one monorepo.

Quick Start — Try it now: https://170-64-160-8.sslip.io/

System Architecture

flowchart TB
    subgraph Client["🖥️ Frontend (Next.js 16 + React 19)"]
        Camera["📷 Camera Stream"]
        Upload["📤 Manual Upload"]
        UI["🎨 UI Components<br/>(Tailwind v4 + shadcn/ui)"]
        Store["💾 IndexedDB + Zustand"]
    end

    subgraph API["⚙️ API Layer (Next.js Route Handlers)"]
        direction TB
        RateLimit["🚦 Rate Limiter<br/>(per-endpoint sliding window)"]
        
        subgraph Agents["🤖 Multi-Agent System"]
            Vision["👁️ Vision Agent<br/>(hazard detection)"]
            Geo["🌍 Geo Analyzer<br/>(Maps + Geocoding)"]
            Community["🏘️ Community<br/>Research Agent"]
            Agency["🏢 Agency<br/>Background Agent"]
            MapsIntel["🗺️ Maps-Grounded<br/>Intelligence"]
        end

        subgraph SmartGW["🔀 Smart Gateway"]
            Flash["⚡ Gemini 2.5 Flash"]
            Pro["🧠 Gemini 2.5 Pro"]
            Flash -->|"_escalateToPro: true"| Pro
        end

        subgraph RAG["📚 Knowledge Base RAG"]
            Chunk["📄 420-char Chunking"]
            Embed["🔢 Cohere Embed v3"]
            Qdrant["🗄️ Qdrant Vector DB"]
            Rerank["📊 Cohere Rerank"]
            Chunk --> Embed --> Qdrant --> Rerank
        end

        TTS["🔊 MiniMax TTS"]
        ThreeD["🏠 3D Reconstruction"]
    end

    subgraph External["☁️ External Services"]
        Gemini["Google Gemini API"]
        Maps["Google Maps Platform"]
        Cohere["Cohere API"]
        MiniMax["MiniMax API"]
        DOSpaces["DigitalOcean Spaces"]
    end

    subgraph Ops["🤖 Autonomous Ops Agent (Python)"]
        Monitor["📊 System Monitor"]
        Diagnose["🔍 Diagnose & Classify"]
        Heal["🔧 Self-Heal"]
        Learn["📝 Learn to ES"]
        Monitor --> Diagnose --> Heal --> Learn
    end

    Camera --> Vision
    Upload --> Vision
    UI --> RateLimit
    RateLimit --> Agents
    RateLimit --> SmartGW
    RateLimit --> RAG
    RateLimit --> TTS
    RateLimit --> ThreeD
    
    Vision --> SmartGW
    Geo --> Maps
    Community --> SmartGW
    Agency --> SmartGW
    MapsIntel --> Maps
    MapsIntel --> SmartGW
    RAG --> Cohere
    TTS --> MiniMax
    SmartGW --> Gemini
    ThreeD --> SmartGW
    
    Agents -->|"Promise.allSettled"| UI
    UI --> Store

1. Features

Live Inspection — real-time camera scanning with AI-guided targets, hazard re-inspection, and MiniMax voice alerts
Manual Upload — batch photo analysis with automatic hazard detection
Report Center — risk scoring, geo/community/agency intelligence, evidence summary, pre-lease advice, 3D room view
Multi-Property Compare — weighted scoring across budget, commute, noise, lighting, condition, agency, and community
History — local IndexedDB persistence of past searches and comparisons
Knowledge Base — RAG-enhanced rental advice powered by Cohere + Qdrant
Smart Gateway — dynamic model routing in which Gemini Flash escalates tasks it judges too complex (for example, multi-step proofs or deep logical reasoning) to Gemini Pro, transparently to the caller
Autonomous Server Ops Agent — a continuously running AI workflow that monitors, diagnoses, and self-heals the production server (see Section 12)

2. Tech Stack

Frontend

Next.js 16.1.6, React 19, TypeScript
Tailwind CSS v4, shadcn/ui, Framer Motion
Zustand, IndexedDB (idb), Recharts
@vis.gl/react-google-maps, Three.js
html2canvas + jsPDF

Backend / Server

Next.js Route Handlers, Zod
@google/genai (Gemini 2.5 Flash / Pro)
Jimp, DigitalOcean Spaces (S3-compatible presigned upload)
MiniMax TTS
Google Maps Platform (Geocoding, Places, Routes, Static Maps, Maps JS)

Shared Packages

packages/contracts — Zod schemas, shared types
packages/ui — shared UI components

Ops Agent (Python)

Python 3.10+, Elasticsearch 8, OpenAI-compatible LLM
YAML workflow engine, systemd / Docker deployment

3. Repository Structure

Inspect/
├── apps/
│   ├── web/                    # Frontend (user-facing UI)
│   └── api/                    # API (server-side routes)
├── packages/
│   ├── contracts/              # Shared schemas / types
│   └── ui/                     # Shared UI components
├── agentic-workflow/           # Autonomous server ops agent (Python)
│   ├── src/agentic_workflow_agent/
│   ├── workflows/              # YAML workflow definitions
│   ├── deploy/                 # systemd + VM setup scripts
│   ├── Dockerfile
│   └── docker-compose.yml
├── tests/                      # Vitest / Playwright
├── package.json                # Monorepo root scripts
├── pnpm-workspace.yaml
└── README.md

4. Pages

Route	Description
`/`	Home page — Live / Manual entry points
`/radar`	Live scan preparation and status
`/scan`	Camera scanning, guided re-inspection, 3D Scan Studio
`/manual`	Photo upload and analysis
`/report/[id]`	Inspection report
`/compare`	Multi-property comparison entry
`/compare/[id]`	Comparison report details
`/history`	Search and comparison history

5. API Endpoints

Method	Path	Description
GET	`/api/health`	Health check
POST	`/api/upload/sign`	Presigned upload URL
POST	`/api/storage/object`	Object storage
POST	`/api/analyze`	Image analysis
POST	`/api/analyze/live`	Live frame analysis
POST	`/api/intelligence`	Location intelligence
POST	`/api/negotiate`	Lease negotiation advice
POST	`/api/knowledge/query`	Knowledge base RAG query
POST	`/api/compare`	Multi-property comparison
POST	`/api/geocode/reverse`	Reverse geocoding
POST	`/api/checklist/prefill`	Checklist auto-fill
POST	`/api/listing/discover`	Listing discovery
POST	`/api/listing/extract`	Listing extraction
POST	`/api/maps/static`	Static map generation
POST	`/api/assets/sign-get`	Asset access signing
POST	`/api/tts/alert`	Voice alert synthesis
POST	`/api/scan/3d/reconstruct`	3D room reconstruction

6. Getting Started

Prerequisites

Node.js >= 20, pnpm >= 9
macOS / Linux / Windows

Install and Run

pnpm install
cp .env.example .env.local
pnpm dev

Or start frontend and API separately:

pnpm dev:web   # http://localhost:3000
pnpm dev:api   # http://localhost:3001

Build and Start

pnpm build
pnpm start

7. Environment Variables

Required

GEMINI_API_KEY=
GOOGLE_MAPS_API_KEY=
NEXT_PUBLIC_GOOGLE_MAPS_API_KEY=

Gemini Models

GEMINI_VISION_MODEL=gemini-2.5-flash
GEMINI_LIVE_MODEL=gemini-2.5-flash
GEMINI_SCENE_EXTRACT_MODEL=gemini-2.5-flash
GEMINI_SCENE_SYNTHESIS_MODEL=gemini-2.5-pro
GEMINI_GROUNDED_MODEL=gemini-2.5-flash
GEMINI_INTELLIGENCE_MODEL=gemini-2.5-flash-lite
GEMINI_REASONING_MODEL=gemini-2.5-pro

MiniMax TTS

MINIMAX_API_KEY=
MINIMAX_API_BASE=https://api.minimax.io
MINIMAX_TTS_MODEL=speech-2.8-hd
MINIMAX_TTS_VOICE_ID=English_expressive_narrator
MINIMAX_TTS_FORMAT=mp3

Frontend Public

NEXT_PUBLIC_API_BASE_URL=http://localhost:3001
NEXT_PUBLIC_ENABLE_DEMO_MODE=false

DigitalOcean Spaces (optional, needed for uploads)

DO_SPACES_REGION=
DO_SPACES_BUCKET=
DO_SPACES_ENDPOINT=
DO_SPACES_KEY=
DO_SPACES_SECRET=

CORS and Deployment

DEPLOY_TARGET=              # local | api | frontend
CORS_ALLOWED_ORIGINS=       # comma-separated origins

8. Core Workflows

8.1 Live Scan Workflow

Camera Start → Select Room Type → Begin Scan
                                       │
                                  Vision Engine
                                  Analysis Loop
                                       │
                    ┌──────────────────┼──────────────────┐
                    ▼                  ▼                  ▼
              AI Analysis        MiniMax TTS        Target Guidance
              /analyze/live      Voice Alerts       Visual Prompts
                    │                  │                  │
                    └──────────────────┼──────────────────┘
                                       ▼
                                 Re-inspection
                                 (high risk)
                                       ▼
                                 End Scan →
                                 Generate Report

Key components:

useCameraStream.ts — camera capture and frame extraction
useVisionEngine.ts — vision analysis engine (60 req/min rate limit)
liveGuidance.ts — guided target system (predefined sequences per room type)
liveRoomState.ts — room scan state machine

Room verdict logic: pass | caution | fail | insufficient-evidence

8.2 Report Generation Workflow

Scan Complete → Build Snapshot → Save to IndexedDB → Navigate to Report
                                                          │
                                     Progressive Enhancement Loading
                                          │         │         │         │
                                        Geo     Community   Agency   Decision
                                       /intel    /intel     /intel   /negotiate
                                          │         │         │         │
                                          └─────────┴─────────┘
                                                    │
                                             Knowledge Base
                                             /knowledge/query

Features: progressive enhancement, graceful degradation per module, normalizeReportSnapshot() for data integrity.

8.3 Intelligence Gathering (Parallel Multi-Agent)

const [geoResult, groundedResult, communityResult, agencyResult] =
  await Promise.allSettled([
    analyzeGeoContext({ address, coordinates, targetDestinations, depth }),
    summarizeMapsGroundedIntelligence({ address, coordinates, agency, depth }),
    researchCommunity({ address, coordinates, propertyNotes, depth }),
    analyzeAgencyBackground({ agency, depth }),
  ]);

Agent	Responsibility	Data Sources
`geoAnalyzer.ts`	Geography analysis	Google Maps Geocoding, Places, Routes
`searchAgent.ts`	Agency background	Tavily Search, Gemini Grounded
`communityResearchAgent.ts`	Community research	Google Search, Gemini
`mapsGroundedIntelligence.ts`	Map fusion	Google Maps + Gemini

Multi-source fusion detects conflicts (e.g., map says convenient transit but web evidence shows noise issues) and reports them with balanced perspective.
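
A minimal sketch of how the settled results above could be unpacked so that one failed agent degrades only its own module. The `unwrapSettled` helper and the `ModuleResult` shape are illustrative assumptions, not the project's actual code.

```typescript
// Hypothetical result envelope: either data, or the reason the module degraded.
type ModuleResult<T> =
  | { ok: true; data: T }
  | { ok: false; fallbackReason: string };

// Convert one Promise.allSettled entry into a ModuleResult.
// `label` is an illustrative module name used to build the fallback reason.
function unwrapSettled<T>(
  settled: PromiseSettledResult<T>,
  label: string,
): ModuleResult<T> {
  if (settled.status === "fulfilled") {
    return { ok: true, data: settled.value };
  }
  return { ok: false, fallbackReason: `${label}_failed` };
}
```

Each of `geoResult`, `groundedResult`, `communityResult`, and `agencyResult` could be passed through such a helper before fusion, so the report can render whichever modules succeeded.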

8.4 Knowledge Base RAG

Query → Cohere Embedding → Qdrant Vector DB → Rerank (optional) → Top-K → Gemini Generate

Document chunking: 420-char sliding window with 80-char overlap
Embedding: Cohere embed-english-v3
Vector DB: Qdrant (local or remote)
Retrieval: Dense + optional rerank
Fallback: keyword matching when RAG is unavailable
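
The chunking parameters above can be illustrated with a simplified sketch. It slices by character count only and ignores sentence boundaries, so it is an approximation of the idea rather than the project's chunker; the function name `chunkText` is an assumption.

```typescript
// Split text into fixed-size windows that overlap by `overlap` characters.
// Defaults follow the figures stated above (420 / 80).
function chunkText(text: string, size = 420, overlap = 80): string[] {
  if (overlap >= size) throw new Error("overlap must be smaller than size");
  const step = size - overlap; // with the defaults, a new window starts every 340 chars
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this window already reached the end
  }
  return chunks;
}
```

With the defaults, the last 80 characters of one chunk are repeated as the first 80 characters of the next, which keeps a sentence that straddles a boundary retrievable from at least one chunk.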

8.5 Comparison Workflow

Inputs: 2–5 candidate reports, factor weights (budget, commute, noise, lighting, condition, agency, community), preference profile.

Outputs: ranked candidates, winning reasons, trade-off analysis, knowledge base matches, document checklist.
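
As a rough illustration of weighted scoring across the seven factors, the sketch below computes a weighted average and ranks candidates. The 0 to 100 score scale, the type names, and both function names are assumptions made for this example.

```typescript
type Factor =
  | "budget" | "commute" | "noise" | "lighting"
  | "condition" | "agency" | "community";
type FactorScores = Record<Factor, number>; // assumed scale: 0..100 per factor

interface Candidate {
  id: string;
  scores: FactorScores;
}

// Weighted average of factor scores; weights need not sum to 1.
function weightedScore(scores: FactorScores, weights: FactorScores): number {
  const factors = Object.keys(weights) as Factor[];
  const totalWeight = factors.reduce((sum, f) => sum + weights[f], 0);
  if (totalWeight <= 0) throw new Error("at least one weight must be positive");
  const weighted = factors.reduce((sum, f) => sum + scores[f] * weights[f], 0);
  return weighted / totalWeight;
}

// Rank candidates from best to worst by weighted score.
function rankCandidates(candidates: Candidate[], weights: FactorScores) {
  return candidates
    .map((c) => ({ id: c.id, score: weightedScore(c.scores, weights) }))
    .sort((a, b) => b.score - a.score);
}
```

Raising a single weight (for example, budget) can reorder the ranking even when the underlying factor scores are unchanged, which is the trade-off the comparison report is meant to surface.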

8.6 3D Room Reconstruction (AI-driven, no LiDAR)

3–8 Room Photos → Per-image Analysis (Gemini) → Multi-view Fusion → Scene Synthesis (Gemini Pro)

Produces: approximate dimensions, openings (doors, windows, balconies), furniture layout, and hazard markers.

8.7 Checklist Prefill

Fields classified as remote-friendly (e.g., security.nightEntryRoute, noise.lateNight) are auto-filled from intelligence; manual-priority fields (e.g., utilities.hotWater, security.doorLocks) are flagged for on-site verification.

8.8 Listing Discovery

Discover API: address → candidate listing URLs (12 req / 2 min)
Extract API: listing URL → details (title, summary, rent, features, checklist tips; 10 req / 2 min)

9. Deployment Architecture

Hybrid Strategy

Application layer: PM2 manages Node.js processes (non-containerized)
Vector database: Docker runs Qdrant (only containerized component)
Ops agent: systemd service (Python, runs independently)

Qdrant Docker

docker run -d \
  --name qdrant \
  --restart unless-stopped \
  -p 127.0.0.1:6333:6333 \
  -v /opt/inspect-ai/qdrant_storage:/qdrant/storage \
  qdrant/qdrant:latest

VPS Architecture

┌──────────────────────────────────────────────────────────────┐
│                          VPS Server                          │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  Nginx (80/443) ──── PM2 ──── inspect-web (:3000)           │
│                         └──── inspect-api (:3001)            │
│                                     │                        │
│                               Docker Qdrant (:6333)          │
│                                                              │
│  systemd ──── agentic-workflow (Python, loop mode)           │
│               Auto health checks every hour                  │
│               Self-healing + learning to Elasticsearch       │
│                                                              │
└──────────────────────────────────────────────────────────────┘

Deployment Options

Option	Frontend	API	Vector DB	Ops Agent	Best For
A — easiest	Vercel	Vercel	None	None	Quick start, no RAG
B — balanced	Vercel	Render/Railway	Managed Qdrant	None	Medium scale
C — full control	VPS	VPS	Docker Qdrant	systemd	Full features

10. AI Architecture

Multi-Model Coordination

Task	Primary Model	Fallback	Rationale
Image analysis	Gemini 2.5 Flash	—	Fast, cheap, multimodal
Geo intelligence	Gemini + Google Maps	Web search	Grounding-enhanced
Community research	Gemini 2.5 Flash	Search grounding	Multi-pass search
Agency background	Gemini 2.5 Flash	Search grounding	Multi-pass search
Knowledge Base RAG	Cohere Embed	Cohere Rerank	Specialized embedding/ranking
Smart Gateway (Routing)	Gemini 2.5 Flash	Gemini 2.5 Pro	Flash evaluates each prompt and, when it is too difficult to answer directly, escalates the request to Pro while keeping the same strict JSON schema
Answer generation	Gemini 2.5 Flash	Local fallback	Cost/quality balance
Voice synthesis	MiniMax TTS	—	English expressive narration
Server ops	GLM-5 via apiyi.com	Retry with backoff	Tool-calling capable

Core Gateway Mechanics

The Smart Gateway uses schema wrapping instead of native tool calling, which cannot be combined with strict JSON output.

We inject an optional _escalateToPro: boolean field into the target Zod schema for the request.
The gemini-2.5-flash model judges whether the query is too complex to answer directly (e.g., it requires advanced multi-step proofs).
If so, it returns only the _escalateToPro: true flag as JSON.
Our AI interceptor reads the flag from the raw JSON and forwards the same prompt to gemini-2.5-pro (GEMINI_REASONING_MODEL), so strict type validation is never run on the partial payload.
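
The escalation steps above can be sketched as follows. This is a simplified illustration: `wantsEscalation`, `routeWithEscalation`, and the `ModelCall` signature are hypothetical names, and the model calls are injected so the sketch does not depend on any SDK.

```typescript
// Returns true when the raw model output is the escalation signal.
// Uses plain JSON.parse so a partial payload never reaches schema validation.
function wantsEscalation(rawText: string): boolean {
  try {
    const parsed: unknown = JSON.parse(rawText);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as Record<string, unknown>)._escalateToPro === true
    );
  } catch {
    return false; // not valid JSON, so it cannot be an escalation signal
  }
}

// Hypothetical model caller: prompt in, raw JSON text out.
type ModelCall = (prompt: string) => Promise<string>;

// Ask Flash first; forward the same prompt to Pro only if Flash asks for it.
async function routeWithEscalation(
  prompt: string,
  callFlash: ModelCall,
  callPro: ModelCall,
): Promise<{ text: string; model: "flash" | "pro" }> {
  const flashText = await callFlash(prompt);
  if (wantsEscalation(flashText)) {
    return { text: await callPro(prompt), model: "pro" };
  }
  return { text: flashText, model: "flash" };
}
```

Only the boolean `true` triggers escalation in this sketch; a string such as `"true"` or a missing field is treated as a normal Flash answer and goes on to regular schema validation.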

Prompt Engineering

Structured output: callGeminiJson() enforces JSON via Zod schema + responseJsonSchema
Role definition: "tenant-visible risks" bounds the analysis scope
Constraint injection: character limits, category enums, dynamic room type context

Error Handling (4-Layer Degradation)

Layer	Scope	Strategy
1 — Request	Route handler	Schema validation fail → empty result; rate limit → 429 + Retry-After
2 — Service	Agent	API timeout → fallback data; no search results → local KB
3 — Model	AI call	Gemini fail → `fallbackReason`; `withTimeout()` retry
4 — Data	Fallback builder	Generate default/prompt content; keep UI usable

Cost Optimization

Scenario	Model	Cost
Vision analysis	Gemini 2.5 Flash	$
Simple intelligence	Gemini 2.5 Flash-lite	$
Complex reasoning	Gemini 2.5 Pro	$$
Embedding	Cohere embed-english-v3	$
Reranking	Cohere rerank-v4.0-pro	$$

Caching: in-memory knowledge docs, Gemini client singleton, search result scoring and filtering.

11. Type Safety and Data Architecture

Contracts Package

// packages/contracts/src/schemas.ts
import { z } from "zod";

// Shape of a single detected hazard, shared by the web and API apps.
export const HazardSchema = z.object({
  id: z.string(),
  type: z.enum(["structural", "electrical", "plumbing", "environmental"]),
  severity: z.enum(["low", "medium", "high", "critical"]),
  description: z.string(),
  evidence: z.array(z.string()),
});

Offline-First

IndexedDB stores report snapshots
Zustand + persist for state persistence
Session recovery on page refresh

Rate Limits

/api/analyze/live — 60 req / min
/api/listing/discover — 12 req / 2 min
/api/listing/extract — 10 req / 2 min

12. Autonomous Server Ops Agent

The agentic-workflow/ directory contains a standalone Python agent that autonomously monitors, diagnoses, and remediates the production server. Once deployed, it starts working immediately with no human intervention.

Architecture

systemd / Docker (loop mode)
  → entrypoint.sh
      → YAML Workflow Runner
          → Agent Loop (plan → tool → observe → answer)
              → execute_bash_command    (real server commands)
              → fetch_system_logs       (journalctl error logs)
              → search_knowledge_base   (Elasticsearch RAG)
              → search_web              (DuckDuckGo fallback)
              → learn_resolution        (write fixes back to ES)
              → invoke_elastic_agent    (optional Kibana sub-agent)
          → Structured Ops Report

How It Works

The default workflow (workflows/ubuntu_auto_ops.yaml) implements a multi-tiered strategy:

Gather symptoms — fetch_system_logs, execute_bash_command (top, df -h, free -m, etc.)
Classify severity:
- Tier 1 (simple): disk full, service stopped, memory leak → auto-remediate immediately
- Tier 2 (complex): kernel panic, unknown tracebacks → search KB first, then web, then remediate
Self-heal — executes fix commands (apt-get clean, systemctl restart, firewall rules, etc.)
Self-learn — writes successful resolutions back to Elasticsearch via learn_resolution
Report — structured ops report with diagnosis, actions, and recommendations

In loop mode this cycle repeats every hour (configurable). Failed iterations retry with exponential backoff.

Real-World Example

On first deployment to the production VPS, the agent autonomously:

Detected SSH brute-force attacks (330+ attempts from a single IP)
Installed and configured fail2ban (24h ban for SSH brute force)
Enabled UFW firewall (allow 22/80/443/3000/3001 only)
Hardened SSH (PermitRootLogin prohibit-password)

Why This Is Not Ordinary RAG

Feature	Ordinary RAG	This Agent
Retrieval	Fixed, one-shot	Dynamic, multi-round, agent-decided
Decision	None	Plan → tool → observe → decide again
Actions	Read-only	Executes real bash commands
Learning	None	Writes resolutions back to ES
Sub-agents	None	Can delegate to Kibana Agent Builder

Ops Agent Deployment

systemd (production VPS):

cd agentic-workflow
bash deploy/setup-vm.sh
sudo nano /opt/agentic-workflow/.env
sudo systemctl start agentic-workflow
sudo systemctl enable agentic-workflow
sudo journalctl -u agentic-workflow -f

Docker Compose:

cd agentic-workflow
cp .env.example .env
docker compose up -d

Ops Agent Environment Variables

Variable	Default	Description
`ELASTIC_URL`	—	Elasticsearch endpoint
`ELASTIC_API_KEY`	—	Elasticsearch API key
`OPENAI_API_KEY`	—	OpenAI-compatible API key
`OPENAI_BASE_URL`	—	Custom LLM base URL
`OPENAI_CHAT_MODEL`	`glm-5`	Chat model name
`LLM_REQUEST_TIMEOUT`	`120`	Request timeout (seconds)
`LLM_MAX_RETRIES`	`3`	Retry count on transient errors
`RUN_MODE`	`workflow`	`loop` / `workflow` / `ask` / `chat`
`LOOP_INTERVAL_SECONDS`	`3600`	Interval between loop iterations
`BOOTSTRAP_ON_START`	`true`	Create ES indices on startup

Resilience

LLM timeouts: retries with exponential backoff (2s, 4s, 8s …)
Agent loop errors: caught and returned as error report, never crashes the process
Workflow failures: short retry delay (60s × failure count, capped at 5 min) instead of full interval
Process crashes: systemd Restart=on-failure / Docker restart: unless-stopped
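
The two delay rules above reduce to simple arithmetic. The agent itself is written in Python; the sketch below uses TypeScript only to stay consistent with the other examples in this README, and the function names are assumptions.

```typescript
// Delay before LLM retry number `attempt` (1-based): 2s, 4s, 8s, ...
function llmBackoffSeconds(attempt: number): number {
  return 2 ** attempt;
}

// Delay before re-running a failed workflow:
// 60s per consecutive failure, capped at 5 minutes (300s).
function workflowRetryDelaySeconds(failureCount: number): number {
  return Math.min(60 * failureCount, 300);
}
```

Under these rules the fifth consecutive workflow failure reaches the 300-second cap, and every later failure waits the same 300 seconds rather than the full loop interval.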

13. Testing

Unit Tests

pnpm test:unit

Vitest: utility functions, store logic, type conversions.

E2E Tests

pnpm test:e2e

Playwright: full user flows, cross-page state, responsive layout.

14. Security Best Practices

All API keys stored server-side in .env.local
Frontend uses only NEXT_PUBLIC_ prefixed public config
CORS whitelist restricts cross-origin requests
Uploads use presigned URLs — no key exposure
Input validation via Zod schemas
Type-safe outputs throughout
Production server hardened by the autonomous ops agent (fail2ban, UFW, SSH)

15. Development Guide

Adding a New Page

Create directory under apps/web/src/app/
Add page.tsx and optional loading.tsx
Use useSessionStore for state management
Add route to next.config.ts headers config

Adding a New API

Create directory under apps/api/src/app/api/
Add route.ts with HTTP method handlers
Use ensureCrossOriginAllowed for CORS
Validate input with Zod schemas
Add types to packages/contracts

Adding a New Agent

Create file in apps/api/src/lib/agents/
Export a run function accepting context parameters
Use callGemini or callGeminiJson for model calls
Return structured results

16. Engineering Deep-Dive: What We Actually Built and Why

This section explains the hard technical decisions, failure modes we handled, and measured performance — not just what features exist, but why they work the way they do.

16.1 Why Multi-Model Routing (Smart Gateway)

Problem: Gemini 2.5 Flash is fast and cheap but occasionally produces shallow or incorrect answers on complex reasoning tasks (e.g., multi-step risk analysis, legal clause interpretation). Gemini 2.5 Pro is more capable but 3–5× slower and more expensive.

Why not just use Pro everywhere? Cost and latency. A single live scan session fires ~60 vision calls/min. At Pro pricing, this becomes economically unsustainable. Flash handles 95%+ of queries adequately.

Our solution — Schema-Wrapping Gateway:

We inject an optional _escalateToPro: boolean into every Zod schema sent to Flash.
Flash evaluates its own confidence. If overwhelmed, it sets the flag instead of guessing.
Our interceptor detects this via raw JSON.parse (not Zod — to avoid validation crashes on incomplete schemas) and transparently re-routes to Pro.
The escalation path gets a 1.5× timeout budget (30s vs 20s default) to accommodate Pro's longer thinking time.

Why not use Gemini's native tool calling for this? We tried. The Gemini API throws ApiError: Function calling with a response mime type: 'application/json' is unsupported. Tool calling and strict JSON mode are mutually exclusive. Schema wrapping bypasses this entirely.

Hard lesson learned: The zod-to-json-schema library silently outputs {} in monorepo environments, where duplicate Zod copies cause its instanceof checks to fail. We wrote a custom createGeminiSchema() mapper that uses constructor.name lookups instead, so schema translation works across all deployment targets.

16.2 Why This RAG Architecture

Problem: Tenants need actionable rental advice grounded in Australian tenancy law and best practices, but Gemini hallucinates legal advice when unconstrained.

Why Cohere + Qdrant instead of just prompting Gemini?

Gemini has no guaranteed access to niche Australian rental law documents.
RAG lets us control exactly which knowledge the model can cite — no hallucinated legal references.
Cohere's embed-english-v3 + rerank-v4.0-pro consistently outperformed Gemini's own embedding on our domain-specific content in informal testing.

Pipeline details:

Chunking: 420-char sliding window with 80-char overlap, sentence-boundary-aware splitting (not naive character slicing).
Retrieval: Dense vector search via Qdrant, top-12 candidates → Cohere rerank → top-K (configurable, default 5).
Generation: Gemini Flash with strict knowledgeAnswerSchema enforcement (summary ≤180 chars, 2–4 key points ≤120 chars each, confidence rating).

Fallback chain (3 layers):

RAG runtime missing (no Qdrant/Cohere keys) → falls back to keyword-based local search over cached knowledge docs.
Rerank fails → uses raw retrieval scores, continues pipeline.
Answer generation fails → returns pre-built fallback answer from matched snippets with confidence: "low".
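
The first fallback layer (keyword search over cached documents) could look roughly like the sketch below. The scoring rule, the `KnowledgeDoc` shape, and the function names are assumptions for illustration; the project's actual matcher may weigh terms differently.

```typescript
interface KnowledgeDoc {
  id: string;
  text: string;
}

// Lowercase and split into word tokens, dropping very short words.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 2);
}

// Score each doc by how many distinct query terms it contains.
function keywordSearch(query: string, docs: KnowledgeDoc[], topK = 5) {
  const terms = Array.from(new Set(tokenize(query)));
  return docs
    .map((doc) => {
      const docTerms = new Set(tokenize(doc.text));
      const hits = terms.filter((term) => docTerms.has(term)).length;
      return { id: doc.id, score: hits };
    })
    .filter((match) => match.score > 0) // drop docs with no overlap at all
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

A matcher like this needs no network access or API keys, which is why it suits the case where Qdrant or Cohere credentials are missing.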

16.3 Handling Structured Output Failures

Every AI call goes through callGeminiJson() which enforces strict responseMimeType: "application/json" + responseJsonSchema. But models still fail:

Failure Mode	How We Handle It	Where
Model returns empty text	`throw Error("empty response")` → caught by caller, returns `fallbackReason`	`ai.ts:L69-71`
JSON doesn't match Zod schema	`schema.parse()` throws → caller catches, returns degraded result	Every agent
Model times out	`withTimeout()` wrapper rejects after deadline → caller returns fallback	All AI calls
Gateway escalation JSON incomplete	Native `JSON.parse` + property check (not Zod) avoids crash	`ai.ts:L108-111`
Vision analysis fails entirely	Returns `{ hazards: [], fallbackReason: "gemini_analyze_failed" }`	`geminiService.ts:L103-111`

Design principle: No single AI failure should crash the request. Every agent function returns a typed result with an optional fallbackReason field, letting the UI render partial data with appropriate caveats.
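
A minimal sketch of this principle, assuming a `withTimeout`-style helper with the signature shown (the real helper's signature is not documented here) and a hypothetical `analyzeSafely` wrapper. The `gemini_analyze_failed` reason is taken from the table above.

```typescript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("timeout")), ms);
  });
  // Always clear the timer so it cannot outlive the request.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

interface HazardResult {
  hazards: string[];
  fallbackReason?: string;
}

// Hypothetical agent wrapper: any failure becomes a typed, empty result.
async function analyzeSafely(
  analyze: () => Promise<string[]>,
  timeoutMs: number,
): Promise<HazardResult> {
  try {
    return { hazards: await withTimeout(analyze(), timeoutMs) };
  } catch {
    return { hazards: [], fallbackReason: "gemini_analyze_failed" };
  }
}
```

Because the wrapper never rejects, a route handler can await it without its own try/catch and still return a well-typed response to the UI.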

16.4 Rate Limiting and Latency Budgets

Server-side rate limits (per-endpoint, in-memory sliding window):

Endpoint	Rate Limit	Timeout Budget
`/api/analyze/live` (live scan)	60 req / 60s	25s (vision)
`/api/analyze` (manual upload)	45 req / 60s	25s
`/api/intelligence`	12 req / 60s	10–18s (parallel agents)
`/api/negotiate`	8 req / 60s	8s
`/api/knowledge/query`	30 req / 60s	9s (RAG generation)
`/api/listing/discover`	12 req / 120s	7s
`/api/listing/extract`	10 req / 120s	8–12s
`/api/compare`	12 req / 60s	—
`/api/tts/alert`	20 req / 60s	10s
`/api/maps/static`	18 req / 60s	10s
`/api/scan/3d/reconstruct`	12 req / 60s	8–14s

All rate-limited endpoints return 429 + Retry-After when exhausted. Smart Gateway escalation multiplies the base timeout by 1.5× for Pro calls.
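
An in-memory sliding-window limiter of the kind described above can be sketched as follows. The class name, the key format, and the `LimitDecision` shape are assumptions; the sketch only shows how a window and a Retry-After value could be derived.

```typescript
interface LimitDecision {
  allowed: boolean;
  retryAfterSeconds: number;
}

// At most `limit` hits per `windowMs` milliseconds for each key.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(key: string, now: number = Date.now()): LimitDecision {
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => now - t < this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      // The oldest hit leaves the window first; that is when a slot frees up.
      const waitMs = recent[0] + this.windowMs - now;
      return { allowed: false, retryAfterSeconds: Math.ceil(waitMs / 1000) };
    }
    recent.push(now);
    this.hits.set(key, recent);
    return { allowed: true, retryAfterSeconds: 0 };
  }
}
```

A handler for `/api/listing/discover` could, for instance, construct `new SlidingWindowLimiter(12, 120_000)` and respond with 429 plus the computed Retry-After value whenever `allowed` is false. State held this way is per process, so it resets on restart and is not shared across instances.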

16.5 Hazard Detection: False Positive / False Negative Handling

The core challenge: Vision models over-detect (false positives) and occasionally miss subtle issues (false negatives).

Strategies implemented:

Severity gating: Only Critical and High severity observations trigger automatic recording during live scan. Medium and Low are displayed as guidance but not persisted without user confirmation.
Bounding-box IoU confirmation: Live observations must appear in ≥2 consecutive focused frames with IoU (Intersection-over-Union) overlap ≥ threshold before being confirmed as a real hazard. This eliminates transient false positives from motion blur or lighting changes.
Multi-image deduplication: Manual upload mode runs dedupeHazards() across all photos to merge duplicate findings (e.g., the same crack photographed from two angles).
Constraint injection: Prompts explicitly state: "Detect visible issues only. Do not infer hidden problems without image evidence." and "Do not mention image quality, model uncertainty, coordinates, or technical scanning terms." This reduces speculative false positives.
4-tier severity system: Critical > High > Medium > Low, each with weighted penalty scores for the overall risk scoring algorithm.
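
The IoU confirmation strategy above can be sketched in a few lines. The `Box` shape, the function names, and the 0.5 default threshold are illustrative assumptions; the project's configured threshold is not stated here.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Intersection-over-Union of two axis-aligned boxes, in [0, 1].
function iou(a: Box, b: Box): number {
  const left = Math.max(a.x, b.x);
  const top = Math.max(a.y, b.y);
  const right = Math.min(a.x + a.width, b.x + b.width);
  const bottom = Math.min(a.y + a.height, b.y + b.height);
  const intersection = Math.max(0, right - left) * Math.max(0, bottom - top);
  const union = a.width * a.height + b.width * b.height - intersection;
  return union > 0 ? intersection / union : 0;
}

// True when the last `minFrames` boxes each overlap their predecessor enough.
// The 0.5 threshold is an illustrative value, not the project's setting.
function isConfirmed(frames: Box[], threshold = 0.5, minFrames = 2): boolean {
  if (frames.length < minFrames) return false;
  const recent = frames.slice(-minFrames);
  for (let i = 1; i < recent.length; i += 1) {
    if (iou(recent[i - 1], recent[i]) < threshold) return false;
  }
  return true;
}
```

A detection that jumps across the frame between consecutive captures yields a low IoU and stays unconfirmed, which is how transient artifacts from motion blur get filtered out.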

16.6 Hazard Detection Evaluation (First Run)

We ran the full vision pipeline (callGeminiJson → hazardDraftsArraySchema) against 19 local test images across 5 inspection sets (living room, bathroom, kitchen, bedroom, laundry).

Metric	Value
Model	Gemini 2.5 Flash
Test images	19 (across 5 sets of 3–4 photos each)
False positives	0 — model did not hallucinate any hazards on clean properties
Avg latency per set	6.1s (3–4 images per call)
Min / Max latency	4.0s / 8.6s

Key finding: The model reported zero false positives on well-maintained properties. It correctly identifies clean rooms as hazard-free rather than fabricating issues. This is by design: the prompt explicitly instructs "Detect visible issues only. Do not infer hidden problems without image evidence."

Limitation: This first evaluation ran against clean, well-maintained rental photos. A comprehensive recall evaluation requires a labelled dataset with known defects (mould, cracking, exposed wiring, pest evidence). This is planned as future work (see Section 16.9).

Evaluation script: apps/api/eval-hazard.ts — reproducible with pnpm dlx tsx --env-file=../../.env.local eval-hazard.ts

16.7 Observed Performance (Informal Benchmarks)

These are real-world observations from development and production testing, not formal benchmarks with statistical rigor.

Metric	Observed Value	Notes
Single image analysis (Flash)	2–4s	1 image, manual upload path
Multi-image analysis (4 photos)	4–8s	Parallel base64 fetch + single model call
Live frame analysis	1.5–3s	Optimized prompt, single frame
Intelligence report (4 agents parallel)	6–12s	`Promise.allSettled` across geo/maps-grounded/community/agency
Full report generation	8–15s	Progressive enhancement, modules load independently
Knowledge base RAG query	1.5–3s	Embed + Qdrant search + rerank + generation
Smart Gateway escalation overhead	+3–8s	Pro model thinking time on complex queries
3D room reconstruction	10–20s	3–8 photos → per-image analysis → multi-view fusion → scene synthesis

16.8 Test Coverage

Layer	Test Files	Modules Covered
Unit (Vitest)	19	Scoring, checklist prefill, live guidance, live room state, live scan, location, history store, report snapshots, 3D room scenes, room hazards, knowledge query, search relevance, comparison, report display, config, page render
E2E (Playwright)	3	Demo smoke, manual upload smoke, comparison smoke
Total	22	Across `apps/web`, `apps/api`, `packages/contracts`, `tests/e2e`

Modules with deepest unit coverage: scoring.ts (weighted penalty calculation, verdict derivation), liveScan.ts (IoU computation, focus confirmation, alert key deduplication), liveRoomState.ts (room state machine transitions).

16.9 What Would Improve with More Time

Defect recall evaluation: Run a labeled dataset of 200+ rental photos with known defects through the hazard detector and compute per-category recall. Our first evaluation (Section 16.6) found zero false positives on clean properties; comprehensive recall testing requires photos with visible damage.
A/B testing the Smart Gateway threshold: The _escalateToPro decision is currently model-subjective. A calibration dataset would let us measure escalation accuracy (when Flash escalated but could have answered correctly = unnecessary cost; when Flash didn't escalate but should have = quality loss).
Load testing: Verify rate limit behavior under concurrent users. Current limits are based on Gemini API quotas, not empirical server capacity.
RAG retrieval quality metrics: Compute MRR@5 and NDCG@5 on a query set against the knowledge base to validate chunk size and overlap parameters.
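
For reference, the retrieval metrics named in the last item could be computed as in the sketch below. It assumes binary relevance judgments (a document is either relevant to a query or not); the function names are our own.

```typescript
// Reciprocal rank of the first relevant result within the top k (0 if none).
function reciprocalRankAtK(rankedIds: string[], relevant: Set<string>, k = 5): number {
  const index = rankedIds.slice(0, k).findIndex((id) => relevant.has(id));
  return index === -1 ? 0 : 1 / (index + 1);
}

// Mean reciprocal rank over several queries.
function mrrAtK(runs: { rankedIds: string[]; relevant: Set<string> }[], k = 5): number {
  if (runs.length === 0) return 0;
  const total = runs.reduce(
    (sum, r) => sum + reciprocalRankAtK(r.rankedIds, r.relevant, k),
    0,
  );
  return total / runs.length;
}

// NDCG@k with binary relevance: DCG of the ranking divided by the ideal DCG.
function ndcgAtK(rankedIds: string[], relevant: Set<string>, k = 5): number {
  const dcg = rankedIds
    .slice(0, k)
    .reduce((sum, id, i) => sum + (relevant.has(id) ? 1 / Math.log2(i + 2) : 0), 0);
  const idealHits = Math.min(relevant.size, k);
  let idealDcg = 0;
  for (let i = 0; i < idealHits; i += 1) idealDcg += 1 / Math.log2(i + 2);
  return idealDcg > 0 ? dcg / idealDcg : 0;
}
```

Re-running such metrics while varying chunk size and overlap would give a direct signal on whether 420/80 is a good setting for this knowledge base.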

17. SafeOps Execution Framework

Anti-hallucination security layer — prevents the LLM from causing real damage by enforcing policy gates, dry-run simulation, and self-verification before every destructive action.

State Machine

Every bash command follows this execution flow:

PROPOSED → CLASSIFIED → DRY_RUN → SELF_VERIFIED → EXECUTING → POST_CHECK → COMPLETED / ROLLED_BACK

At any stage, a command can be REJECTED with a structured reason.

3-Tier Permission System

Level	Examples	Behavior
READ_ONLY	`df -h`, `cat`, `journalctl`, `systemctl status`	Execute immediately, no gate
MODIFY	`systemctl restart`, `apt install`, `ufw allow`	Dry-run → LLM self-verification → execute
DANGEROUS	`rm -rf /`, `dd`, `mkfs`, `reboot`, fork bombs	Automatically BLOCKED

Key Features

Command Whitelist / Blacklist — 40+ read-only prefixes, 20+ modify prefixes, 18 blacklist regex patterns
Dry-Run Simulation — generates human-readable impact descriptions before execution
LLM Self-Verification — model must explicitly confirm YES before any state-changing command
Auto-Rollback Registry — 9 rollback patterns (e.g., systemctl stop X → systemctl start X). If post-execution health check fails, undo is automatic
Structured Audit Log — every operation (proposed / approved / executed / rolled back) produces a JSON audit entry with timestamp, permission level, dry-run result, and execution output
Unknown Command Protection — any unrecognized command defaults to DANGEROUS and is blocked
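
The classification and rollback ideas above can be illustrated with a much-reduced sketch. The real implementation is Python (safe_ops.py); this TypeScript version, its tiny pattern lists, the shell-chaining check, and the function names are all assumptions made for illustration.

```typescript
type PermissionLevel = "READ_ONLY" | "MODIFY" | "DANGEROUS";

// Tiny illustrative lists; the real whitelist and blacklist are far larger.
const BLACKLIST: RegExp[] = [/\brm\s+-rf\s+\/(\s|$)/, /\bmkfs\b/, /\bdd\s+if=/, /\breboot\b/];
const READ_ONLY_PREFIXES = ["df ", "cat ", "journalctl", "systemctl status"];
const MODIFY_PREFIXES = ["systemctl restart", "systemctl stop", "apt install", "ufw allow"];

// Blacklist wins, then whitelists; anything unrecognized is DANGEROUS.
function classifyCommand(command: string): PermissionLevel {
  const cmd = command.trim();
  if (BLACKLIST.some((pattern) => pattern.test(cmd))) return "DANGEROUS";
  // Sketch-only simplification: refuse chaining, pipes, and redirection outright.
  if (/[;&|<>$]/.test(cmd)) return "DANGEROUS";
  if (READ_ONLY_PREFIXES.some((prefix) => cmd.startsWith(prefix))) return "READ_ONLY";
  if (MODIFY_PREFIXES.some((prefix) => cmd.startsWith(prefix))) return "MODIFY";
  return "DANGEROUS";
}

// Derive an undo command for one reversible pattern (stop -> start).
function deriveRollback(command: string): string | null {
  const match = /^systemctl stop (\S+)$/.exec(command.trim());
  return match ? `systemctl start ${match[1]}` : null;
}
```

Checking the blacklist before any whitelist matters: a command that merely starts with a harmless prefix but contains a destructive pattern is still blocked.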

Test Coverage

35 unit tests covering command classification (30 parameterized cases), rollback derivation, dry-run descriptions, gate state machine integration, audit logging, and verification prompt building.

Source: agentic-workflow/src/agentic_workflow_agent/agent/safe_ops.py

18. Retrieval Planner for Rental Intelligence

Multi-strategy RAG — decomposes free-form queries into typed sub-questions, routes each to an optimal retrieval strategy, executes in parallel, and fuses the results.

How It Works

User query → Gemini Query Decomposer → 1–5 typed sub-questions
  ├── defect       → KB RAG (top_k=5, rerank=on)
  ├── regulation   → KB RAG (top_k=3, tag-filtered: regulation/legal)
  ├── neighborhood → KB RAG (top_k=4, tag-boosted: noise/safety/location)
  └── agency       → KB RAG (top_k=3, tag-filtered: agency/landlord)
         ↓ Promise.allSettled (parallel)
Result Fusion → deduplicate matches → Gemini synthesis → unified answer
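
The routing and parallel step can be sketched as follows. The real planner is a TypeScript module that uses Promise.allSettled. This Python sketch uses `asyncio.gather(..., return_exceptions=True)` for the same settle-everything behavior. The `Strategy` dataclass, `run_plan`, and the stand-in `demo_retrieve` are hypothetical, and rerank is assumed to be off wherever the diagram does not say otherwise.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Strategy:
    top_k: int
    rerank: bool = False          # assumption: off unless the diagram says on
    tags: Tuple[str, ...] = ()
    tag_mode: str = "none"        # "filter", "boost", or "none"


# Parameters copied from the diagram above.
STRATEGIES: Dict[str, Strategy] = {
    "defect": Strategy(top_k=5, rerank=True),
    "regulation": Strategy(top_k=3, tags=("regulation", "legal"), tag_mode="filter"),
    "neighborhood": Strategy(top_k=4, tags=("noise", "safety", "location"), tag_mode="boost"),
    "agency": Strategy(top_k=3, tags=("agency", "landlord"), tag_mode="filter"),
}

Retriever = Callable[[str, Strategy], Awaitable[List[dict]]]


async def run_plan(sub_questions: List[Tuple[str, str]], retrieve: Retriever) -> List[dict]:
    """Run every (category, question) pair in parallel; one failure does not sink the rest."""
    tasks = [retrieve(text, STRATEGIES[category]) for category, text in sub_questions]
    settled = await asyncio.gather(*tasks, return_exceptions=True)
    results = []
    for (category, text), outcome in zip(sub_questions, settled):
        failed = isinstance(outcome, BaseException)
        results.append({
            "category": category,
            "question": text,
            "matches": [] if failed else outcome,
            "error": str(outcome) if failed else None,
        })
    return results


async def demo_retrieve(question: str, strategy: Strategy) -> List[dict]:
    """Stand-in retriever for the sketch; a real one would query the KB RAG."""
    if question == "unreliable agent":
        raise RuntimeError("vector store unavailable")
    return [{"id": f"{question}-{i}", "score": 1.0 - 0.1 * i} for i in range(strategy.top_k)]
```

Calling `asyncio.run(run_plan([("defect", "cracked walls"), ("agency", "unreliable agent")], demo_retrieve))` returns one entry per sub-question, with the failed one carrying an error string instead of matches.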

Key Design Decisions

Query decomposition via Gemini — a single complex query like "cracked walls, noisy area, unreliable agent?" is split into 3 independent retrieval tasks, each with optimal parameters
Category-specific strategies — defect queries use high top_k with rerank for comprehensive coverage; regulation queries use strict tag filtering for precision
Graceful degradation — if Gemini decomposition fails, falls back to single-query mode; if RAG fails, falls back to local keyword search
Conflict detection — the fusion layer identifies contradictions across sub-question answers
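
A rough sketch of two of the decisions above: deduplicating matches during fusion and falling back when a stage fails. The match shape (`id`, `score`), the `"general"` category, and all function names are assumptions made for illustration. Conflict detection is not modeled here.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Match = Dict[str, object]


def fuse_matches(result_sets: Iterable[List[Match]]) -> List[Match]:
    """Merge per-sub-question results, keeping the best-scoring copy of each chunk."""
    best: Dict[object, Match] = {}
    for matches in result_sets:
        for match in matches:
            current = best.get(match["id"])
            if current is None or match["score"] > current["score"]:
                best[match["id"]] = match
    return sorted(best.values(), key=lambda m: m["score"], reverse=True)


def decompose_or_fallback(
    query: str, decompose: Callable[[str], List[Tuple[str, str]]]
) -> List[Tuple[str, str]]:
    """Use the decomposer's sub-questions (capped at 5), else single-query mode."""
    try:
        sub_questions = decompose(query)
    except Exception:
        sub_questions = []
    return sub_questions[:5] if sub_questions else [("general", query)]


def search_with_fallback(
    query: str,
    rag_search: Callable[[str], List[Match]],
    keyword_search: Callable[[str], List[Match]],
) -> List[Match]:
    """Try RAG first; on any error, fall back to local keyword search."""
    try:
        return rag_search(query)
    except Exception:
        return keyword_search(query)
```

Because each fallback returns the same shape as the primary path, the fusion and synthesis steps do not need to know which path produced the matches.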

API

POST /api/knowledge/plan — rate-limited at 20 req/min.

Note

Due to hardware limitations of the free-tier cloud service, the RAG pipeline (embedding → vector search → rerank → generation) may respond slowly on its first invocation. Subsequent requests are significantly faster once caches are warm.

Source: apps/api/src/lib/knowledge/retrievalPlanner.ts

19. Project Timeline

| Date | Milestone |
| --- | --- |
| 2026-03-13 | Project initialized: monorepo, apps, packages, agentic workflow agent |
| 2026-03-14 | VPS deployment, knowledge base, security hardening |
| 2026-03-15 | Autonomous ops agent deployed to production; first auto-remediation (fail2ban + UFW + SSH hardening); Smart Gateway implemented |

License

MIT
