clocksmith
diff --git a/README.md b/README.md
Lines changed: 79 additions & 69 deletions
diff --git a/TODO.md b/TODO.md
Lines changed: 153 additions & 0 deletions
@@ -1,24 +1,28 @@
-# Reploid: Recursive Self-Improvement Substrate
+# REPLOID
 
-> A long-running browser-native system that can modify its own code.
+> Browser-native sandbox for safe AI agent development and research
 
-**R**ecursive **E**volution **P**rotocol **L**oop **O**ptimizing **I**ntelligent **D**REAMER
-(**D**ynamic **R**ecursive **E**ngine **A**dapting **M**odules **E**volving **R**EPLOID)
-→ REPLOID ↔ DREAMER ↔ ∞
+A containment environment for AI agents that can write and execute code. Built for researchers, alignment engineers, and teams building autonomous systems who need **observability, rollback, and human oversight** — not black-box execution.
 
 ---
 
-See [AGENTS.md](AGENTS.md) for the active code-writing agent profile.
+See [TODO.md](TODO.md) for roadmap | [AGENTS.md](AGENTS.md) for agent profile
 
-## About
+## Why REPLOID?
 
-Reploid is a **self-modifying AI substrate** that demonstrates recursive self-improvement ([RSI](https://en.wikipedia.org/wiki/Recursive_self-improvement)) in a browser environment.
+AI agents that write code are powerful but dangerous. Most frameworks give agents unrestricted filesystem access, shell execution, or Docker root — then hope nothing goes wrong.
 
-**How:** The agent reads code from its VFS → analyzes & improves it → writes back to VFS → hot-reloads → evolves.
+REPLOID takes a different approach: **everything runs in a browser sandbox** with transactional rollback, pre-flight verification, and opt-in human approval gates. The agent can modify its own tools, but every mutation is auditable and reversible.
 
-The agent's "brain" is data in [IndexedDB](https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API). It can modify this data (its own code) while running.
+**Use cases:**
+- **AI safety research** — Study agent behavior in a contained environment
+- **Model comparison** — Arena mode runs multiple LLMs against the same task, picks the best verified solution
+- **Self-modification gating** — Test proposed code changes before committing them
+- **Alignment prototyping** — Experiment with oversight patterns before deploying to production
 
----
+## How It Works
+
+The agent operates on a Virtual File System (VFS) backed by IndexedDB. It can read, write, and execute code — but only within the sandbox. All mutations pass through a verification layer that checks for syntax errors, dangerous patterns, and policy violations.
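
The write path described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `MemoryVfs`, `verifySource`, and `guardedWrite` are hypothetical names, an in-memory `Map` stands in for IndexedDB, and the two regular expressions are placeholder patterns rather than the real rule set.

```javascript
// Hypothetical pattern list; the real verification layer is assumed to be larger.
const DANGEROUS_PATTERNS = [
  { name: "eval call", regex: /\beval\s*\(/ },
  { name: "unbounded while loop", regex: /\bwhile\s*\(\s*true\s*\)/ },
];

// Checks a tool body without running it.
function verifySource(source) {
  const problems = [];
  try {
    // The Function constructor only compiles the body here; it is never called.
    // Limitation of this sketch: module syntax (import/export) is rejected.
    new Function(source);
  } catch (err) {
    problems.push(`syntax error: ${err.message}`);
  }
  for (const { name, regex } of DANGEROUS_PATTERNS) {
    if (regex.test(source)) problems.push(`dangerous pattern: ${name}`);
  }
  return { ok: problems.length === 0, problems };
}

// In-memory stand-in for the IndexedDB-backed VFS.
class MemoryVfs {
  constructor() {
    this.files = new Map();
  }
  read(path) {
    return this.files.get(path);
  }
  write(path, content) {
    this.files.set(path, content);
  }
}

// A write only lands if verification passes.
function guardedWrite(vfs, path, source) {
  const result = verifySource(source);
  if (result.ok) vfs.write(path, source);
  return result;
}
```

Calling `guardedWrite(vfs, "/tools/add.js", "return 1 + 2;")` stores the file, while a body containing `eval(` is rejected and the VFS is left untouched.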
 
 ## Architecture
 
@@ -29,45 +33,55 @@ graph TD
     Tools --> VFS[(Virtual File System)]
 
     subgraph Safety Layer
-        Tools --> Worker(Verification Worker)
-        Worker -.->|Verify| VFS
+        Tools --> Verify[Verification Manager]
+        Verify --> Worker(Web Worker Sandbox)
+        Worker -.->|Check| VFS
+        Arena[Arena Harness] --> Verify
     end
 
-    subgraph Capability
-        Agent --> Reflection[Reflection Store]
-        Agent --> Persona[Persona Manager]
+    subgraph Observability
+        Agent --> Audit[Audit Logger]
+        Agent --> Events[Event Bus]
     end
 ```
 
-### Key Components
+### Safety First
 
-1.  **Core Substrate**:
-    *   `agent-loop.js`: The main cognitive cycle (Think -> Act -> Observe).
-    *   `vfs.js`: Browser-native file system using IndexedDB.
-    *   `llm-client.js`: Unified interface for Cloud (Proxy) and Local (WebLLM) models.
+1.  **Verification Manager**: All code changes pass through pre-flight checks in an isolated Web Worker. Flags syntax errors, suspected infinite loops, `eval()`, and other dangerous patterns before they reach the VFS.
 
-2.  **Safety Mechanisms**:
-    *   **Verification Worker**: Runs proposed code changes in a sandboxed Web Worker to check for syntax errors and malicious patterns (infinite loops, `eval`) before writing to VFS.
-    *   **Genesis Factory**: Creates immutable snapshots ("Lifeboats") of the kernel for recovery.
+2.  **VFS Snapshots**: Transactional rollback. Capture state before mutations, restore if verification fails. No permanent damage from bad agent decisions.
 
-3.  **Tools**:
-    *   `code_intel`: Lightweight structural analysis (imports/exports) to save context tokens.
-    *   `read/write_file`: VFS manipulation.
-    *   `python_tool`: Execute Python via Pyodide (WASM).
+3.  **Arena Mode**: Test-driven selection for self-modifications. Multiple candidates compete, only verified solutions win. Located in `/testing/arena/`.
 
----
+4.  **Circuit Breakers**: Rate limiting and iteration caps (default: 50 cycles) prevent runaway agents. Automatic recovery on failure.
 
-## RSI Levels
+5.  **Audit Logging**: Every tool call, VFS mutation, and agent decision is logged. Logs export as JSON or CSV; audit replay for debugging is still on the roadmap (see [TODO.md](TODO.md)).
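
The snapshot-and-restore idea from item 2 can be sketched in a few lines. This is a hedged illustration under simple assumptions: files are string values in a `Map`, and `withRollback` is a hypothetical helper, not a function from the repository.

```javascript
// Runs a mutation against `files` and restores the previous state
// if the mutation throws or the verification callback rejects the result.
function withRollback(files, mutate, verify) {
  // A shallow copy is sufficient because file contents are immutable strings.
  const snapshot = new Map(files);
  try {
    mutate(files);
    if (!verify(files)) throw new Error("verification failed");
    return { committed: true };
  } catch (err) {
    // Restore in place so existing references to `files` stay valid.
    files.clear();
    for (const [path, content] of snapshot) files.set(path, content);
    return { committed: false, reason: err.message };
  }
}
```

Restoring in place (rather than returning a new `Map`) keeps any other holder of the same object consistent after a failed mutation.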
 
-1.  **Level 1 (Tools):** Agent creates new tools at runtime using `create_tool`.
-2.  **Level 2 (Meta):** Agent improves its own tool creation mechanism.
-3.  **Level 3 (Substrate):** Agent re-architects its entire loop or memory system.
+### Core Components
+
+| Component | Purpose |
+|-----------|---------|
+| `agent-loop.js` | Cognitive cycle (Think → Act → Observe) with circuit breakers |
+| `vfs.js` | Browser-native filesystem on IndexedDB |
+| `llm-client.js` | Multi-provider LLM abstraction (WebLLM, Ollama, Cloud APIs) |
+| `verification-manager.js` | Pre-flight safety checks in sandboxed worker |
+| `arena-harness.js` | Competitive selection for code changes |
 
 ---
 
-## RSI Examples
+## Self-Modification Research
+
+REPLOID is designed to study [recursive self-improvement](https://en.wikipedia.org/wiki/Recursive_self-improvement) (RSI) safely. The agent can modify its own code, but every change is verified, logged, and reversible.
+
+### Modification Levels
 
-### Example 1: Tool Creation (Level 1)
+| Level | Description | Safety Gate |
+|-------|-------------|-------------|
+| **L1: Tools** | Agent creates new tools at runtime | Verification Manager |
+| **L2: Meta** | Agent improves its tool-creation mechanism | Arena Mode |
+| **L3: Substrate** | Agent modifies core loop or memory | Human Approval (planned) |
+
+### Example: Tool Creation (L1)
 **Goal:** "Create a tool that adds two numbers"
 
 ```
@@ -86,7 +100,7 @@ graph TD
 [Agent] ✓ Goal complete
 ```
 
-### Example 2: Meta-Tool Creation (Level 2)
+### Example: Meta-Tool Creation (L2)
 **Goal:** "Build a system that creates tools from descriptions"
 
 ```
@@ -114,7 +128,7 @@ graph TD
 [Agent] I just created a tool-creating tool! (Level 2 RSI)
 ```
 
-### Example 3: Substrate Modification (Level 3)
+### Example: Substrate Modification (L3)
 **Goal:** "Analyze your tool creation process and optimize it"
 
 ```
@@ -140,52 +154,48 @@ graph TD
 
 ---
 
-## Landscape
+## Comparison
 
-Reploid lives in a small but rapidly evolving ecosystem of self-improving agents. We intentionally share compute constraints (browser, IndexedDB) while diverging on safety architecture and ownership.
+| Capability | REPLOID | OpenHands | Claude Code | Devin |
+|------------|---------|-----------|-------------|-------|
+| **Execution** | Browser sandbox | Docker/Linux | Local shell | Cloud SaaS |
+| **Rollback** | VFS snapshots | Container reset | Git | N/A |
+| **Verification** | Pre-flight checks | None | None | Unknown |
+| **Self-modification** | Gated by arena | Unrestricted | N/A | N/A |
+| **Offline capable** | Yes (WebLLM) | Yes | Yes | No |
+| **Inspectable** | Full source | Full source | Partial | Closed |
 
-### WebLLM (MLC AI)
-WebLLM is the inference engine reploid can stand on: deterministic WebGPU execution. It excels at raw token throughput and versioned stability but offers no tools, memory, or self-modification. REPLOID layers VFS, a tool runner, PAWS governance, and substrate/capability boundaries above WebLLM so passive inference becomes an auditable agent capable of planning, testing, and rewriting itself safely.
+**REPLOID's niche:** Safe experimentation with self-modifying agents. It does not aim to be the most powerful agent framework; it aims to be the most observable and recoverable one.
 
-### OpenHands (formerly OpenDevin)
-OpenHands embraces Docker power (shell, compilers, sudo) to tackle arbitrary repos, yet that freedom kills safety—the agent can brick its container with a single bad edit. REPLOID trades GCC for transactional rollback: everything lives inside a browser tab, checkpoints live in IndexedDB, and humans approve cats/dogs diffs before mutations land. We prioritize experimentation accessibility and undo guarantees over unrestricted OS access.
+---
 
-### Gödel Agent
-Gödel Agent explores theoretical RSI by letting reward functions and logic rewrite themselves. It is fascinating math, but it lacks persistent state management, tooling, or human guardrails, so "reward hacking" is inevitable. REPLOID focuses on engineering: reproducible bundles, hot-reloadable modules, and EventBus-driven UI so observers can inspect every mutation. We sacrifice unconstrained search space for transparency and hands-on controllability.
+## Research Questions
 
-### Devin (Cognition)
-Devin shows what proprietary, cloud-scale orchestration can deliver: GPT-4-class reasoning, hosted shells, and long-running plans. But it is a black box—you cannot audit, fork, or run Devin offline. REPLOID is the opposite: a glass-box brain stored locally, fully inspectable and modifiable by its owner. We bet that sovereign, user-controlled RSI will outpace closed SaaS once users can watch and influence every self-improvement step.
+REPLOID exists to study:
 
-| Feature               | REPLOID                | OpenHands          | Gödel Agent           | Devin          |
-|-----------------------|------------------------|--------------------|-----------------------|----------------|
-| Infrastructure        | **Browser (WebGPU/IDB)** | Docker/Linux       | Python/Research       | Cloud SaaS     |
-| Self-Mod Safety       | **High (Worker sandbox + Genesis Kernel)** | Low (root access)  | Low (algorithm focus) | N/A (closed)   |
-| Human Control         | **Granular (PAWS review)**   | Moderate (Stop btn) | Low (automated)        | Moderate (chat)|
-| Recovery              | **Transactional rollback**  | Container reset   | Script restart        | N/A            |
+1. **Containment** — Can browser sandboxing provide meaningful safety guarantees for code-writing agents?
+2. **Verification** — What static/dynamic checks catch dangerous mutations before execution?
+3. **Selection** — Does arena-style competition improve agent outputs vs. single-model generation?
+4. **Oversight** — What human-in-the-loop patterns balance safety with agent autonomy?
 
-**Why REPLOID is different:** Explores the "Ship of Theseus" problem in a tab. Capabilities can mutate aggressively, but the substrate remains recoverable thanks to immutable genesis modules, and IndexedDB checkpoints.
+These are open questions. REPLOID is infrastructure for exploring them, not a set of answers.
 
 ---
 
-## Philosophy
-
-Reploid is an experiment in [**substrate-independent RSI**](https://www.edge.org/response-detail/27126):
+## Quick Start
 
-- The agent's "brain" is just data in IndexedDB
-- The agent can modify this data (its own code)
-- The original source code (genesis) is the evolutionary starting point
-- Every agent instance can evolve differently
-
-**Analogy:**
-- **DNA** = source code on disk (genesis)
-- **Organism** = runtime state in IndexedDB (evolved)
-- **Mutations** = agent self-modifications
-- **Fitness** = agent-measured improvements (faster, better, smarter)
+```bash
+git clone https://github.com/clocksmith/reploid
+cd reploid
+npm install
+npm start
+# Open http://localhost:3000
+```
 
-**Key Question:** Can an AI improve itself faster than humans can improve it?
+Select a model, enter a goal, click "Awaken Agent."
 
 ---
 
 ## License
 
-MIT
+MIT — Use freely, but read the safety warnings first.
@@ -0,0 +1,153 @@
+# REPLOID Roadmap
+
+> Agent Safety Substrate — browser-native infrastructure for safe AI agent development
+
+---
+
+## Phase 1: Stabilize Core ✓
+
+- [x] VFS with IndexedDB persistence
+- [x] Multi-provider LLM client (WebLLM, Ollama, Cloud APIs)
+- [x] Agent loop with 50-iteration circuit breaker
+- [x] Tool runner with Web Worker sandboxing
+- [x] VerificationManager pre-flight checks
+- [x] Rate limiting and circuit breakers
+- [x] Genesis levels (tabula/minimal/full/cli)
+- [x] Streaming response edge cases (buffer flushing at stream end)
+- [x] Circuit breaker half-open state (proper recovery testing)
+- [x] LLM stream timeout handling (30s between chunks)
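
For readers unfamiliar with the half-open state mentioned above, here is a minimal sketch of a three-state circuit breaker. The class name, option names, and thresholds are assumptions made for illustration; the repository's breaker may differ in every detail. The clock is injected so the behaviour can be tested without waiting.

```javascript
class CircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 1000, now = () => Date.now() } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
    this.state = "closed";
    this.failures = 0;
    this.openedAt = 0;
  }

  call(fn) {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.cooldownMs) {
        throw new Error("circuit open");
      }
      // Cooldown elapsed: let a single trial call through.
      this.state = "half-open";
    }
    try {
      const value = fn();
      // Any success closes the circuit and clears the failure count.
      this.state = "closed";
      this.failures = 0;
      return value;
    } catch (err) {
      this.failures += 1;
      // A failed trial call, or too many failures while closed, opens the circuit.
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

The half-open state is what makes recovery testable: after the cooldown, one call probes the dependency, and its outcome decides whether the breaker closes or reopens.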
+
+---
+
+## Phase 2: Safety Infrastructure ✓
+
+### 2.1 Human-in-the-Loop Approval (Opt-in)
+
+Autonomous by default. HITL is opt-in for users who want approval gates.
+
+- [x] Implement HITL controller (`/infrastructure/hitl-controller.js`)
+- [x] Module registration with capabilities (APPROVE_CORE_WRITES, etc.)
+- [x] Approval queue with callbacks, timeouts, statistics
+- [x] Diff viewer for proposed changes (`/ui/components/diff-viewer-ui.js`)
+- [x] UI widget for approval queue (`/ui/components/hitl-widget.js`)
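
An approval queue with callbacks, timeouts, and statistics could look roughly like the sketch below. `ApprovalQueue` and its methods are hypothetical and do not describe the actual `hitl-controller.js` API; the clock is injected and expiry is triggered explicitly to keep the example synchronous.

```javascript
class ApprovalQueue {
  constructor({ timeoutMs = 60000, now = () => Date.now() } = {}) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.pending = new Map();
    this.stats = { approved: 0, rejected: 0, timedOut: 0 };
    this.nextId = 1;
  }

  // Queues a proposed change; onDecision(decision, change) fires exactly once.
  request(change, onDecision) {
    const id = this.nextId++;
    this.pending.set(id, { change, onDecision, requestedAt: this.now() });
    return id;
  }

  resolve(id, decision) {
    const entry = this.pending.get(id);
    if (!entry) return false; // unknown or already decided
    this.pending.delete(id);
    this.stats[decision] += 1;
    entry.onDecision(decision, entry.change);
    return true;
  }

  approve(id) {
    return this.resolve(id, "approved");
  }

  reject(id) {
    return this.resolve(id, "rejected");
  }

  // Treats requests older than timeoutMs as timed out (not approved).
  expireStale() {
    for (const [id, entry] of [...this.pending]) {
      if (this.now() - entry.requestedAt >= this.timeoutMs) {
        this.resolve(id, "timedOut");
      }
    }
  }
}
```

Treating a timeout as a non-approval is a design choice of this sketch: an unanswered request never lets a change through by default.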
+
+### 2.2 Audit Logging Integration
+
+- [x] Wire AuditLogger into ToolRunner
+- [x] Log all tool executions (name, args, duration, success/error)
+- [x] Log VFS mutations with before/after byte counts
+- [x] Core file writes logged with WARN severity
+- [x] Structured audit export (JSON/CSV) via `AuditLogger.exportJSON()` / `exportCSV()`
+- [x] Download audit logs via `AuditLogger.download('json')` / `download('csv')`
+- [ ] Implement audit replay for debugging
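
To make the JSON/CSV export concrete, here is a small sketch of a structured log with both export formats. `SketchAuditLog`, `asJson`, `asCsv`, and the column names are invented for this example and are not the real `AuditLogger` interface.

```javascript
class SketchAuditLog {
  constructor() {
    this.entries = [];
  }

  // Stores a copy so later changes to the caller's object do not alter the log.
  record(entry) {
    this.entries.push({ ...entry });
  }

  asJson() {
    return JSON.stringify(this.entries, null, 2);
  }

  asCsv() {
    const columns = ["tool", "durationMs", "success", "error"];
    const escape = (value) => {
      const text = value === undefined || value === null ? "" : String(value);
      // Quote fields containing commas, quotes, or newlines; double embedded quotes.
      return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
    };
    const rows = this.entries.map((entry) =>
      columns.map((column) => escape(entry[column])).join(",")
    );
    return [columns.join(","), ...rows].join("\n");
  }
}
```

The CSV escaping matters in practice because error messages often contain commas and quotes, which would otherwise shift columns on import.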
+
+### 2.3 Arena Mode (Test-Driven Selection)
+
+- [x] VFSSandbox — snapshot/restore isolation
+- [x] ArenaCompetitor — competitor definition
+- [x] ArenaMetrics — results ranking
+- [x] ArenaHarness — competition orchestrator
+- [x] Wire arena into ToolRunner for self-mod gating (opt-in via `setArenaGating(true)`)
+- [ ] Integration tests for arena harness
+- [ ] UI for arena results visualization
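
The arena flow (isolated sandboxes, a shared test, ranked results) can be sketched as below. This is an assumption-laden illustration: `runArena` is a hypothetical function, a copied `Map` stands in for VFSSandbox, and "smallest total output wins among passing candidates" is an arbitrary ranking chosen only for the example.

```javascript
// candidates: [{ name, apply(sandbox) }]; passes(sandbox) returns true when tests succeed.
function runArena(baseFiles, candidates, passes) {
  const results = candidates.map((candidate) => {
    // Each competitor mutates its own copy, so the base files are never touched.
    const sandbox = new Map(baseFiles);
    let passed = false;
    try {
      candidate.apply(sandbox);
      passed = Boolean(passes(sandbox));
    } catch (err) {
      passed = false; // a crashing candidate simply loses
    }
    let bytes = 0;
    for (const content of sandbox.values()) bytes += content.length;
    return { name: candidate.name, passed, bytes, sandbox };
  });

  // Hypothetical ranking: among passing candidates, prefer the smallest output.
  const passing = results.filter((result) => result.passed);
  passing.sort((a, b) => a.bytes - b.bytes);
  return { results, winner: passing[0] || null };
}
```

Only the winner's sandbox would then be committed to the real VFS; when no candidate passes, `winner` is `null` and nothing changes.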
+
+---
+
+## Phase 3: Trust Building ✓
+
+### 3.1 Verification Hardening
+
+- [x] Expand VerificationManager patterns (20+ dangerous patterns)
+- [x] Pattern-based static analysis for dangerous code
+- [x] Capability-based permissions (`/tools/` can only write to `/tools/`, `/apps/`, `/.logs/`)
+- [x] Complexity heuristics (warn on large files, many functions)
+- [ ] Add cryptographic signing for approved modules
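
The capability rule quoted above (`/tools/` may only write to `/tools/`, `/apps/`, `/.logs/`) can be expressed as a prefix check. The sketch below is illustrative only: `WRITE_RULES` and `canWrite` are hypothetical names, and denying unknown callers and any path containing `..` are assumptions of this example.

```javascript
// Maps a caller's location to the path prefixes it may write to.
const WRITE_RULES = {
  "/tools/": ["/tools/", "/apps/", "/.logs/"],
};

function canWrite(callerPath, targetPath) {
  // Reject traversal segments so "/tools/../core/x.js" cannot escape its prefix.
  if (targetPath.split("/").includes("..")) return false;
  for (const [callerPrefix, allowedPrefixes] of Object.entries(WRITE_RULES)) {
    if (callerPath.startsWith(callerPrefix)) {
      return allowedPrefixes.some((prefix) => targetPath.startsWith(prefix));
    }
  }
  // Callers without an explicit rule get no write access in this sketch.
  return false;
}
```

For example, `canWrite("/tools/add.js", "/apps/demo.js")` is allowed, while a write from the same tool to `/core/agent-loop.js` is denied.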
+
+### 3.2 Genesis Factory
+
+- [x] Genesis snapshot system (`/infrastructure/genesis-snapshot.js`)
+- [x] "Lifeboat" immutable kernel backups (localStorage)
+- [x] One-click rollback via `restoreSnapshot()` / `restoreFromLifeboat()`
+- [x] Export/import genesis bundles
+
+### 3.3 Observability
+
+- [x] Real-time mutation stream (`Observability.recordMutation()`)
+- [x] Agent decision trace (`Observability.recordDecision()`)
+- [x] Token usage and cost tracking with per-model breakdown
+- [x] Performance metrics (LLM latency, tool latency, error rate)
+- [x] Full dashboard via `Observability.getDashboard()`
+
+---
+
+## Phase 4: External Validation
+
+- [ ] Security audit of sandbox boundaries
+- [ ] Publish safety primitives as standalone library
+- [ ] Academic paper on browser-native agent containment
+- [ ] Compliance documentation (SOC2-style controls)
+
+---
+
+## Optional: Moonshots
+
+These are high-value but high-effort. Pursue only after Phase 3.
+
+### Policy Engine
+
+- [ ] Upgrade RuleEngine from stub to real policy enforcement
+- [ ] Define declarative safety policies (e.g., "no network calls from tools")
+- [ ] Runtime policy violation detection
+
+### Formal Verification
+
+- [ ] Type-level guarantees for tool outputs
+- [ ] Proof-carrying code for self-modifications
+- [ ] Invariant checking across mutations
+
+### Multi-Agent Coordination
+
+- [ ] Swarm orchestration (`blueprints/0x000034-swarm-orchestration.md`)
+- [ ] Cross-tab coordination (`blueprints/0x00003A-tab-coordination.md`)
+- [ ] Consensus protocols for distributed agents
+
+### WebRTC P2P
+
+- [ ] Peer-to-peer agent communication (`blueprints/0x00003E-webrtc-swarm-transport.md`)
+- [ ] Distributed VFS sync
+- [ ] Federated learning primitives
+
+---
+
+## Not Planned
+
+These are explicitly out of scope:
+
+- **Docker/OS access** — Browser sandbox is the security boundary
+- **Unrestricted self-modification** — Always gated by verification
+- **Autonomous deployment** — Human approval required for production changes
+
+---
+
+## Metrics for Success
+
+| Metric | Target | Current |
+|--------|--------|---------|
+| Core module test coverage | >80% | ~40% |
+| Mean time to recovery (bad mutation) | <5s | ~30s |
+| HITL adoption (users who opt in) | tracked | ready |
+| Audit log completeness | 100% | ~95% |
+| Arena pass rate (self-mod gating) | >90% | ready |
+
+---
+
+## Timeline Estimate
+
+No dates — these are sequenced priorities:
+
+1. **Phase 1** — stabilization ✓
+2. **Phase 2** — safety infrastructure ✓
+3. **Phase 3** — trust building ✓
+4. **Phase 4** — validation (current)
+
+Fund with existing revenue. No external pressure on timelines.