Commit 31295fc

author
X
committed
-
1 parent 182da65 commit 31295fc

29 files changed, with 6289 additions and 374 deletions

README.md

Lines changed: 79 additions & 69 deletions
@@ -1,24 +1,28 @@
1-
# Reploid: Recursive Self-Improvement Substrate
1+
# REPLOID
22

3-
> A long-running browser-native system that can modify its own code.
3+
> Browser-native sandbox for safe AI agent development and research
44
5-
**R**ecursive **E**volution **P**rotocol **L**oop **O**ptimizing **I**ntelligent **D**REAMER
6-
(**D**ynamic **R**ecursive **E**ngine **A**dapting **M**odules **E**volving **R**EPLOID)
7-
→ REPLOID ↔ DREAMER ↔ ∞
5+
A containment environment for AI agents that can write and execute code. Built for researchers, alignment engineers, and teams building autonomous systems who need **observability, rollback, and human oversight** — not black-box execution.
86

97
---
108

11-
See [AGENTS.md](AGENTS.md) for the active code-writing agent profile.
9+
See [TODO.md](TODO.md) for roadmap | [AGENTS.md](AGENTS.md) for agent profile
1210

13-
## About
11+
## Why REPLOID?
1412

15-
Reploid is a **self-modifying AI substrate** that demonstrates recursive self-improvement ([RSI](https://en.wikipedia.org/wiki/Recursive_self-improvement)) in a browser environment.
13+
AI agents that write code are powerful but dangerous. Most frameworks give agents unrestricted filesystem access, shell execution, or Docker root — then hope nothing goes wrong.
1614

17-
**How:** The agent reads code from its VFS → analyzes & improves it → writes back to VFS → hot-reloads → evolves.
15+
REPLOID takes a different approach: **everything runs in a browser sandbox** with transactional rollback, pre-flight verification, and human approval gates. The agent can modify its own tools, but every mutation is auditable and reversible.
1816

19-
The agent's "brain" is data in [IndexedDB](https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API). It can modify this data (its own code) while running.
17+
**Use cases:**
18+
- **AI safety research** — Study agent behavior in a contained environment
19+
- **Model comparison** — Arena mode runs multiple LLMs against the same task, picks the best verified solution
20+
- **Self-modification gating** — Test proposed code changes before committing them
21+
- **Alignment prototyping** — Experiment with oversight patterns before deploying to production
2022

21-
---
23+
## How It Works
24+
25+
The agent operates on a Virtual File System (VFS) backed by IndexedDB. It can read, write, and execute code — but only within the sandbox. All mutations pass through a verification layer that checks for syntax errors, dangerous patterns, and policy violations.
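
A minimal sketch of what such a pre-flight check could look like is given below. It is an illustration, not the project's `verification-manager.js`: the pattern list, the `preflightCheck` name, and the result shape are all assumptions. The syntax check relies on the fact that the `Function` constructor parses a body without running it, so this sketch only handles script-style code (no `import`/`export`) and will not work under a Content Security Policy that forbids dynamic code.

```javascript
// Hypothetical pattern list; the real verification rules are not shown in this README.
const DANGEROUS_PATTERNS = [
  { name: 'eval', regex: /\beval\s*\(/ },
  { name: 'Function constructor', regex: /\bnew\s+Function\s*\(/ },
  { name: 'unbounded while loop', regex: /\bwhile\s*\(\s*true\s*\)/ },
];

// Returns { ok, issues } for a proposed source string without executing it.
function preflightCheck(source) {
  const issues = [];

  // Syntax check: constructing a Function parses the body but does not call it.
  try {
    new Function(source);
  } catch (err) {
    issues.push(`syntax error: ${err.message}`);
  }

  // Static pattern scan for constructs the policy treats as dangerous.
  for (const { name, regex } of DANGEROUS_PATTERNS) {
    if (regex.test(source)) issues.push(`dangerous pattern: ${name}`);
  }

  return { ok: issues.length === 0, issues };
}
```

Regex scanning of this kind is easy to evade (for example through string concatenation), which is one reason the README pairs it with sandboxed execution rather than relying on it alone.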
2226

2327
## Architecture
2428

@@ -29,45 +33,55 @@ graph TD
2933
Tools --> VFS[(Virtual File System)]
3034
3135
subgraph Safety Layer
32-
Tools --> Worker(Verification Worker)
33-
Worker -.->|Verify| VFS
36+
Tools --> Verify[Verification Manager]
37+
Verify --> Worker(Web Worker Sandbox)
38+
Worker -.->|Check| VFS
39+
Arena[Arena Harness] --> Verify
3440
end
3541
36-
subgraph Capability
37-
Agent --> Reflection[Reflection Store]
38-
Agent --> Persona[Persona Manager]
42+
subgraph Observability
43+
Agent --> Audit[Audit Logger]
44+
Agent --> Events[Event Bus]
3945
end
4046
```
4147

42-
### Key Components
48+
### Safety First
4349

44-
1. **Core Substrate**:
45-
* `agent-loop.js`: The main cognitive cycle (Think -> Act -> Observe).
46-
* `vfs.js`: Browser-native file system using IndexedDB.
47-
* `llm-client.js`: Unified interface for Cloud (Proxy) and Local (WebLLM) models.
50+
1. **Verification Manager**: All code changes pass through pre-flight checks in an isolated Web Worker. Catches syntax errors, infinite loops, `eval()`, and other dangerous patterns before they reach the VFS.
4851

49-
2. **Safety Mechanisms**:
50-
* **Verification Worker**: Runs proposed code changes in a sandboxed Web Worker to check for syntax errors and malicious patterns (infinite loops, `eval`) before writing to VFS.
51-
* **Genesis Factory**: Creates immutable snapshots ("Lifeboats") of the kernel for recovery.
52+
2. **VFS Snapshots**: Transactional rollback. Capture state before mutations, restore if verification fails. No permanent damage from bad agent decisions.
5253

53-
3. **Tools**:
54-
* `code_intel`: Lightweight structural analysis (imports/exports) to save context tokens.
55-
* `read/write_file`: VFS manipulation.
56-
* `python_tool`: Execute Python via Pyodide (WASM).
54+
3. **Arena Mode**: Test-driven selection for self-modifications. Multiple candidates compete, only verified solutions win. Located in `/testing/arena/`.
5755

58-
---
56+
4. **Circuit Breakers**: Rate limiting and iteration caps (default: 50 cycles) prevent runaway agents. Automatic recovery on failure.
5957

60-
## RSI Levels
58+
5. **Audit Logging**: Every tool call, VFS mutation, and agent decision is logged. Logs can be exported for debugging and analysis; audit replay is still listed as open in TODO.md.
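
The snapshot-and-restore idea in item 2 can be sketched as follows. This is a hedged illustration only: `MemoryVfs` is an in-memory stand-in for the IndexedDB-backed VFS, and `applyWithRollback` is a hypothetical helper name, not an API from the repository.

```javascript
// In-memory stand-in for the VFS (assumption; the real one is backed by IndexedDB).
class MemoryVfs {
  constructor() {
    this.files = new Map();
  }
  write(path, content) {
    this.files.set(path, content);
  }
  read(path) {
    return this.files.get(path);
  }
  // Copy the path-to-content map so later writes cannot affect the snapshot.
  snapshot() {
    return new Map(this.files);
  }
  restore(snap) {
    this.files = new Map(snap);
  }
}

// Run `mutate`, then `verify`; restore the snapshot if either step fails.
function applyWithRollback(vfs, mutate, verify) {
  const snap = vfs.snapshot();
  try {
    mutate(vfs);
    if (!verify(vfs)) throw new Error('verification failed');
    return { committed: true };
  } catch (err) {
    vfs.restore(snap);
    return { committed: false, reason: err.message };
  }
}
```

Copying the whole map is the simplest approach for a sketch; a real store would more likely record only the paths touched by a mutation.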
6159

62-
1. **Level 1 (Tools):** Agent creates new tools at runtime using `create_tool`.
63-
2. **Level 2 (Meta):** Agent improves its own tool creation mechanism.
64-
3. **Level 3 (Substrate):** Agent re-architects its entire loop or memory system.
60+
### Core Components
61+
62+
| Component | Purpose |
63+
|-----------|---------|
64+
| `agent-loop.js` | Cognitive cycle (Think → Act → Observe) with circuit breakers |
65+
| `vfs.js` | Browser-native filesystem on IndexedDB |
66+
| `llm-client.js` | Multi-provider LLM abstraction (WebLLM, Ollama, Cloud APIs) |
67+
| `verification-manager.js` | Pre-flight safety checks in sandboxed worker |
68+
| `arena-harness.js` | Competitive selection for code changes |
6569

6670
---
6771

68-
## RSI Examples
72+
## Self-Modification Research
73+
74+
REPLOID is designed to study [recursive self-improvement](https://en.wikipedia.org/wiki/Recursive_self-improvement) (RSI) safely. The agent can modify its own code, but every change is verified, logged, and reversible.
75+
76+
### Modification Levels
6977

70-
### Example 1: Tool Creation (Level 1)
78+
| Level | Description | Safety Gate |
79+
|-------|-------------|-------------|
80+
| **L1: Tools** | Agent creates new tools at runtime | Verification Worker |
81+
| **L2: Meta** | Agent improves its tool-creation mechanism | Arena Mode |
82+
| **L3: Substrate** | Agent modifies core loop or memory | Human Approval (planned) |
83+
84+
### Example: Tool Creation (L1)
7185
**Goal:** "Create a tool that adds two numbers"
7286

7387
```
@@ -86,7 +100,7 @@ graph TD
86100
[Agent] ✓ Goal complete
87101
```
88102

89-
### Example 2: Meta-Tool Creation (Level 2)
103+
### Example: Meta-Tool Creation (L2)
90104
**Goal:** "Build a system that creates tools from descriptions"
91105

92106
```
@@ -114,7 +128,7 @@ graph TD
114128
[Agent] I just created a tool-creating tool! (Level 2 RSI)
115129
```
116130

117-
### Example 3: Substrate Modification (Level 3)
131+
### Example: Substrate Modification (L3)
118132
**Goal:** "Analyze your tool creation process and optimize it"
119133

120134
```
@@ -140,52 +154,48 @@ graph TD
140154

141155
---
142156

143-
## Landscape
157+
## Comparison
144158

145-
Reploid lives in a small but rapidly evolving ecosystem of self-improving agents. We intentionally share compute constraints (browser, IndexedDB) while diverging on safety architecture and ownership.
159+
| Capability | REPLOID | OpenHands | Claude Code | Devin |
160+
|------------|---------|-----------|-------------|-------|
161+
| **Execution** | Browser sandbox | Docker/Linux | Local shell | Cloud SaaS |
162+
| **Rollback** | VFS snapshots | Container reset | Git | N/A |
163+
| **Verification** | Pre-flight checks | None | None | Unknown |
164+
| **Self-modification** | Gated by arena | Unrestricted | N/A | N/A |
165+
| **Offline capable** | Yes (WebLLM) | Yes | Yes | No |
166+
| **Inspectable** | Full source | Full source | Partial | Closed |
146167

147-
### WebLLM (MLC AI)
148-
WebLLM is the inference engine reploid can stand on: deterministic WebGPU execution. It excels at raw token throughput and versioned stability but offers no tools, memory, or self-modification. REPLOID layers VFS, a tool runner, PAWS governance, and substrate/capability boundaries above WebLLM so passive inference becomes an auditable agent capable of planning, testing, and rewriting itself safely.
168+
**REPLOID's niche:** Safe experimentation with self-modifying agents. Not the most powerful agent framework — the most observable and recoverable one.
149169

150-
### OpenHands (formerly OpenDevin)
151-
OpenHands embraces Docker power (shell, compilers, sudo) to tackle arbitrary repos, yet that freedom kills safety—the agent can brick its container with a single bad edit. REPLOID trades GCC for transactional rollback: everything lives inside a browser tab, checkpoints live in IndexedDB, and humans approve cats/dogs diffs before mutations land. We prioritize experimentation accessibility and undo guarantees over unrestricted OS access.
170+
---
152171

153-
### Gödel Agent
154-
Gödel Agent explores theoretical RSI by letting reward functions and logic rewrite themselves. It is fascinating math, but it lacks persistent state management, tooling, or human guardrails, so "reward hacking" is inevitable. REPLOID focuses on engineering: reproducible bundles, hot-reloadable modules, and EventBus-driven UI so observers can inspect every mutation. We sacrifice unconstrained search space for transparency and hands-on controllability.
172+
## Research Questions
155173

156-
### Devin (Cognition)
157-
Devin shows what proprietary, cloud-scale orchestration can deliver: GPT-4-class reasoning, hosted shells, and long-running plans. But it is a black box—you cannot audit, fork, or run Devin offline. REPLOID is the opposite: a glass-box brain stored locally, fully inspectable and modifiable by its owner. We bet that sovereign, user-controlled RSI will outpace closed SaaS once users can watch and influence every self-improvement step.
174+
REPLOID exists to study:
158175

159-
| Feature | REPLOID | OpenHands | Gödel Agent | Devin |
160-
|-----------------------|------------------------|--------------------|-----------------------|----------------|
161-
| Infrastructure | **Browser (WebGPU/IDB)** | Docker/Linux | Python/Research | Cloud SaaS |
162-
| Self-Mod Safety | **High (Worker sandbox + Genesis Kernel)** | Low (root access) | Low (algorithm focus) | N/A (closed) |
163-
| Human Control | **Granular (PAWS review)** | Moderate (Stop btn) | Low (automated) | Moderate (chat)|
164-
| Recovery | **Transactional rollback** | Container reset | Script restart | N/A |
176+
1. **Containment** — Can browser sandboxing provide meaningful safety guarantees for code-writing agents?
177+
2. **Verification** — What static/dynamic checks catch dangerous mutations before execution?
178+
3. **Selection** — Does arena-style competition improve agent outputs vs. single-model generation?
179+
4. **Oversight** — What human-in-the-loop patterns balance safety with agent autonomy?
165180

166-
**Why REPLOID is different:** Explores the "Ship of Theseus" problem in a tab. Capabilities can mutate aggressively, but the substrate remains recoverable thanks to immutable genesis modules and IndexedDB checkpoints.
181+
These are open questions. REPLOID is infrastructure for exploring them, not answers.
167182

168183
---
169184

170-
## Philosophy
171-
172-
Reploid is an experiment in [**substrate-independent RSI**](https://www.edge.org/response-detail/27126):
185+
## Quick Start
173186

174-
- The agent's "brain" is just data in IndexedDB
175-
- The agent can modify this data (its own code)
176-
- The original source code (genesis) is the evolutionary starting point
177-
- Every agent instance can evolve differently
178-
179-
**Analogy:**
180-
- **DNA** = source code on disk (genesis)
181-
- **Organism** = runtime state in IndexedDB (evolved)
182-
- **Mutations** = agent self-modifications
183-
- **Fitness** = agent-measured improvements (faster, better, smarter)
187+
```bash
188+
git clone https://github.com/clocksmith/reploid
189+
cd reploid
190+
npm install
191+
npm start
192+
# Open http://localhost:3000
193+
```
184194

185-
**Key Question:** Can an AI improve itself faster than humans can improve it?
195+
Select a model, enter a goal, click "Awaken Agent."
186196

187197
---
188198

189199
## License
190200

191-
MIT
201+
MIT — Use freely, but read the safety warnings first.

TODO.md

Lines changed: 153 additions & 0 deletions
@@ -0,0 +1,153 @@
1+
# REPLOID Roadmap
2+
3+
> Agent Safety Substrate — browser-native infrastructure for safe AI agent development
4+
5+
---
6+
7+
## Phase 1: Stabilize Core ✓
8+
9+
- [x] VFS with IndexedDB persistence
10+
- [x] Multi-provider LLM client (WebLLM, Ollama, Cloud APIs)
11+
- [x] Agent loop with 50-iteration circuit breaker
12+
- [x] Tool runner with Web Worker sandboxing
13+
- [x] VerificationManager pre-flight checks
14+
- [x] Rate limiting and circuit breakers
15+
- [x] Genesis levels (tabula/minimal/full/cli)
16+
- [x] Streaming response edge cases (buffer flushing at stream end)
17+
- [x] Circuit breaker half-open state (proper recovery testing)
18+
- [x] LLM stream timeout handling (30s between chunks)
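
For readers unfamiliar with the half-open state mentioned above, here is a small sketch of the general pattern. The class name, option names, and defaults are assumptions made for illustration; the roadmap does not show the project's implementation. The clock is injectable so the behavior can be tested without real timers.

```javascript
// Generic circuit breaker: closed -> open after repeated failures,
// open -> half-open after a cool-down, half-open -> closed on one success.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetAfterMs = 1000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
    this.state = 'closed';
    this.failures = 0;
    this.openedAt = 0;
  }

  call(fn) {
    if (this.state === 'open') {
      if (this.now() - this.openedAt < this.resetAfterMs) {
        throw new Error('circuit open');
      }
      this.state = 'half-open'; // cool-down elapsed: allow one trial call
    }
    try {
      const result = fn();
      this.state = 'closed';
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed trial call, or too many failures while closed, opens the circuit.
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

The half-open state is what makes recovery testable: a single trial call decides whether the breaker closes again or waits out another cool-down.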
19+
20+
---
21+
22+
## Phase 2: Safety Infrastructure ✓
23+
24+
### 2.1 Human-in-the-Loop Approval (Opt-in)
25+
26+
Autonomous by default. HITL is opt-in for users who want approval gates.
27+
28+
- [x] Implement HITL controller (`/infrastructure/hitl-controller.js`)
29+
- [x] Module registration with capabilities (APPROVE_CORE_WRITES, etc.)
30+
- [x] Approval queue with callbacks, timeouts, statistics
31+
- [x] Diff viewer for proposed changes (`/ui/components/diff-viewer-ui.js`)
32+
- [x] UI widget for approval queue (`/ui/components/hitl-widget.js`)
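
An approval queue with callbacks, timeouts, and statistics could be structured roughly as below. This is a sketch under stated assumptions: `ApprovalQueue`, its method names, and the decision strings are hypothetical and are not taken from `hitl-controller.js`. Time is passed in explicitly so expiry is deterministic.

```javascript
// Hypothetical approval queue: each request waits for a human decision or a timeout.
class ApprovalQueue {
  constructor({ timeoutMs = 60000 } = {}) {
    this.timeoutMs = timeoutMs;
    this.pending = new Map();
    this.nextId = 1;
    this.stats = { approved: 0, rejected: 0, timedOut: 0 };
  }

  // Queue a proposed change; onDecision(outcome, change) fires exactly once.
  request(change, onDecision, now = Date.now()) {
    const id = this.nextId++;
    this.pending.set(id, { change, onDecision, deadline: now + this.timeoutMs });
    return id;
  }

  // Record a human decision; returns false if the id is unknown or already settled.
  decide(id, approved) {
    const item = this.pending.get(id);
    if (!item) return false;
    this.pending.delete(id);
    const outcome = approved ? 'approved' : 'rejected';
    this.stats[outcome] += 1;
    item.onDecision(outcome, item.change);
    return true;
  }

  // Settle every request whose deadline has passed as timed out.
  expire(now = Date.now()) {
    for (const [id, item] of this.pending) {
      if (now >= item.deadline) {
        this.pending.delete(id);
        this.stats.timedOut += 1;
        item.onDecision('timed-out', item.change);
      }
    }
  }
}
```

In this sketch a timeout is reported as its own outcome rather than as a rejection, which leaves the caller free to decide whether an unanswered request should block or proceed.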
33+
34+
### 2.2 Audit Logging Integration
35+
36+
- [x] Wire AuditLogger into ToolRunner
37+
- [x] Log all tool executions (name, args, duration, success/error)
38+
- [x] Log VFS mutations with before/after byte counts
39+
- [x] Core file writes logged with WARN severity
40+
- [x] Structured audit export (JSON/CSV) via `AuditLogger.exportJSON()` / `exportCSV()`
41+
- [x] Download audit logs via `AuditLogger.download('json')` / `download('csv')`
42+
- [ ] Implement audit replay for debugging
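
To make the logged fields and the JSON/CSV export concrete, the sketch below records tool executions and serializes them. It is deliberately not named `AuditLogger`: the class name `SketchAuditLog`, its methods, and the column set are assumptions, chosen to mirror the fields listed above (name, duration, success).

```javascript
// Hypothetical audit log; field names follow the checklist above, not the real module.
class SketchAuditLog {
  constructor() {
    this.entries = [];
  }

  // Store a shallow copy so later changes to the caller's object do not alter the log.
  record(entry) {
    this.entries.push({ ...entry });
  }

  exportAsJson() {
    return JSON.stringify(this.entries);
  }

  exportAsCsv() {
    const columns = ['tool', 'durationMs', 'success'];
    // Quote a field if it contains a comma, quote, or newline; double embedded quotes.
    const escape = (value) => {
      const text = String(value ?? '');
      return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
    };
    const rows = this.entries.map((e) => columns.map((c) => escape(e[c])).join(','));
    return [columns.join(','), ...rows].join('\n');
  }
}
```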
43+
44+
### 2.3 Arena Mode (Test-Driven Selection)
45+
46+
- [x] VFSSandbox — snapshot/restore isolation
47+
- [x] ArenaCompetitor — competitor definition
48+
- [x] ArenaMetrics — results ranking
49+
- [x] ArenaHarness — competition orchestrator
50+
- [x] Wire arena into ToolRunner for self-mod gating (opt-in via `setArenaGating(true)`)
51+
- [ ] Integration tests for arena harness
52+
- [ ] UI for arena results visualization
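
The core of test-driven selection can be sketched in a few lines. The function below is an illustration of the idea only; `runArena`, the candidate shape, and the test-case shape are assumptions, and the real harness additionally isolates each competitor in a VFS snapshot, which this sketch omits.

```javascript
// Run every candidate against every test case; only a candidate that passes all cases can win.
function runArena(candidates, testCases) {
  const results = candidates.map(({ name, solve }) => {
    let passed = 0;
    for (const { input, expected } of testCases) {
      try {
        if (solve(...input) === expected) passed += 1;
      } catch {
        // A candidate that throws simply fails that case.
      }
    }
    return { name, passed, verified: passed === testCases.length };
  });

  // Rank by number of passing cases, best first.
  results.sort((a, b) => b.passed - a.passed);
  const best = results.find((r) => r.verified);
  return { winner: best ? best.name : null, results };
}
```

Returning `null` when no candidate is fully verified reflects the gating role described above: a proposed change that nobody can verify is not applied.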
53+
54+
---
55+
56+
## Phase 3: Trust Building ✓
57+
58+
### 3.1 Verification Hardening
59+
60+
- [x] Expand VerificationManager patterns (20+ dangerous patterns)
61+
- [x] Pattern-based static analysis for dangerous code
62+
- [x] Capability-based permissions (`/tools/` can only write to `/tools/`, `/apps/`, `/.logs/`)
63+
- [x] Complexity heuristics (warn on large files, many functions)
64+
- [ ] Add cryptographic signing for approved modules
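
The capability rule for `/tools/` can be expressed as a path-prefix check. The sketch below is an assumption about how such a rule might be enforced: `WRITE_CAPABILITIES`, `normalizePath`, and `canWrite` are hypothetical names, only the `/tools/` entry comes from the checklist above, and unknown origins are denied by default as a design choice of this sketch.

```javascript
// Which path prefixes code under a given origin may write to.
// Only the '/tools/' rule is stated in the roadmap; other origins are omitted here.
const WRITE_CAPABILITIES = {
  '/tools/': ['/tools/', '/apps/', '/.logs/'],
};

// Resolve '.' and '..' segments so '/tools/../core/x.js' cannot bypass the prefix check.
function normalizePath(path) {
  const parts = [];
  for (const segment of path.split('/')) {
    if (segment === '' || segment === '.') continue;
    if (segment === '..') parts.pop();
    else parts.push(segment);
  }
  return '/' + parts.join('/');
}

function canWrite(callerPath, targetPath) {
  const caller = normalizePath(callerPath);
  const target = normalizePath(targetPath);
  for (const [origin, allowed] of Object.entries(WRITE_CAPABILITIES)) {
    if (caller.startsWith(origin)) {
      return allowed.some((prefix) => target.startsWith(prefix));
    }
  }
  return false; // no capability entry for this origin: deny by default
}
```

Normalizing before comparing matters: a plain `startsWith` on the raw string would accept `/tools/../core/agent-loop.js` as a write inside `/tools/`.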
65+
66+
### 3.2 Genesis Factory
67+
68+
- [x] Genesis snapshot system (`/infrastructure/genesis-snapshot.js`)
69+
- [x] "Lifeboat" immutable kernel backups (localStorage)
70+
- [x] One-click rollback via `restoreSnapshot()` / `restoreFromLifeboat()`
71+
- [x] Export/import genesis bundles
72+
73+
### 3.3 Observability
74+
75+
- [x] Real-time mutation stream (`Observability.recordMutation()`)
76+
- [x] Agent decision trace (`Observability.recordDecision()`)
77+
- [x] Token usage and cost tracking with per-model breakdown
78+
- [x] Performance metrics (LLM latency, tool latency, error rate)
79+
- [x] Full dashboard via `Observability.getDashboard()`
80+
81+
---
82+
83+
## Phase 4: External Validation
84+
85+
- [ ] Security audit of sandbox boundaries
86+
- [ ] Publish safety primitives as standalone library
87+
- [ ] Academic paper on browser-native agent containment
88+
- [ ] Compliance documentation (SOC2-style controls)
89+
90+
---
91+
92+
## Optional: Moonshots
93+
94+
These are high-value but high-effort. Pursue only after Phase 3.
95+
96+
### Policy Engine
97+
98+
- [ ] Upgrade RuleEngine from stub to real policy enforcement
99+
- [ ] Define declarative safety policies (e.g., "no network calls from tools")
100+
- [ ] Runtime policy violation detection
101+
102+
### Formal Verification
103+
104+
- [ ] Type-level guarantees for tool outputs
105+
- [ ] Proof-carrying code for self-modifications
106+
- [ ] Invariant checking across mutations
107+
108+
### Multi-Agent Coordination
109+
110+
- [ ] Swarm orchestration (`blueprints/0x000034-swarm-orchestration.md`)
111+
- [ ] Cross-tab coordination (`blueprints/0x00003A-tab-coordination.md`)
112+
- [ ] Consensus protocols for distributed agents
113+
114+
### WebRTC P2P
115+
116+
- [ ] Peer-to-peer agent communication (`blueprints/0x00003E-webrtc-swarm-transport.md`)
117+
- [ ] Distributed VFS sync
118+
- [ ] Federated learning primitives
119+
120+
---
121+
122+
## Not Planned
123+
124+
These are explicitly out of scope:
125+
126+
- **Docker/OS access** — Browser sandbox is the security boundary
127+
- **Unrestricted self-modification** — Always gated by verification
128+
- **Autonomous deployment** — Human approval required for production changes
129+
130+
---
131+
132+
## Metrics for Success
133+
134+
| Metric | Target | Current |
135+
|--------|--------|---------|
136+
| Core module test coverage | >80% | ~40% |
137+
| Mean time to recovery (bad mutation) | <5s | ~30s |
138+
| HITL adoption (users who opt-in) | tracked | ready |
139+
| Audit log completeness | 100% | ~95% |
140+
| Arena pass rate (self-mod gating) | >90% | ready |
141+
142+
---
143+
144+
## Timeline Estimate
145+
146+
No dates — these are sequenced priorities:
147+
148+
1. **Phase 1** — stabilization ✓
149+
2. **Phase 2** — safety infrastructure ✓
150+
3. **Phase 3** — trust building ✓
151+
4. **Phase 4** — validation (current)
152+
153+
Fund with existing revenue. No external pressure on timelines.
