Commit 4f381b0

author
X
committed
docs: Simplify README for technical audience
Remove specific file path references while keeping technical architecture and concepts. Focus on how the system works rather than implementation details.
1 parent e88ef8e commit 4f381b0

1 file changed

+55
-127
lines changed

README.md

Lines changed: 55 additions & 127 deletions
Original file line number | Diff line number | Diff line change
@@ -9,15 +9,13 @@ A containment environment for AI agents that can write and execute code. Built f
99
| Mode | How to Run | Notes |
1010
|------|------------|-------|
1111
| **Fully local** | `npm install && npm run dev` | No network needed after hydration. Uses local models (Ollama, WebLLM). |
12-
| **Hosted UI (https://replo.id)** | Open the site directly | Use WebLLM, cloud API keys, or point the UI at your **local** proxy by enabling CORS on that proxy. |
13-
| **Hybrid** | Hosted UI + local proxy | Run `npm start` locally, set `CORS_ORIGINS="https://replo.id,https://your-domain.example"` (or configure `server.corsOrigins`), then connect the hosted UI to `http://127.0.0.1:8000`. |
12+
| **Hosted UI (https://replo.id)** | Open the site directly | Use WebLLM, cloud API keys, or connect to your local proxy. |
13+
| **Hybrid** | Hosted UI + local proxy | Run `npm start` locally with CORS configured, then connect the hosted UI. |
1414

15-
In all modes, the agent still runs in your browser. The proxy is only needed if you want to access local model servers (e.g., Ollama) from the hosted UI or route cloud API calls through your own machine.
15+
In all modes, the agent runs in your browser. The proxy is only needed to access local model servers from the hosted UI or route cloud API calls through your machine.
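The hybrid mode relies on the local proxy accepting requests only from origins you list. As a minimal sketch, assuming the proxy reads a comma-separated `CORS_ORIGINS` value like the one shown in the removed table row, the allowlist check could look like this. The function names `parseCorsOrigins` and `isOriginAllowed` are hypothetical and not taken from the REPLOID source.

```javascript
// Hypothetical helpers: illustrate an origin allowlist, not REPLOID's actual proxy code.

// Split a comma-separated CORS_ORIGINS string into a clean list of origins.
function parseCorsOrigins(value) {
  return (value || "")
    .split(",")
    .map((origin) => origin.trim())
    .filter(Boolean);
}

// Exact-match check: an origin is allowed only if it appears in the list.
function isOriginAllowed(origin, allowedOrigins) {
  return allowedOrigins.includes(origin);
}

const allowedOrigins = parseCorsOrigins("https://replo.id, https://your-domain.example");
const hostedUiAllowed = isOriginAllowed("https://replo.id", allowedOrigins);
const strangerAllowed = isOriginAllowed("https://evil.example", allowedOrigins);
```

Exact string matching is the conservative choice here: it avoids accidentally allowing look-alike origins that a prefix or substring match would let through.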
1616

1717
---
1818

19-
See [TODO.md](TODO.md) for roadmap | [AGENTS.md](AGENTS.md) for agent profile
20-
2119
## Why REPLOID?
2220

2321
AI agents that write code are powerful but dangerous. Most frameworks give agents unrestricted filesystem access, shell execution, or Docker root — then hope nothing goes wrong.
@@ -30,10 +28,6 @@ REPLOID takes a different approach: **everything runs in a browser sandbox** wit
3028
- **Self-modification gating** — Test proposed code changes before committing them
3129
- **Alignment prototyping** — Experiment with oversight patterns before deploying to production
3230

33-
## How It Works
34-
35-
The agent operates on a Virtual File System (VFS) backed by IndexedDB. It can read, write, and execute code — but only within the sandbox. All mutations pass through a verification layer that checks for syntax errors, dangerous patterns, and policy violations.
36-
3731
## Architecture
3832

3933
```mermaid
@@ -55,106 +49,64 @@ graph TD
5549
end
5650
```
5751

58-
### Safety First
59-
60-
1. **Genesis Snapshot at Boot**: Full VFS snapshot captured immediately after hydration, before any user action. Enables offline rollback to pristine state—no network required for recovery.
52+
### How It Works
6153

62-
2. **Verification Manager**: All code changes pass through pre-flight checks in an isolated Web Worker. Catches syntax errors, infinite loops, `eval()`, and other dangerous patterns before they reach the VFS.
54+
The agent operates on a **Virtual File System (VFS)** backed by IndexedDB. It can read, write, and execute code — but only within the sandbox. All mutations pass through a verification layer that checks for syntax errors, dangerous patterns, and policy violations.
6355

64-
3. **VFS Snapshots**: Transactional rollback. Capture state before mutations, restore if verification fails. No permanent damage from bad agent decisions.
56+
**Core execution loop:**
57+
1. **Think** — Agent analyzes context and decides next action
58+
2. **Act** — Tool call executed against VFS
59+
3. **Observe** — Results captured and fed back to agent
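
The three steps above, combined with the iteration cap, can be sketched as a bounded loop. This is an illustrative sketch only: `runAgentLoop`, `think`, and `act` are hypothetical names, and the only detail taken from the README is the default cap of 50 iterations.

```javascript
// Hypothetical sketch of a Think/Act/Observe loop with an iteration cap.
const MAX_ITERATIONS = 50; // default cap stated in the README

function runAgentLoop(goal, think, act, maxIterations = MAX_ITERATIONS) {
  const history = [];
  for (let i = 0; i < maxIterations; i++) {
    // Think: decide the next action from the goal and prior observations.
    const action = think(goal, history);
    if (action.done) {
      return { status: "complete", iterations: i, history };
    }
    // Act: execute the tool call (against the VFS in the real system).
    const result = act(action);
    // Observe: record the result so the next Think step can see it.
    history.push({ action, result });
  }
  // Circuit breaker: stop instead of looping forever.
  return { status: "circuit_open", iterations: maxIterations, history };
}

// Usage: one agent that finishes after three actions, one that never finishes.
const counterThink = (goal, history) =>
  history.length >= 3 ? { done: true } : { tool: "WriteFile", args: { n: history.length } };
const echoAct = (action) => `ok:${action.args.n}`;

const finished = runAgentLoop("demo", counterThink, echoAct);
const runaway = runAgentLoop("demo", () => ({ tool: "Noop", args: {} }), () => "ok");
```

The second call never signals completion, so the loop ends with a `circuit_open` status once the cap is reached.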
6560

66-
4. **Arena Mode**: Test-driven selection for self-modifications. Multiple candidates compete, only verified solutions win. Located in `/testing/arena/`.
61+
**Key subsystems:**
62+
- **Agent Loop** — Cognitive cycle with circuit breakers (default: 50 iterations max)
63+
- **Virtual File System** — Browser-native filesystem on IndexedDB with snapshot/restore
64+
- **LLM Client** — Multi-provider abstraction (WebLLM, Ollama, OpenAI, Anthropic, Google, Groq)
65+
- **Worker Manager** — Multi-worker orchestration with permission tiers
66+
- **Tool Runner** — Dynamic tool loading with arena gating for self-modifications
67+
- **Verification Manager** — Pre-flight safety checks in isolated Web Worker
6768

68-
5. **Circuit Breakers**: Rate limiting and iteration caps (default: 50 cycles) prevent runaway agents. Automatic recovery on failure.
69+
### Safety Mechanisms
6970

70-
6. **Audit Logging**: Every tool call, VFS mutation, and agent decision is logged. Full replay capability for debugging and analysis.
71+
1. **Genesis Snapshot** — Full VFS snapshot captured at boot, before any user action. Enables offline rollback to pristine state.
7172

72-
7. **Service Worker Module Loader**: All ES6 imports intercepted and served from VFS (IndexedDB). Once hydrated, the agent runs entirely offline. Entry points (`boot.js`, `index.html`) stay on network for clean genesis boundaries.
73+
2. **Pre-flight Verification** — All code changes pass through isolated Web Worker. Catches syntax errors, infinite loops, `eval()`, and dangerous patterns before reaching VFS.
7374

74-
8. **Genesis Diff Visualization**: Color-coded comparison showing all changes from initial state (green = added, yellow = modified, red = deleted). Instant visibility into what the agent has modified.
75+
3. **Transactional Rollback** — VFS snapshots before mutations, restores on verification failure. No permanent damage from bad agent decisions.
7576

76-
### Core Components
77+
4. **Arena Mode** — Test-driven selection for self-modifications. Multiple candidates compete, only verified solutions win.
7778

78-
| Component | Purpose |
79-
|-----------|---------|
80-
| `agent-loop.js` | Cognitive cycle (Think → Act → Observe) with circuit breakers |
81-
| `vfs.js` | Browser-native filesystem on IndexedDB |
82-
| `llm-client.js` | Multi-provider LLM abstraction (WebLLM, Ollama, Cloud APIs) |
83-
| `worker-manager.js` | Multi-worker orchestration with permission tiers |
84-
| `tool-runner.js` | Dynamic tool loading and execution with arena gating |
85-
| `verification-manager.js` | Pre-flight safety checks in sandboxed worker |
86-
| `persona-manager.js` | System prompt customization per genesis level |
87-
| `arena-harness.js` | Competitive selection for code changes |
79+
5. **Circuit Breakers** — Rate limiting and iteration caps prevent runaway agents. Automatic recovery on failure.
8880

89-
### Proto UI
81+
6. **Audit Logging** — Every tool call, VFS mutation, and agent decision logged. Full replay capability.
9082

91-
The Proto interface (`ui/proto.js`) provides full observability:
83+
7. **Service Worker Isolation** — All ES6 imports intercepted and served from VFS. Once hydrated, the agent runs entirely offline.
9284

93-
| Tab | Purpose |
94-
|-----|---------|
95-
| **History** | LLM responses, tool calls, streaming output |
96-
| **Reflections** | Agent learning entries with success/error status |
97-
| **Status** | Agent state, token usage, error log |
98-
| **Workers** | Active/completed workers, per-worker logs |
99-
| **Debug** | System prompt, conversation context, model config |
100-
101-
Additional features: VFS browser with diff/preview, command palette (Ctrl+K), Genesis snapshot management.
85+
8. **Genesis Diff Visualization** — Color-coded comparison showing all changes from initial state (green=added, yellow=modified, red=deleted).
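
Items 2, 3, and 8 describe a snapshot, verify, restore cycle plus a diff against the boot state. The sketch below illustrates that idea with an in-memory map standing in for the IndexedDB-backed VFS. Every name in it (`MemoryVfs`, `verify`, `transactionalWrite`, `diffFromGenesis`) is hypothetical, and the verifier is a toy that only rejects `eval()` calls and code that fails to parse as a function body.

```javascript
// Hypothetical in-memory stand-in for the IndexedDB-backed VFS.
class MemoryVfs {
  constructor(files = {}) {
    this.files = new Map(Object.entries(files));
  }
  snapshot() {
    return new Map(this.files); // shallow copy is enough for string contents
  }
  restore(snap) {
    this.files = new Map(snap);
  }
  write(path, content) {
    this.files.set(path, content);
  }
}

// Toy verifier: rejects eval() calls and code that does not parse.
function verify(code) {
  if (/\beval\s*\(/.test(code)) {
    return { ok: false, reason: "eval() is not allowed" };
  }
  try {
    new Function(code); // parses the code as a function body without running it
  } catch (err) {
    return { ok: false, reason: `syntax error: ${err.message}` };
  }
  return { ok: true };
}

// Snapshot before the mutation, restore if verification fails.
function transactionalWrite(vfs, path, code) {
  const before = vfs.snapshot();
  vfs.write(path, code);
  const verdict = verify(code);
  if (!verdict.ok) vfs.restore(before);
  return verdict;
}

// Classify every path as added, modified, or deleted relative to genesis.
function diffFromGenesis(genesis, current) {
  const diff = { added: [], modified: [], deleted: [] };
  for (const [path, content] of current) {
    if (!genesis.has(path)) diff.added.push(path);
    else if (genesis.get(path) !== content) diff.modified.push(path);
  }
  for (const path of genesis.keys()) {
    if (!current.has(path)) diff.deleted.push(path);
  }
  return diff;
}

const vfs = new MemoryVfs({ "/tools/a.js": "return 1;", "/tools/b.js": "return 2;" });
const genesis = vfs.snapshot(); // captured before any mutation
const good = transactionalWrite(vfs, "/tools/c.js", "return 1 + 2;");
const bad = transactionalWrite(vfs, "/tools/a.js", "return eval('1');");
vfs.write("/tools/b.js", "return 3;");
const changes = diffFromGenesis(genesis, vfs.files);
```

The rejected write leaves `/tools/a.js` untouched, so the diff reports only the accepted new file and the direct edit to `/tools/b.js`.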
10286

10387
### Multi-Worker Orchestration
10488

105-
The WorkerManager enables parallel task execution through permission-filtered subagents:
89+
The system enables parallel task execution through permission-filtered subagents:
10690

10791
| Worker Type | Permissions | Use Case |
10892
|-------------|-------------|----------|
109-
| **explore** | Read-only (ReadFile, ListFiles, Grep, Find) | Codebase analysis |
93+
| **explore** | Read-only | Codebase analysis |
11094
| **analyze** | Read + JSON tools | Data processing |
11195
| **execute** | Full tool access | Task execution |
11296

113-
**Model Roles:** Each worker can use a different model role (orchestrator, fast, code, local) for cost optimization.
114-
115-
**Worker Tools:**
116-
- `SpawnWorker` — Create a new worker with type, task, and optional model role
117-
- `ListWorkers` — View active and completed workers
118-
- `AwaitWorkers` — Wait for specific workers or all to complete
119-
120-
Workers run in a flat hierarchy (no worker can spawn workers) and all actions flow through the same audit pipeline.
121-
122-
### Available Tools
97+
Each worker can use a different model role (orchestrator, fast, code, local) for cost optimization. Workers run in a flat hierarchy (no nested spawning) and all actions flow through the audit pipeline.
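
Permission filtering with a flat hierarchy can be pictured as a lookup from worker type to an allowed tool set. The sketch below is an assumption-laden illustration: the read-only tool names come from the row removed in this diff, treating `Jq` as the "JSON tool" for `analyze` is a guess, and `canUse` and `filterTools` are hypothetical helpers.

```javascript
// Hypothetical permission table; the per-tier grouping is an assumption.
const READ_TOOLS = ["ReadFile", "ListFiles", "Grep", "Find"];
const WORKER_PERMISSIONS = new Map([
  ["explore", new Set(READ_TOOLS)],
  ["analyze", new Set([...READ_TOOLS, "Jq"])],
  ["execute", "all"], // full tool access
]);

function canUse(workerType, toolName) {
  if (!WORKER_PERMISSIONS.has(workerType)) {
    throw new Error(`unknown worker type: ${workerType}`);
  }
  // Flat hierarchy: no worker may spawn another worker, whatever its tier.
  if (toolName === "SpawnWorker") return false;
  const allowed = WORKER_PERMISSIONS.get(workerType);
  return allowed === "all" || allowed.has(toolName);
}

// Reduce the full tool list to what a given worker type may call.
function filterTools(workerType, toolNames) {
  return toolNames.filter((name) => canUse(workerType, name));
}

const allTools = ["ReadFile", "WriteFile", "Grep", "Jq", "SpawnWorker"];
const exploreTools = filterTools("explore", allTools);
const analyzeTools = filterTools("analyze", allTools);
const executeTools = filterTools("execute", allTools);
```

Filtering the tool list before handing it to a worker means a denied tool is never visible to that worker, rather than being rejected at call time.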
12398

124-
**All tools are dynamic** — loaded from `/tools/` at boot. No hardcoded tools means full RSI capability: the agent can modify any tool, including core file operations. All tool names use CamelCase (e.g., ReadFile, Grep, CreateTool) to keep the interface consistent.
99+
### Tool System
125100

126-
**Core VFS Operations:**
127-
- `ReadFile`, `WriteFile`, `ListFiles`, `DeleteFile` — VFS operations with audit logging
101+
All tools are **dynamically loaded** at boot. No hardcoded tools means full RSI capability: the agent can modify any tool, including core file operations.
128102

129-
**Meta-Tools (RSI):**
130-
- `CreateTool` — Dynamic tool creation at runtime (L1 RSI)
131-
- `LoadModule` — Hot-reload modules from VFS
132-
- `ListTools` — Discover available tools
133-
- `Edit` — Apply literal match/replacement edits to files
103+
**Tool categories:**
104+
- **VFS Operations** — Read, write, list, delete files with audit logging
105+
- **Meta-Tools (RSI)** — Create new tools at runtime, hot-reload modules
106+
- **Worker Tools** — Spawn subagents, list/await workers
107+
- **Utilities** — Grep, find, sed, jq, git (VFS-scoped shim)
134108

135-
**Worker Tools:**
136-
- `SpawnWorker` — Spawn permission-filtered subagent
137-
- `ListWorkers` — List active/completed workers
138-
- `AwaitWorkers` — Wait for worker completion
139-
140-
**Utilities:**
141-
- `FileOutline` — Analyze file structure without reading content
142-
- `Cat`, `Head`, `Tail`, `Ls`, `Pwd`, `Touch` — Familiar filesystem navigation primitives
143-
- `Grep`, `Find`, `Sed`, `Jq` — Search, filter, and transform file contents
144-
- `Git` — Version control operations (VFS-scoped shim)
145-
- `Mkdir`, `Rm`, `Mv`, `Cp` — File management
146-
147-
All tools operate within the VFS sandbox with no access to host filesystem. Tools receive a `deps` object with VFS, EventBus, ToolWriter, WorkerManager, and other modules for full capability.
148-
149-
---
150-
151-
## Why JavaScript, Not TypeScript?
152-
153-
REPLOID is pure JavaScript because the agent generates, modifies, and executes code at runtime—entirely in the browser. TypeScript requires compilation, but there's no Node.js or build toolchain in-browser.
154-
155-
When the agent writes a new tool to the VFS, the Service Worker immediately serves it as an ES module. No compilation step, no latency. TypeScript would require bundling a 10MB+ compiler or maintaining separate source/artifact trees—defeating the self-modification model.
156-
157-
Runtime safety comes from verification (syntax checks, sandboxed execution, arena testing), not static types. The `[SW]` logs show this: modules loading from VFS, no build step.
109+
All tools operate within the VFS sandbox with no access to host filesystem.
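
Because no tool is hardcoded, a tool is just a replaceable entry looked up by name at call time. The following is a minimal sketch of that idea with an audit trail. `ToolRegistry` is a hypothetical class, and the real system loads tools as ES modules from the VFS rather than as in-memory functions. The `AddNumbers` example mirrors the L1 example from the README.

```javascript
// Hypothetical registry: tools are plain entries that can be replaced at runtime.
class ToolRegistry {
  constructor() {
    this.tools = new Map();
    this.auditLog = [];
  }

  // Registering an existing name replaces the tool (self-modification).
  register(name, fn) {
    const replaced = this.tools.has(name);
    this.tools.set(name, fn);
    this.auditLog.push({ event: replaced ? "replace" : "create", name });
  }

  // Every call is looked up dynamically and recorded in the audit log.
  run(name, args) {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    const result = tool(args);
    this.auditLog.push({ event: "call", name, args, result });
    return result;
  }
}

const registry = new ToolRegistry();
registry.register("AddNumbers", ({ a, b }) => a + b);
const sum = registry.run("AddNumbers", { a: 5, b: 3 });

// The agent later swaps in a version that coerces string inputs.
registry.register("AddNumbers", ({ a, b }) => Number(a) + Number(b));
const coerced = registry.run("AddNumbers", { a: "5", b: 3 });
```

Looking the tool up on every call is what makes a replacement take effect immediately, without restarting the loop.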
158110

159111
---
160112

@@ -170,34 +122,11 @@ REPLOID is designed to study [recursive self-improvement](https://en.wikipedia.o
170122
| **L2: Meta** | Agent improves its tool-creation mechanism | Arena Mode |
171123
| **L3: Substrate** | Agent modifies core loop or memory | Human Approval (planned) |
172124

173-
### Example: Tool Creation (L1)
174-
**Goal:** "Create a tool that adds two numbers"
175-
176-
```
177-
[Agent] Iteration 1: Analyzing goal...
178-
[Agent] I need a tool to add numbers
179-
[TOOL_CALL] CreateTool
180-
[ARGS] {
181-
name: "AddNumbers",
182-
code: "export default async function({a, b}) { return a + b; }"
183-
}
184-
[TOOL_RESULT] ✓ Tool created: /tools/AddNumbers.js
185-
[Agent] Testing new tool...
186-
[TOOL_CALL] AddNumbers
187-
[ARGS] {a: 5, b: 3}
188-
[TOOL_RESULT] 8
189-
[Agent] ✓ Goal complete
190-
```
125+
**L1 Example:** Agent creates an "AddNumbers" tool, writes it to VFS, tests it, confirms it works.
191126

192-
### Example: Meta-Tool Creation (L2)
193-
**Goal:** "Build a system that creates tools from descriptions"
127+
**L2 Example:** Agent creates a "CreateToolFromDescription" tool that uses the LLM to generate code, then persists via the tool-creation mechanism. A tool that makes tools.
194128

195-
Agent creates `CreateToolFromDescription` which calls the LLM to generate code, then calls `CreateTool` to persist it. A tool that makes tools.
196-
197-
### Example: Substrate Modification (L3)
198-
**Goal:** "Optimize your tool creation process"
199-
200-
Agent reads `/core/tool-writer.js`, identifies a bottleneck, writes an improved version with `WriteFile`, and hot-reloads via `LoadModule`. Self-modification of core infrastructure.
129+
**L3 Example:** Agent reads its own core modules, identifies a bottleneck, writes an improved version, and hot-reloads it. Self-modification of core infrastructure.
201130

202131
---
203132

@@ -213,9 +142,18 @@ Agent reads `/core/tool-writer.js`, identifies a bottleneck, writes an improved
213142
| **Offline capable** | Yes (WebLLM) | Yes | Yes | No |
214143
| **Multi-model** | 6+ providers | Limited | Claude only | Unknown |
215144
| **Subagents** | Worker tiers | N/A | Task tool | Unknown |
216-
| **Inspectable** | Full source | Full source | Partial | Closed |
217145

218-
**REPLOID's niche:** Safe experimentation with self-modifying agents. Not the most powerful agent framework — the most observable and recoverable one. Unique advantages: multi-model orchestration, browser-native local models (WebLLM), and permission-tiered worker subagents.
146+
**REPLOID's niche:** Safe experimentation with self-modifying agents. Not the most powerful agent framework — the most observable and recoverable one.
147+
148+
---
149+
150+
## Why JavaScript?
151+
152+
REPLOID is pure JavaScript because the agent generates, modifies, and executes code at runtime — entirely in the browser. TypeScript requires compilation, but there's no build toolchain in-browser.
153+
154+
When the agent writes a new tool to the VFS, the Service Worker immediately serves it as an ES module. No compilation step, no latency. TypeScript would require bundling a 10MB+ compiler or maintaining separate source/artifact trees — defeating the self-modification model.
155+
156+
Runtime safety comes from verification (syntax checks, sandboxed execution, arena testing), not static types.
219157

220158
---
221159

@@ -246,24 +184,14 @@ npm run dev
246184

247185
REPLOID offers 3 genesis configurations (selectable at boot):
248186

249-
| Level | Modules | Description |
250-
|-------|---------|-------------|
251-
| **TABULA RASA** | 13 | Minimal agent core — fast boot, smallest surface |
252-
| **REFLECTION** | 19 | + Self-awareness, streaming, verification, HITL |
253-
| **FULL SUBSTRATE** | 32 | + Cognition, semantic memory, arena testing |
187+
| Level | Description |
188+
|-------|-------------|
189+
| **TABULA RASA** | Minimal agent core — fast boot, smallest surface |
190+
| **REFLECTION** | + Self-awareness, streaming, verification, HITL |
191+
| **FULL SUBSTRATE** | + Cognition, semantic memory, arena testing |
254192

255193
Select "FULL SUBSTRATE" for RSI experiments with maximum capability.
256194

257-
**Example Goals:**
258-
- "Create a recursive tool chain: a tool that builds tools that enhance tools"
259-
- "Analyze your source code in /core and identify bottlenecks"
260-
- "Build a tool that generates test cases from function signatures"
261-
262-
The VFS Explorer (right panel) provides:
263-
- **Preview (▶)** - Execute HTML/CSS/JS files in sandboxed iframe
264-
- **Diff (⊟)** - Compare current VFS to genesis state
265-
- **Snapshots (◷)** - Timeline of all saved states with restore capability
266-
267195
---
268196

269197
## License
