Remove specific file path references while keeping technical architecture
and concepts. Focus on how the system works rather than implementation details.
@@ -9,15 +9,13 @@ A containment environment for AI agents that can write and execute code. Built f
9
9
| Mode | How to Run | Notes |
10
10
|------|------------|-------|
11
11
|**Fully local**|`npm install && npm run dev`| No network needed after hydration. Uses local models (Ollama, WebLLM). |
12
-
|**Hosted UI (https://replo.id)**| Open the site directly | Use WebLLM, cloud API keys, or point the UI at your **local** proxy by enabling CORS on that proxy. |
13
-
|**Hybrid**| Hosted UI + local proxy | Run `npm start` locally, set `CORS_ORIGINS="https://replo.id,https://your-domain.example"` (or configure `server.corsOrigins`), then connect the hosted UI to `http://127.0.0.1:8000`. |
12
+
|**Hosted UI (https://replo.id)**| Open the site directly | Use WebLLM, cloud API keys, or connect to your local proxy. |
13
+
|**Hybrid**| Hosted UI + local proxy | Run `npm start` locally with CORS configured, then connect the hosted UI. |
14
14
15
-
In all modes, the agent still runs in your browser. The proxy is only needed if you want to access local model servers (e.g., Ollama) from the hosted UI or route cloud API calls through your own machine.
15
+
In all modes, the agent runs in your browser. The proxy is only needed to access local model servers from the hosted UI or route cloud API calls through your machine.
16
16
17
17
---
18
18
19
-
See [TODO.md](TODO.md) for roadmap | [AGENTS.md](AGENTS.md) for agent profile
20
-
21
19
## Why REPLOID?
22
20
23
21
AI agents that write code are powerful but dangerous. Most frameworks give agents unrestricted filesystem access, shell execution, or Docker root — then hope nothing goes wrong.
@@ -30,10 +28,6 @@ REPLOID takes a different approach: **everything runs in a browser sandbox** wit
30
28
- **Self-modification gating**: Test proposed code changes before committing them
31
29
- **Alignment prototyping**: Experiment with oversight patterns before deploying to production
32
30
33
-
## How It Works
34
-
35
-
The agent operates on a Virtual File System (VFS) backed by IndexedDB. It can read, write, and execute code — but only within the sandbox. All mutations pass through a verification layer that checks for syntax errors, dangerous patterns, and policy violations.
36
-
37
31
## Architecture
38
32
39
33
```mermaid
@@ -55,106 +49,64 @@ graph TD
55
49
end
56
50
```
57
51
58
-
### Safety First
59
-
60
-
1. **Genesis Snapshot at Boot**: Full VFS snapshot captured immediately after hydration, before any user action. Enables offline rollback to pristine state, with no network required for recovery.
52
+
### How It Works
61
53
62
-
2. **Verification Manager**: All code changes pass through pre-flight checks in an isolated Web Worker. Catches syntax errors, infinite loops, `eval()`, and other dangerous patterns before they reach the VFS.
54
+
The agent operates on a **Virtual File System (VFS)** backed by IndexedDB. It can read, write, and execute code — but only within the sandbox. All mutations pass through a verification layer that checks for syntax errors, dangerous patterns, and policy violations.
63
55
64
-
3. **VFS Snapshots**: Transactional rollback. Capture state before mutations, restore if verification fails. No permanent damage from bad agent decisions.
56
+
**Core execution loop:**
57
+
1. **Think**: Agent analyzes context and decides the next action
58
+
2. **Act**: Tool call is executed against the VFS
59
+
3. **Observe**: Results are captured and fed back to the agent
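
The Think, Act, Observe cycle can be sketched as a small loop. This is a minimal illustration under stated assumptions, not REPLOID's implementation: `runAgentLoop`, the scripted `think` function, and the in-memory `Map` standing in for the VFS are all hypothetical. The tool names `ReadFile` and `WriteFile` and the 50-cycle default are taken from the earlier README text shown in this diff.

```javascript
// Hypothetical sketch: function and field names are illustrative, not REPLOID's API.
function runAgentLoop({ think, tools, maxCycles = 50 }) {
  const history = []; // observations fed back into the next "think" step
  for (let cycle = 0; cycle < maxCycles; cycle++) {
    // Think: decide the next action from everything observed so far.
    const action = think(history);
    if (action.done) return { status: "done", cycles: cycle, history };

    // Act: run the chosen tool; a failing tool becomes an observation, not a crash.
    let result;
    try {
      const tool = tools[action.tool];
      if (!tool) throw new Error(`Unknown tool: ${action.tool}`);
      result = { ok: true, value: tool(action.args) };
    } catch (err) {
      result = { ok: false, error: err.message };
    }

    // Observe: record the outcome for the next cycle.
    history.push({ action, result });
  }
  // Cycle cap reached: stop instead of letting the agent run away.
  return { status: "cycle-limit", cycles: maxCycles, history };
}

// Usage with a scripted "agent" and an in-memory Map standing in for the VFS.
const vfs = new Map();
const tools = {
  WriteFile: ({ path, content }) => { vfs.set(path, content); return path; },
  ReadFile: ({ path }) => {
    if (!vfs.has(path)) throw new Error(`No such file: ${path}`);
    return vfs.get(path);
  },
};
const script = [
  { tool: "WriteFile", args: { path: "/notes.txt", content: "hello" } },
  { tool: "ReadFile", args: { path: "/notes.txt" } },
  { done: true },
];
const outcome = runAgentLoop({ think: (history) => script[history.length], tools });
console.log(outcome.status, outcome.cycles);
```

Turning tool failures into observations rather than exceptions lets the agent see its own errors on the next cycle, and the cycle cap bounds how long a confused agent can keep going.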
65
60
66
-
4. **Arena Mode**: Test-driven selection for self-modifications. Multiple candidates compete, only verified solutions win. Located in `/testing/arena/`.
- **Worker Manager**: Multi-worker orchestration with permission tiers
66
+
- **Tool Runner**: Dynamic tool loading with arena gating for self-modifications
67
+
- **Verification Manager**: Pre-flight safety checks in an isolated Web Worker
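
The arena gating mentioned for the Tool Runner can be pictured as test-driven selection among candidate implementations. The sketch below is an assumption-laden illustration: `runArena`, the candidate shape `{ name, impl }`, and the test shape `{ args, expected }` are hypothetical, and real candidates would be code in the VFS rather than in-process functions.

```javascript
// Hypothetical sketch: candidates are plain functions and the test format is assumed.
function runArena(candidates, tests) {
  const results = candidates.map(({ name, impl }) => {
    let passed = 0;
    for (const { args, expected } of tests) {
      try {
        if (impl(...args) === expected) passed++;
      } catch {
        // A throwing candidate simply fails that test.
      }
    }
    return { name, passed, verified: passed === tests.length };
  });
  // Only fully verified candidates are eligible; the first one wins.
  const winner = results.find((r) => r.verified) ?? null;
  return { winner, results };
}

// Three competing versions of an "AddNumbers" tool.
const addNumbersTests = [
  { args: [1, 2], expected: 3 },
  { args: [-4, 4], expected: 0 },
];
const arena = runArena(
  [
    { name: "subtracts", impl: (a, b) => a - b },
    { name: "throws", impl: () => { throw new Error("boom"); } },
    { name: "adds", impl: (a, b) => a + b },
  ],
  addNumbersTests
);
console.log(arena.winner ? arena.winner.name : "no winner");
```

If no candidate passes every test, `winner` is `null` and nothing is committed, which is the point of gating.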
67
68
68
-
5. **Circuit Breakers**: Rate limiting and iteration caps (default: 50 cycles) prevent runaway agents. Automatic recovery on failure.
69
+
### Safety Mechanisms
69
70
70
-
6. **Audit Logging**: Every tool call, VFS mutation, and agent decision is logged. Full replay capability for debugging and analysis.
71
+
1. **Genesis Snapshot**: Full VFS snapshot captured at boot, before any user action. Enables offline rollback to a pristine state.
71
72
72
-
7. **Service Worker Module Loader**: All ES6 imports intercepted and served from VFS (IndexedDB). Once hydrated, the agent runs entirely offline. Entry points (`boot.js`, `index.html`) stay on network for clean genesis boundaries.
73
+
2. **Pre-flight Verification**: All code changes pass through an isolated Web Worker, which catches syntax errors, infinite loops, `eval()`, and other dangerous patterns before they reach the VFS.
73
74
74
-
8. **Genesis Diff Visualization**: Color-coded comparison showing all changes from initial state (green = added, yellow = modified, red = deleted). Instant visibility into what the agent has modified.
75
+
3. **Transactional Rollback**: The VFS is snapshotted before each mutation and restored if verification fails. No permanent damage from bad agent decisions.
75
76
76
-
### Core Components
77
+
4. **Arena Mode**: Test-driven selection for self-modifications. Multiple candidates compete, and only verified solutions win.
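
The snapshot, verify, and rollback mechanisms listed here can be illustrated together. This is a rough sketch under assumptions: a `Map` stands in for the IndexedDB-backed VFS, verification runs inline rather than in a Web Worker, and `BLOCKED_PATTERNS`, `verify`, `takeSnapshot`, `restore`, and `writeWithRollback` are hypothetical names. The file names under `/tools/` are also made up for the example.

```javascript
// Hypothetical sketch: a Map stands in for the IndexedDB-backed VFS, and the
// pattern list is an assumed policy, not REPLOID's actual rule set.
const BLOCKED_PATTERNS = [/\beval\s*\(/, /\bwhile\s*\(\s*true\s*\)/];

// Pre-flight check: reject code containing any blocked pattern.
function verify(code) {
  const hit = BLOCKED_PATTERNS.find((pattern) => pattern.test(code));
  return hit ? { ok: false, reason: `blocked pattern: ${hit}` } : { ok: true };
}

// Snapshot and restore copy the whole map, which is fine for a sketch.
const takeSnapshot = (store) => new Map(store);
function restore(store, snapshot) {
  store.clear();
  for (const [path, content] of snapshot) store.set(path, content);
}

// Transactional write: snapshot, mutate, verify, roll back on failure.
function writeWithRollback(store, path, code) {
  const before = takeSnapshot(store);
  store.set(path, code);
  const verdict = verify(code);
  if (!verdict.ok) restore(store, before);
  return { committed: verdict.ok, reason: verdict.reason ?? null };
}

const sandboxFs = new Map([["/tools/ReadFile.js", "export default () => {};"]]);
const genesis = takeSnapshot(sandboxFs); // captured at boot, before any mutation

const good = writeWithRollback(sandboxFs, "/tools/AddNumbers.js", "export default (a, b) => a + b;");
const bad = writeWithRollback(sandboxFs, "/tools/Evil.js", "eval('1 + 1')");
console.log(good.committed, bad.committed, sandboxFs.has("/tools/Evil.js"));

// Returning to the pristine state needs nothing but the genesis snapshot.
restore(sandboxFs, genesis);
console.log(sandboxFs.size);
```

Pattern matching on source text is only a stand-in for real verification; it shows where the check sits in the write path, not how thorough it needs to be.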
|**analyze**| Read + JSON tools | Data processing |
111
95
|**execute**| Full tool access | Task execution |
112
96
113
-
**Model Roles:** Each worker can use a different model role (orchestrator, fast, code, local) for cost optimization.
114
-
115
-
**Worker Tools:**
116
-
- `SpawnWorker`: Create a new worker with type, task, and optional model role
117
-
- `ListWorkers`: View active and completed workers
118
-
- `AwaitWorkers`: Wait for specific workers or all to complete
119
-
120
-
Workers run in a flat hierarchy (no worker can spawn workers) and all actions flow through the same audit pipeline.
121
-
122
-
### Available Tools
97
+
Each worker can use a different model role (orchestrator, fast, code, local) for cost optimization. Workers run in a flat hierarchy (no nested spawning) and all actions flow through the audit pipeline.
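
A flat worker hierarchy with permission tiers might look roughly like the following. Everything here is a hypothetical sketch: `FlatWorkerManager`, `TIER_TOOLS`, and the specific tools granted to the `analyze` tier are assumptions. Only the tier names `analyze` and `execute`, the model roles, and the shared audit pipeline come from the README text.

```javascript
// Hypothetical sketch: tier contents are assumed; only the tier names
// "analyze" and "execute" come from the README.
const TIER_TOOLS = new Map([
  ["analyze", new Set(["ReadFile", "ListFiles"])], // assumed read-only subset
  ["execute", null],                               // null means every tool is allowed
]);

class FlatWorkerManager {
  constructor() {
    this.workers = [];
    this.auditLog = [];
  }

  // Only the top-level agent may spawn; a worker caller is rejected.
  spawn({ type, task, modelRole = "fast", caller = "agent" }) {
    if (caller !== "agent") throw new Error("Workers cannot spawn workers");
    if (!TIER_TOOLS.has(type)) throw new Error(`Unknown worker type: ${type}`);
    const worker = { id: this.workers.length + 1, type, task, modelRole };
    this.workers.push(worker);
    this.auditLog.push({ event: "spawn", workerId: worker.id, type });
    return worker;
  }

  // Every tool request is logged, whether or not the tier permits it.
  requestTool(worker, toolName) {
    const allowed = TIER_TOOLS.get(worker.type);
    const permitted = allowed === null || allowed.has(toolName);
    this.auditLog.push({ event: "tool", workerId: worker.id, toolName, permitted });
    return permitted;
  }
}

const manager = new FlatWorkerManager();
const analyst = manager.spawn({ type: "analyze", task: "summarize logs", modelRole: "fast" });
console.log(manager.requestTool(analyst, "ReadFile"));
console.log(manager.requestTool(analyst, "WriteFile"));
console.log(manager.auditLog.length);
```

Logging denied requests alongside permitted ones keeps the audit trail complete, so a worker probing beyond its tier is visible on replay.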
123
98
124
-
**All tools are dynamic** — loaded from `/tools/` at boot. No hardcoded tools means full RSI capability: the agent can modify any tool, including core file operations. All tool names use CamelCase (e.g., ReadFile, Grep, CreateTool) to keep the interface consistent.
99
+
### Tool System
125
100
126
-
**Core VFS Operations:**
127
-
- `ReadFile`, `WriteFile`, `ListFiles`, `DeleteFile`: VFS operations with audit logging
101
+
All tools are **dynamically loaded** at boot. No hardcoded tools means full RSI capability: the agent can modify any tool, including core file operations.
128
102
129
-
**Meta-Tools (RSI):**
130
-
- `CreateTool`: Dynamic tool creation at runtime (L1 RSI)
131
-
- `LoadModule`: Hot-reload modules from VFS
132
-
- `ListTools`: Discover available tools
133
-
- `Edit`: Apply literal match/replacement edits to files
- `Git`: Version control operations (VFS-scoped shim)
145
-
- `Mkdir`, `Rm`, `Mv`, `Cp`: File management
146
-
147
-
All tools operate within the VFS sandbox with no access to host filesystem. Tools receive a `deps` object with VFS, EventBus, ToolWriter, WorkerManager, and other modules for full capability.
148
-
149
-
---
150
-
151
-
## Why JavaScript, Not TypeScript?
152
-
153
-
REPLOID is pure JavaScript because the agent generates, modifies, and executes code at runtime—entirely in the browser. TypeScript requires compilation, but there's no Node.js or build toolchain in-browser.
154
-
155
-
When the agent writes a new tool to the VFS, the Service Worker immediately serves it as an ES module. No compilation step, no latency. TypeScript would require bundling a 10MB+ compiler or maintaining separate source/artifact trees—defeating the self-modification model.
156
-
157
-
Runtime safety comes from verification (syntax checks, sandboxed execution, arena testing), not static types. The `[SW]` logs show this: modules loading from VFS, no build step.
109
+
All tools operate within the VFS sandbox and have no access to the host filesystem.
158
110
159
111
---
160
112
@@ -170,34 +122,11 @@ REPLOID is designed to study [recursive self-improvement](https://en.wikipedia.o
**L1 Example:** Agent creates an "AddNumbers" tool, writes it to VFS, tests it, confirms it works.
191
126
192
-
### Example: Meta-Tool Creation (L2)
193
-
**Goal:** "Build a system that creates tools from descriptions"
127
+
**L2 Example:** Agent creates a "CreateToolFromDescription" tool that uses the LLM to generate code, then persists it through the tool-creation mechanism. A tool that makes tools.
194
128
195
-
Agent creates `CreateToolFromDescription` which calls the LLM to generate code, then calls `CreateTool` to persist it. A tool that makes tools.
196
-
197
-
### Example: Substrate Modification (L3)
198
-
**Goal:** "Optimize your tool creation process"
199
-
200
-
Agent reads `/core/tool-writer.js`, identifies a bottleneck, writes an improved version with `WriteFile`, and hot-reloads via `LoadModule`. Self-modification of core infrastructure.
129
+
**L3 Example:** Agent reads its own core modules, identifies a bottleneck, writes an improved version, and hot-reloads it. Self-modification of core infrastructure.
201
130
202
131
---
203
132
@@ -213,9 +142,18 @@ Agent reads `/core/tool-writer.js`, identifies a bottleneck, writes an improved
|**Inspectable**| Full source | Full source | Partial | Closed |
217
145
218
-
**REPLOID's niche:** Safe experimentation with self-modifying agents. Not the most powerful agent framework — the most observable and recoverable one. Unique advantages: multi-model orchestration, browser-native local models (WebLLM), and permission-tiered worker subagents.
146
+
**REPLOID's niche:** Safe experimentation with self-modifying agents. Not the most powerful agent framework — the most observable and recoverable one.
147
+
148
+
---
149
+
150
+
## Why JavaScript?
151
+
152
+
REPLOID is pure JavaScript because the agent generates, modifies, and executes code at runtime — entirely in the browser. TypeScript requires compilation, but there's no build toolchain in-browser.
153
+
154
+
When the agent writes a new tool to the VFS, the Service Worker immediately serves it as an ES module. No compilation step, no latency. TypeScript would require bundling a 10MB+ compiler or maintaining separate source/artifact trees — defeating the self-modification model.
155
+
156
+
Runtime safety comes from verification (syntax checks, sandboxed execution, arena testing), not static types.
219
157
220
158
---
221
159
@@ -246,24 +184,14 @@ npm run dev
246
184
247
185
REPLOID offers 3 genesis configurations (selectable at boot):