Announcing Agentnetes: Self-Organizing AI Agent Swarms for Your Codebase

Kubernetes orchestrates containers. Agentnetes orchestrates AI agents.
Today we are open-sourcing Agentnetes under the SuperagenticAI organization. It is an orchestration layer for AI coding agents that treats a repository the way Kubernetes treats a cluster: as a place to schedule work, isolate execution, and recover from failure.
You give Agentnetes a goal in plain English and point it at a git repository. It explores the codebase, plans the work, spins up a temporary team of specialists, runs them in parallel sandboxes, and then synthesizes the results into one outcome.
The important part is not just that multiple agents run. It is that they can inspect the repository directly, verify their own changes, and coordinate without forcing the entire codebase into a single prompt window.
The result is a more operational model for coding agents: less prompt choreography, more observable execution.
In this post, we will do four things:
1. Explain why Agentnetes treats codebases as environments to explore instead of prompts to stuff.
2. Show how the planning, parallel execution, and synthesis cycle works.
3. Walk through the sandbox model, CLI, web UI, and practical setup path.
4. Clarify where Agentnetes is a strong fit and where a simpler single-agent workflow is still better.
Emergent teams
Worker roles are generated from the task and the repo shape, not hardcoded ahead of time.
Isolated execution
Each worker gets its own sandbox, so experiments, tests, and failures stay contained.
Real-time visibility
The UI streams agent events live so you can see planning, execution, and synthesis as they happen.
The whole thing runs with a single command:
❯ cd your-project
❯ GOOGLE_API_KEY=your_key npx agentnetes run "add comprehensive test coverage"
No install needed. If the repository already lives on git, Agentnetes can work against it.
The Origin Story
Agentnetes was built at Zero to Agent London 2026, a hackathon hosted by Google DeepMind and Vercel. The idea came from a simple observation: Kubernetes solved the problem of orchestrating containers at scale by introducing declarative goals, parallel execution, lifecycle management, and isolation. The same principles apply to orchestrating AI agents.
Most coding agents today still behave like a very capable single process. They can be useful, but they often bottleneck on context windows, execute work sequentially, and hide too much of their reasoning behind one opaque transcript. Agentnetes started as an attempt to make agentic software feel more like infrastructure: schedulable, inspectable, and resilient.
Just as you tell Kubernetes “run 3 replicas of this service” and it figures out placement, scheduling, and recovery — you tell Agentnetes “add dark mode to this app” and it figures out which specialists to spawn, how to divide the work, and how to bring everything together.
What Makes It Different
Agentnetes is not just a wrapper that spawns a few agents. It makes three strong bets about how coding agents should operate in real repositories.
Repositories stay in sandboxes, not prompts
Agentnetes treats a repo as an environment to explore with commands. That keeps context lean and lets agents inspect the latest state instead of relying on preloaded file dumps.
Specialists are assembled per goal
The planner decides which roles are useful for this repo and this task. You do not hand-author a fixed team up front.
Parallel execution is a first-class primitive
Workers run independently in separate sandboxes, so research, implementation, testing, and packaging can happen at the same time.
How It Works
Agentnetes follows a three-phase cycle: Plan, Execute, Synthesize.
Phase 1: Plan
A root agent, called the Tech Lead, receives your goal and explores the repository before deciding how to decompose the work. It then asks the planner to generate a task-specific team. A feature build might need a Scout, an Engineer, a Tester, and a Packager. A security audit might need a very different mix. The system does not assume one fixed workflow for every job.
Phase 2: Execute in Parallel
Each specialist agent gets its own isolated environment: a Docker container, a Vercel Firecracker microVM, an E2B sandbox, a Daytona workspace, or a local temp workspace. The repository is already inside that environment, so the agent can inspect files, run commands, and validate its own work. Workers run concurrently and only coordinate when the task requires it.
Every agent has exactly two tools:
- search(pattern) to grep the codebase for patterns
- execute(command) to run any shell command in the sandbox

That is the core interface. Two tools, minimal tool overhead, and a strong bias toward command-line exploration instead of prompt stuffing. This approach is inspired by the Repository-Level Machine (RLM) pattern from MIT CSAIL, which argues that agents perform better when they are given an environment to explore rather than a giant blob of pasted code.
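As a sketch, that entire tool surface fits in a few lines of TypeScript. The shapes below are illustrative assumptions, not the actual Agentnetes types; in a local sandbox both tools reduce to shell execution, which is why the interface can stay this small.

```typescript
import { execSync } from "node:child_process";

// Hypothetical shape of the two-tool interface (the real Agentnetes
// types may differ).
interface AgentTools {
  search(pattern: string): string;  // grep the codebase for a pattern
  execute(command: string): string; // run any shell command in the sandbox
}

// A local-sandbox sketch: both tools are thin wrappers over the shell.
function makeLocalTools(cwd: string): AgentTools {
  const run = (cmd: string) => execSync(cmd, { cwd, encoding: "utf8" });
  return {
    search: (pattern) => run(`grep -rn ${JSON.stringify(pattern)} . || true`),
    execute: (command) => run(command),
  };
}
```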
Agents also follow the AutoResearch loop pattern popularized by Karpathy: write code, run tests, inspect failures, patch, repeat. They are expected to verify output, not just generate it.
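The loop itself is simple enough to sketch. Everything below is illustrative (the function names are assumptions, not the real Agentnetes code), but it captures the contract: a worker is not done until its checks pass or it runs out of attempts.

```typescript
// Illustrative verify-before-done loop: run tests, and on failure hand the
// log to a patch step, up to maxIters attempts.
async function autoLoop(
  runTests: () => Promise<{ passed: boolean; log: string }>,
  patch: (failureLog: string) => Promise<void>,
  maxIters = 5,
): Promise<boolean> {
  for (let i = 0; i < maxIters; i++) {
    const result = await runTests();  // run tests
    if (result.passed) return true;   // verified, declare the task done
    await patch(result.log);          // inspect failures, patch, repeat
  }
  return false;                       // give up after maxIters attempts
}
```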
Phase 3: Synthesize
When the specialists finish, the root agent reads their outputs and produces a structured synthesis: what changed, what passed, what failed, and what still needs attention.
Every phase emits typed events over Server-Sent Events. The web UI subscribes to those events and renders agent activity in real time, so the system is observable while it is working rather than only after it finishes.
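A consumer of that stream only needs to parse each SSE data: payload into a typed event. The event names below are assumptions for illustration; the actual Agentnetes event schema may differ.

```typescript
// Hypothetical typed events for the three phases (illustrative only).
type SwarmEvent =
  | { type: "plan"; agent: string; detail: string }
  | { type: "execute"; agent: string; detail: string }
  | { type: "synthesize"; detail: string };

// Parse one SSE "data:" payload into a typed event, or null if malformed.
function parseSwarmEvent(data: string): SwarmEvent | null {
  try {
    const e = JSON.parse(data);
    return ["plan", "execute", "synthesize"].includes(e.type) ? e : null;
  } catch {
    return null;
  }
}
```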
The Architecture
Each role in the swarm owns a distinct slice of the cycle:
- Tech Lead: explore repo · plan work · synthesize output
- Scout: search codebase · map surfaces
- Engineer: implement changes in sandbox
- Tester: run checks · patch failures
- Packager: collect artifacts · summarize output

Agentnetes plans centrally, executes in parallel, and synthesizes results after each worker finishes inside its own sandbox.
The root agent uses Gemini 2.5 Pro by default for planning, while worker agents default to Gemini 2.5 Flash for speed. Both are configurable through the UI or environment variables, and the broader Gemini lineup is supported.
Fault tolerance works more like workload scheduling than a single pipeline. Agents run via Promise.allSettled, so one failed worker does not collapse the rest of the run. maxWorkers caps concurrency the same way resource limits cap work on a node.
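A minimal sketch of that dispatch pattern, assuming nothing about the real implementation beyond the two behaviors named above: allSettled semantics per worker, and a maxWorkers concurrency cap.

```typescript
// Fault-tolerant, bounded parallel dispatch (illustrative only; the
// function and option names are assumptions, not the Agentnetes API).
async function dispatchWorkers<T>(
  tasks: Array<() => Promise<T>>,
  maxWorkers: number,
): Promise<PromiseSettledResult<T>[]> {
  const results: PromiseSettledResult<T>[] = new Array(tasks.length);
  let next = 0;
  // Each "lane" pulls the next task when it finishes, capping concurrency
  // the way resource limits cap work on a node.
  const lane = async () => {
    while (next < tasks.length) {
      const i = next++;
      // allSettled semantics per task: a rejection is recorded, not thrown,
      // so one failed worker does not collapse the rest of the run.
      results[i] = (await Promise.allSettled([tasks[i]()]))[0];
    }
  };
  const lanes = Math.min(maxWorkers, tasks.length);
  await Promise.all(Array.from({ length: lanes }, lane));
  return results;
}
```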
Five Foundations
Agentnetes is built on five key ideas:
The RLM Pattern
Context lives in sandboxes, not prompts. Agents write shell commands to explore the codebase, keeping token usage tiny regardless of repo size.
The AutoResearch Loop
Agents write code, run tests, measure results, and loop. They verify their own output before declaring a task complete.
Two-Tool MCP Strategy
Each agent has exactly two tools: search and execute. Simple, composable, and powerful.
A2A Protocol
Every spawned agent generates a standard A2A Agent Card, making the system interoperable with other agent frameworks.
K8s-Inspired Load Balancing
Agents run concurrently with fault-tolerant parallel dispatch. One agent failing never blocks the others.
Why this matters: keeping the agent interface small makes orchestration simpler, more inspectable, and less sensitive to context-window bloat. The repo carries the complexity; the prompt does not have to.
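For context on the A2A foundation above: an Agent Card is a small JSON document describing an agent's identity, capabilities, and skills. The card below is a hypothetical example for a spawned Tester worker; every value is invented for illustration, and the cards Agentnetes actually emits may differ.

```typescript
// Hypothetical A2A Agent Card for a spawned Tester worker.
// All field values here are invented for illustration.
const testerCard = {
  name: "agentnetes-tester",
  description: "Runs checks in an isolated sandbox and patches failures",
  url: "http://localhost:3000/agents/tester", // illustrative endpoint
  version: "0.1.0",
  capabilities: { streaming: true },
  skills: [
    {
      id: "run-tests",
      name: "Run tests",
      description: "Execute the repository test suite and report failures",
    },
  ],
};
```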
Sandbox Providers
Agentnetes supports five sandbox providers so the same orchestration model can run locally, in cloud sandboxes, or in hosted microVM environments:
Docker
Requires: Docker running locally
One node:20-alpine container per agent. Isolated. Recommended for local development.
Vercel
Requires: VERCEL_TOKEN
Firecracker microVMs with snapshot support. Fastest option. Auto-detected when running on Vercel.
E2B
Requires: E2B_API_KEY
E2B cloud sandboxes.
Daytona
Requires: DAYTONA_API_KEY
Daytona workspaces.
Local
Requires: Nothing
Runs directly on your machine in a temp directory. No isolation. Good for quick experiments.
The system auto-detects providers in this order: Vercel, E2B, Daytona, Docker, Local. If you want a specific target, set SANDBOX_PROVIDER explicitly.
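The detection order can be sketched as a simple cascade. The environment variable names below come from this post; the selection function itself is an assumption about how detection might work, not the real code.

```typescript
type Provider = "vercel" | "e2b" | "daytona" | "docker" | "local";

// Illustrative auto-detection cascade: explicit override first, then the
// documented order Vercel → E2B → Daytona → Docker → Local.
function detectProvider(
  env: Record<string, string | undefined>,
  dockerAvailable: boolean,
): Provider {
  if (env.SANDBOX_PROVIDER) return env.SANDBOX_PROVIDER as Provider;
  if (env.VERCEL_TOKEN) return "vercel";
  if (env.E2B_API_KEY) return "e2b";
  if (env.DAYTONA_API_KEY) return "daytona";
  if (dockerAvailable) return "docker";
  return "local";
}
```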
The CLI
Agentnetes ships as an npm package. There are really three common paths: run it once with npx, install it globally if you use it often, or start the local UI.
1. Run once with npx
❯ GOOGLE_API_KEY=your_key npx agentnetes run "add comprehensive test coverage"
Best when you want to try Agentnetes against an existing repository without installing anything first.
2. Install globally for repeated use
❯ npm install -g agentnetes
❯ GOOGLE_API_KEY=your_key agentnetes run "add dark mode"
Better if Agentnetes is becoming part of your normal local workflow.
3. Start the web UI
❯ npx agentnetes serve
❯ # or choose a specific port
❯ npx agentnetes serve --port 8080
Use the UI when you want to submit goals interactively and watch planning and execution in real time.
Optional: pre-warm snapshots for faster sandbox startup
❯ npx agentnetes snapshot create
This is mainly useful when running with the Vercel sandbox path and you want shorter cold starts.
In the simplest case, the CLI only needs a git repository and a Google API key. If you want stronger isolation or hosted execution, you can add a sandbox provider later.
The Web UI
Running npx agentnetes serve starts a local web interface where you can submit goals, watch the swarm work in real time, configure models and sandbox providers, and inspect generated artifacts.
The web UI has two modes:
Real mode
Agents run in actual sandboxes against a real codebase. Requires a Google API key and Docker (or another sandbox provider).
Simulation mode
Pre-scripted agent scenarios that demonstrate the system without any API keys or Docker. Great for trying it out.
If you want to see the interaction model before wiring up credentials, the live demo runs in simulation mode directly in the browser.
The Tech Stack
Stack overview
Getting Started
The fastest path (30 seconds)
If you just want to validate the core workflow, start with one concrete task in a repo you already understand.
❯ cd your-project
❯ GOOGLE_API_KEY=your_key npx agentnetes run "add vitest coverage for src/utils and summarize gaps"
Running the web UI locally
❯ git clone https://github.com/SuperagenticAI/agentnetes.git
❯ cd agentnetes
❯ npm install
❯ npm run dev
Set SIMULATION_MODE=true in .env.local to try it without an API key. Visit http://localhost:3000.
For real execution, set your environment variables explicitly:
❯ SANDBOX_PROVIDER=docker
❯ SIMULATION_MODE=false
❯ GOOGLE_API_KEY=your_key_here
What Can You Use It For?
The more specific and bounded your goal, the better the results. "Add vitest tests for all functions in src/utils/" works better than "add tests."
Where It Fits Best
Agentnetes is strongest when the work benefits from decomposition and verification. It is not meant to replace every simpler agent workflow.
Good fit
Broad engineering tasks that naturally split into research, coding, testing, and synthesis
Large repositories where prompt-stuffing becomes slow, brittle, or expensive
Jobs where you want a trace of what each agent did, not just a final answer
Not the first tool I would reach for
Tiny one-file edits where a single coding agent is faster
Tasks that require privileged production access or irreversible actions
Workloads where the acceptance criteria are still vague or constantly changing
A New Home at SuperagenticAI
Agentnetes has moved from a personal repository to the SuperagenticAI organization on GitHub. The move makes the project easier to discover, maintain, and grow with community contributions.
The codebase, issues, pull requests, stars, and full git history have been preserved. Existing links to the old repository continue to redirect.
The new home for everything: github.com/SuperagenticAI/agentnetes.
Contributing
Agentnetes is MIT licensed and open to contributions. Good contributions include new sandbox providers, planning improvements, stronger evaluation loops, better observability, and bug fixes from real-world runs.
❯ git clone https://github.com/SuperagenticAI/agentnetes.git
❯ cd agentnetes
❯ npm install
❯ npm run dev
Check the contributing guide for details. Report issues at github.com/SuperagenticAI/agentnetes/issues.
Watch Demo
See the full Agentnetes loop in action: planning, worker fan-out, sandbox execution, and synthesized results streamed back to the UI.
Try It Now
Start with a bounded task that has an obvious success condition. The first run should teach you how the swarm behaves, not force it to solve your hardest problem immediately.
❯GOOGLE_API_KEY=your_key npx agentnetes run "add missing tests for src/lib/date.ts and summarize any uncovered edge cases"
Or try the live demo in your browser, no setup required.
Related Posts
- CodexOpt: Optimize AGENTS.md and SKILL.md for Codex with GEPA-Inspired Feedback. A CLI for benchmarking and optimizing AGENTS.md and SKILL.md instruction assets.
- A2A v1 in SuperOptiX: Expose, Connect, and Orchestrate AI Agents. SuperOptiX now includes first-class A2A v1 support as a native protocol capability.