
Announcing Agentnetes: Self-Organizing AI Agent Swarms for Your Codebase

March 24, 2026
15 min read
By Shashi Jagtap

Kubernetes orchestrates containers. Agentnetes orchestrates AI agents.

Today we are open-sourcing Agentnetes under the SuperagenticAI organization. It is an orchestration layer for AI coding agents that treats a repository the way Kubernetes treats a cluster: as a place to schedule work, isolate execution, and recover from failure.

You give Agentnetes a goal in plain English and point it at a git repository. It explores the codebase, plans the work, spins up a temporary team of specialists, runs them in parallel sandboxes, and then synthesizes the results into one outcome.

The important part is not just that multiple agents run. It is that they can inspect the repository directly, verify their own changes, and coordinate without forcing the entire codebase into a single prompt window.

The result is a more operational model for coding agents: less prompt choreography, more observable execution.

In this post, we will do four things:

  1. Explain why Agentnetes treats codebases as environments to explore instead of prompts to stuff.
  2. Show how the planning, parallel execution, and synthesis cycle works.
  3. Walk through the sandbox model, CLI, web UI, and practical setup path.
  4. Clarify where Agentnetes is a strong fit and where a simpler single-agent workflow is still better.

Quick links: docs, GitHub, live demo, npm

Emergent teams

Worker roles are generated from the task and the repo shape, not hardcoded ahead of time.

Isolated execution

Each worker gets its own sandbox, so experiments, tests, and failures stay contained.

Real-time visibility

The UI streams agent events live so you can see planning, execution, and synthesis as they happen.

The whole thing runs with a single command:

bash
cd your-project
GOOGLE_API_KEY=your_key npx agentnetes run "add comprehensive test coverage"

No install needed. If your project is already a git repository, Agentnetes can work against it.

The Origin Story

Agentnetes was built at Zero to Agent London 2026, a hackathon hosted by Google DeepMind and Vercel. The idea came from a simple observation: Kubernetes solved the problem of orchestrating containers at scale by introducing declarative goals, parallel execution, lifecycle management, and isolation. The same principles apply to orchestrating AI agents.

Most coding agents today still behave like a very capable single process. They can be useful, but they often bottleneck on context windows, execute work sequentially, and hide too much of their reasoning behind one opaque transcript. Agentnetes started as an attempt to make agentic software feel more like infrastructure: schedulable, inspectable, and resilient.

Just as you tell Kubernetes “run 3 replicas of this service” and it figures out placement, scheduling, and recovery, you tell Agentnetes “add dark mode to this app” and it figures out which specialists to spawn, how to divide the work, and how to bring everything together.

What Makes It Different

Agentnetes is not just a wrapper that spawns a few agents. It makes three strong bets about how coding agents should operate in real repositories.

Repositories stay in sandboxes, not prompts

Agentnetes treats a repo as an environment to explore with commands. That keeps context lean and lets agents inspect the latest state instead of relying on preloaded file dumps.

Specialists are assembled per goal

The planner decides which roles are useful for this repo and this task. You do not hand-author a fixed team up front.

Parallel execution is a first-class primitive

Workers run independently in separate sandboxes, so research, implementation, testing, and packaging can happen at the same time.

How It Works

Agentnetes follows a three-phase cycle: Plan, Execute, Synthesize.

Phase 1: Plan

A root agent, called the Tech Lead, receives your goal and explores the repository before deciding how to decompose the work. It then asks the planner to generate a task-specific team. A feature build might need a Scout, an Engineer, a Tester, and a Packager. A security audit might need a very different mix. The system does not assume one fixed workflow for every job.
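To make the planning output concrete, here is a minimal sketch of what a generated team plan might look like. The type names and fields (`PlannedTask`, `dependsOn`) are illustrative assumptions, not the actual Agentnetes schema:

```typescript
// Hypothetical plan shapes -- illustrative only, not the real planner output.
type WorkerRole = "scout" | "engineer" | "tester" | "packager";

interface PlannedTask {
  role: WorkerRole;
  goal: string;            // what this specialist should accomplish
  dependsOn: WorkerRole[]; // roles whose output this task needs first
}

// A plausible plan the Tech Lead might produce for "add dark mode":
const plan: PlannedTask[] = [
  { role: "scout",    goal: "map theming and CSS entry points",     dependsOn: [] },
  { role: "engineer", goal: "implement a dark theme toggle",        dependsOn: ["scout"] },
  { role: "tester",   goal: "verify theme switching across pages",  dependsOn: ["engineer"] },
  { role: "packager", goal: "summarize changes and remaining gaps", dependsOn: ["tester"] },
];

// Tasks with no unmet dependencies can be dispatched immediately, in parallel.
const runnableNow = plan
  .filter((t) => t.dependsOn.length === 0)
  .map((t) => t.role);
```

The dependency list is what lets the runtime decide which specialists can start right away and which must wait for a sibling's output.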

Phase 2: Execute in Parallel

Each specialist agent gets its own isolated environment: a Docker container, a Vercel Firecracker microVM, an E2B sandbox, a Daytona workspace, or a local temp workspace. The repository is already inside that environment, so the agent can inspect files, run commands, and validate its own work. Workers run concurrently and only coordinate when the task requires it.

Every agent has exactly two tools:

search(pattern) to grep the codebase for patterns
execute(command) to run any shell command in the sandbox

That is the core interface. Two tools, minimal tool overhead, and a strong bias toward command-line exploration instead of prompt stuffing. This approach is inspired by the Repository-Level Machine (RLM) pattern from MIT CSAIL, which argues that agents perform better when they are given an environment to explore rather than a giant blob of pasted code.

Agents also follow the AutoResearch loop pattern popularized by Karpathy: write code, run tests, inspect failures, patch, repeat. They are expected to verify output, not just generate it.
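The two-tool surface is small enough to sketch in full. The interface below is an illustrative assumption (the names `SandboxTools` and `fakeSandbox` are invented for this post, and a real provider would shell out inside Docker or a microVM rather than use an in-memory map):

```typescript
// Hypothetical sketch of the two-tool interface; not the Agentnetes source.
type ToolResult = { ok: boolean; output: string };

interface SandboxTools {
  search(pattern: string): ToolResult;  // grep the repo for a pattern
  execute(command: string): ToolResult; // run a shell command in the sandbox
}

// An in-memory fake sandbox, useful for reasoning about the interface.
function fakeSandbox(files: Record<string, string>): SandboxTools {
  return {
    search(pattern) {
      const hits = Object.entries(files)
        .filter(([, body]) => body.includes(pattern))
        .map(([path]) => path);
      return { ok: hits.length > 0, output: hits.join("\n") };
    },
    execute(command) {
      // A real provider runs this inside Docker, Firecracker, E2B, etc.
      return { ok: true, output: `ran: ${command}` };
    },
  };
}

const tools = fakeSandbox({ "src/utils/date.ts": "export function parseDate() {}" });
const hit = tools.search("parseDate");
```

Everything else an agent does is composition of these two calls, which is what keeps the prompt small regardless of repo size.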

Phase 3: Synthesize

When the specialists finish, the root agent reads their outputs and produces a structured synthesis: what changed, what passed, what failed, and what still needs attention.

Every phase emits typed events over Server-Sent Events. The web UI subscribes to those events and renders agent activity in real time, so the system is observable while it is working rather than only after it finishes.
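As a rough sketch of what "typed events over SSE" can look like, here is a minimal event union with one encoder and one decoder. The event names and fields are assumptions for illustration; the real Agentnetes event schema may differ:

```typescript
// Illustrative event shapes -- not the actual Agentnetes schema.
type AgentEvent =
  | { type: "plan"; roles: string[] }
  | { type: "worker"; role: string; message: string }
  | { type: "synthesis"; changed: string[]; passed: boolean };

// Server side: each event becomes one SSE frame ("data: <json>\n\n").
function toSseFrame(event: AgentEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Client side: the UI parses frames back into typed events as they stream in.
function fromSseFrame(frame: string): AgentEvent {
  return JSON.parse(frame.replace(/^data: /, "").trimEnd()) as AgentEvent;
}

const frame = toSseFrame({ type: "plan", roles: ["scout", "engineer"] });
const parsed = fromSseFrame(frame);
const roleCount = parsed.type === "plan" ? parsed.roles.length : 0;
```

Because the union is discriminated on `type`, the UI can switch on each incoming frame and render planning, worker, and synthesis activity differently without guessing at payload shapes.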

The Architecture

Your Goal
Root Agent / Tech Lead

explore repo · plan work · synthesize output

Planner and Runtime
Scout

search codebase · map surfaces

Engineer

implement changes in sandbox

Tester

run checks · patch failures

Packager

collect artifacts · summarize output

Isolated Sandboxes

Agentnetes plans centrally, executes in parallel, and synthesizes results after each worker finishes inside its own sandbox.

The root agent uses Gemini 2.5 Pro by default for planning, while worker agents default to Gemini 2.5 Flash for speed. Both are configurable through the UI or environment variables, and the broader Gemini lineup is supported.

Fault tolerance works more like workload scheduling than a single pipeline. Agents run via Promise.allSettled, so one failed worker does not collapse the rest of the run. maxWorkers caps concurrency the same way resource limits cap work on a node.
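The combination of Promise.allSettled and a worker cap can be sketched in a few lines. This is an illustrative re-implementation of the pattern described above, not the Agentnetes source; the `dispatch` helper and its signature are invented for this post:

```typescript
// Fault-tolerant parallel dispatch with a concurrency cap (illustrative).
async function dispatch<T>(
  jobs: (() => Promise<T>)[],
  maxWorkers: number,
): Promise<PromiseSettledResult<T>[]> {
  const results: Promise<T>[] = new Array(jobs.length);
  let next = 0;

  // Each "lane" pulls the next job when it finishes, like a pod slot on a node.
  async function lane(): Promise<void> {
    while (next < jobs.length) {
      const i = next++;
      const p = jobs[i]();
      results[i] = p;
      await p.catch(() => {}); // a failed job never kills the lane
    }
  }

  const lanes = Array.from({ length: Math.min(maxWorkers, jobs.length) }, lane);
  await Promise.all(lanes);
  return Promise.allSettled(results); // rejections become data, not crashes
}

// One rejected worker does not collapse the run: the others still resolve.
const settled = await dispatch(
  [
    async () => "scout done",
    async () => { throw new Error("tester failed"); },
    async () => "packager done",
  ],
  2,
);
```

The key property is that a rejection surfaces as a `"rejected"` entry in the settled results, where the synthesis phase can report it, instead of aborting the whole run the way Promise.all would.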

Five Foundations

Agentnetes is built on five key ideas:

The RLM Pattern

Context lives in sandboxes, not prompts. Agents write shell commands to explore the codebase, keeping token usage tiny regardless of repo size.

The AutoResearch Loop

Agents write code, run tests, measure results, and loop. They verify their own output before declaring a task complete.

Two-Tool MCP Strategy

Each agent has exactly two tools: search and execute. Simple, composable, and powerful.

A2A Protocol

Every spawned agent generates a standard A2A Agent Card, making the system interoperable with other agent frameworks.

K8s-Inspired Load Balancing

Agents run concurrently with fault-tolerant parallel dispatch. One agent failing never blocks the others.

Why this matters: keeping the agent interface small makes orchestration simpler, more inspectable, and less sensitive to context-window bloat. The repo carries the complexity; the prompt does not have to.

Sandbox Providers

Agentnetes supports five sandbox providers so the same orchestration model can run locally, in cloud sandboxes, or in hosted microVM environments:

Docker

default

Requires: Docker running locally

One node:20-alpine container per agent. Isolated. Recommended for local development.

Vercel

Requires: VERCEL_TOKEN

Firecracker microVMs with snapshot support. Fastest option. Auto-detected when running on Vercel.

E2B

Requires: E2B_API_KEY

E2B cloud sandboxes.

Daytona

Requires: DAYTONA_API_KEY

Daytona workspaces.

Local

Requires: Nothing

Runs directly on your machine in a temp directory. No isolation. Good for quick experiments.

The system auto-detects providers in this order: Vercel, E2B, Daytona, Docker, Local. If you want a specific target, set SANDBOX_PROVIDER explicitly.
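The detection order above can be expressed as a short cascade. The environment variable names come from the docs; the function itself is a simplified illustration (real detection of a running Docker daemon is more involved than a boolean flag):

```typescript
// Simplified sketch of the provider auto-detection order (illustrative).
type Provider = "vercel" | "e2b" | "daytona" | "docker" | "local";

function detectProvider(
  env: Record<string, string | undefined>,
  dockerAvailable: boolean,
): Provider {
  if (env.SANDBOX_PROVIDER) return env.SANDBOX_PROVIDER as Provider; // explicit wins
  if (env.VERCEL_TOKEN) return "vercel";     // Firecracker microVMs
  if (env.E2B_API_KEY) return "e2b";         // E2B cloud sandboxes
  if (env.DAYTONA_API_KEY) return "daytona"; // Daytona workspaces
  if (dockerAvailable) return "docker";      // local containers
  return "local";                            // temp dir, no isolation
}

const chosen = detectProvider({ E2B_API_KEY: "key" }, false);
```

Setting SANDBOX_PROVIDER short-circuits the cascade, which is why it is the right knob when you want deterministic behavior in CI.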

The CLI

Agentnetes ships as an npm package. There are really three common paths: run it once with npx, install it globally if you use it often, or start the local UI.

1. Run once with npx

bash
GOOGLE_API_KEY=your_key npx agentnetes run "add comprehensive test coverage"

Best when you want to try Agentnetes against an existing repository without installing anything first.

2. Install globally for repeated use

bash
npm install -g agentnetes
GOOGLE_API_KEY=your_key agentnetes run "add dark mode"

Better if Agentnetes is becoming part of your normal local workflow.

3. Start the web UI

bash
npx agentnetes serve
# or choose a specific port
npx agentnetes serve --port 8080

Use the UI when you want to submit goals interactively and watch planning and execution in real time.

Optional: pre-warm snapshots for faster sandbox startup

bash
npx agentnetes snapshot create

This is mainly useful when running with the Vercel sandbox path and you want shorter cold starts.

In the simplest case, the CLI only needs a git repository and a Google API key. If you want stronger isolation or hosted execution, you can add a sandbox provider later.

The Web UI

Running npx agentnetes serve starts a local web interface where you can submit goals, watch the swarm work in real time, configure models and sandbox providers, and inspect generated artifacts.

The web UI has two modes:

Real mode

Agents run in actual sandboxes against a real codebase. Requires a Google API key and Docker (or another sandbox provider).

Simulation mode

Pre-scripted agent scenarios that demonstrate the system without any API keys or Docker. Great for trying it out.

If you want to see the interaction model before wiring up credentials, the live demo runs in simulation mode directly in the browser.

The Tech Stack

Stack overview

AI Runtime: Vercel AI SDK v7 beta
Agent Primitive: ToolLoopAgent from AI SDK
LLM Provider: Google Gemini (2.0, 2.5, 3.x)
Sandbox (local): Docker node:20-alpine
Sandbox (cloud): Vercel Firecracker microVMs
Web Framework: Next.js App Router
Streaming: Server-Sent Events via native ReadableStream
License: MIT

Getting Started

The fastest path (30 seconds)

If you just want to validate the core workflow, start with one concrete task in a repo you already understand.

1. Get a free API key at aistudio.google.com
2. Pull the Docker base image: docker pull node:20-alpine
3. Run on any git repo:

bash
cd your-project
GOOGLE_API_KEY=your_key npx agentnetes run "add vitest coverage for src/utils and summarize gaps"

Running the web UI locally

bash
git clone https://github.com/SuperagenticAI/agentnetes.git
cd agentnetes
npm install
npm run dev

Set SIMULATION_MODE=true in .env.local to try it without an API key. Visit http://localhost:3000.

For real execution, set your environment variables explicitly:

env
SANDBOX_PROVIDER=docker
SIMULATION_MODE=false
GOOGLE_API_KEY=your_key_here

What Can You Use It For?

Here are some goals that work well:

Add comprehensive test coverage for all utility functions
Implement dark mode across the entire application
Run a security audit and report all vulnerabilities
Refactor the authentication module to use JWT tokens
Add TypeScript types to all untyped JavaScript files
Set up CI/CD with GitHub Actions

The more specific your goal, the better the results. “Add vitest tests for all functions in src/utils/” works better than “add tests.”

Where It Fits Best

Agentnetes is strongest when the work benefits from decomposition and verification. It is not meant to replace every simpler agent workflow.

Good fit

Broad engineering tasks that naturally split into research, coding, testing, and synthesis

Large repositories where prompt-stuffing becomes slow, brittle, or expensive

Jobs where you want a trace of what each agent did, not just a final answer

Not the first tool we would reach for

Tiny one-file edits where a single coding agent is faster

Tasks that require privileged production access or irreversible actions

Workloads where the acceptance criteria are still vague or constantly changing

A New Home at SuperagenticAI

Agentnetes has moved from a personal repository to the SuperagenticAI organization on GitHub. The move makes the project easier to discover, maintain, and grow with community contributions.

The codebase, issues, pull requests, stars, and full git history have been preserved. Existing links to the old repository continue to redirect.

The new home for everything:

Contributing

Agentnetes is MIT licensed and open to contributions. Good contributions include new sandbox providers, planning improvements, stronger evaluation loops, better observability, and bug fixes from real-world runs.

bash
git clone https://github.com/SuperagenticAI/agentnetes.git
cd agentnetes
npm install
npm run dev

Check the contributing guide for details. Report issues at github.com/SuperagenticAI/agentnetes/issues.

Watch Demo

See the full Agentnetes loop in action: planning, worker fan-out, sandbox execution, and synthesized results streamed back to the UI.

Try It Now

Start with a bounded task that has an obvious success condition. The first run should teach you how the swarm behaves, not force it to solve your hardest problem immediately.

bash
GOOGLE_API_KEY=your_key npx agentnetes run "add missing tests for src/lib/date.ts and summarize any uncovered edge cases"

Or try the live demo in your browser, no setup required.