GitHub - polos-dev/polos: AI agents write code, run commands, and delete files - autonomously. Polos gives them isolated sandboxes with built-in tools (shell, file system, web search), approval flows that reach you on various channels, and durable execution with automatic retries, prompt caching, and concurrency control. Agents get full power. Your systems stay safe.

The runtime for agents that do real work

Sandboxed execution. Agents that reach you. Durable workflows.

⭐ Star us to support the project!

AI agents break the rules of traditional software. They're async by nature - running while you sleep, but still needing you to approve, confirm, or provide a credential. They're autonomous by design - you say "fix the bug" and they write code, run commands, delete files. That power is the point. And the risk.

Most frameworks ignore this. Polos is built for it.

100% open source. Write it all in plain Python or TypeScript. No DAGs to define, no graph syntax to learn.

import { defineAgent, sandboxTools } from "@polos/sdk";
import { anthropic } from "@ai-sdk/anthropic";

// Create a sandboxed environment — agents get exec, read, write,
// edit, glob, and grep tools automatically.
const sandbox = sandboxTools({
  env: 'docker',
  docker: {
    image: 'node:20-slim',
    workspaceDir: './workspace',
    memory: '2g',
  },
});

// Give the agent sandbox tools — it can now run commands,
// read/write files, and explore the codebase autonomously.
const codingAgent = defineAgent({
  id: 'coding_agent',
  model: anthropic('claude-opus-4-5'),
  systemPrompt: 'You are a coding assistant. The repo is at /workspace.',
  tools: [...sandbox], // exec, read, write, edit, glob, grep
});

What You Get With Polos

Secure Sandbox

Agents run in isolated environments - Docker, E2B, or cloud VMs. Built-in tools for shell, file system, and web search. Full power. Zero risk to your systems.

Agents That Reach You

Agents reach you - not the other way around. Stripe-like approval pages that collect input, not just yes/no. Slack, SMS, email - you're at dinner, phone buzzes, one tap, done.

Durable Execution

State persists - agents resume exactly where they left off. Automatic retries on failure. 60-80% cost savings via prompt caching. Built-in observability for every step, every approval, every tool call. Concurrency control across multiple agents - no API rate limit chaos.

See It In Action

Watch a coding agent built with Polos - sandboxed execution, tool calls, and real-time observability.

Watch the demo video

Quick Start

1. Install Polos Server

curl -fsSL https://install.polos.dev/install.sh | bash
polos-server start

Copy the project ID displayed when you start the server. You'll need it in the next steps.

2. Install the SDK

Python

pip install polos-sdk

TypeScript

npm install @polos/sdk

3. Create a coding agent

Python

# agents.py
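from polos import Agent, SandboxToolsConfig, LocalEnvironmentConfig
from polos import sandbox_tools as create_sandbox_tools

# Alias the imported factory so the tool list it returns does not shadow it.
sandbox_tools = create_sandbox_tools(SandboxToolsConfig(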
    env="local",
    local=LocalEnvironmentConfig(cwd="./workspace", path_restriction="./workspace"),
))

coding_agent = Agent(
    id="coding_agent",
    provider="anthropic",
    model="claude-sonnet-4-5",
    system_prompt="You are a coding agent. Your workspace is at ./workspace.",
    tools=sandbox_tools,
)

# worker.py
from polos import PolosClient, Worker
from agents import coding_agent, sandbox_tools

client = PolosClient(project_id="your-project-id")
worker = Worker(client=client, agents=[coding_agent], tools=list(sandbox_tools))

if __name__ == "__main__":
    import asyncio
    asyncio.run(worker.run())

TypeScript

// agents.ts
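import { defineAgent, sandboxTools as createSandboxTools } from "@polos/sdk";
import { anthropic } from "@ai-sdk/anthropic";

// Alias the imported factory: declaring a const with the same name as the
// import is a compile error in TypeScript.
export const sandboxTools = createSandboxTools({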
  env: "local",
  local: { cwd: "./workspace", pathRestriction: "./workspace" },
});

export const codingAgent = defineAgent({
  id: "coding_agent",
  model: anthropic("claude-sonnet-4-5"),
  systemPrompt: "You are a coding agent. Your workspace is at ./workspace.",
  tools: [...sandboxTools],
});

// worker.ts
import { PolosClient, Worker } from "@polos/sdk";
import { codingAgent, sandboxTools } from "./agents.js";

const client = new PolosClient({ projectId: "your-project-id" });
const worker = new Worker({ client, agents: [codingAgent], tools: [...sandboxTools] });

await worker.run();
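
The `path_restriction` / `pathRestriction` option above implies that file tools are confined to one directory. As a rough illustration of what such a check can look like (this is not the Polos implementation, and `is_within` is a hypothetical helper), the idea is to resolve the requested path and confirm it still sits under the allowed root:

```python
from pathlib import Path


def is_within(root: str, candidate: str) -> bool:
    """Return True if candidate resolves to a location inside root.

    Hypothetical helper for illustration only; not part of the Polos SDK.
    """
    root_path = Path(root).resolve()
    # Relative paths are joined onto the root; absolute paths replace it.
    # resolve() then collapses ".." segments and follows symlinks.
    target = (root_path / candidate).resolve()
    return target == root_path or root_path in target.parents
```

Resolving before comparing is what catches `../` traversal and absolute paths: a plain string prefix check on the unresolved path would accept `./workspace/../secrets.txt`.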

4. Invoke the agent

Local sandbox tools suspend for approval before running commands or writing files. The client streams workflow events, prompts you in the terminal, and resumes the agent.

Python

# main.py
import asyncio
from polos import PolosClient
from polos.features import events
from agents import coding_agent

async def main():
    client = PolosClient(project_id="your-project-id")
    handle = await client.invoke(coding_agent.id, {
        "input": "Create hello.js that prints 'Hello, world!' and run it.",
        "streaming": True,
    })

    # Stream events — approve each exec/write/edit when the agent suspends
    async for event in events.stream_workflow(client, handle.root_workflow_id, handle.id):
        if event.event_type and event.event_type.startswith("suspend_"):
            step_key = event.event_type[len("suspend_"):]
            form = event.data.get("_form", {})
            context = form.get("context", {})
            print(f"\n  Agent wants to: {context.get('command') or context.get('tool', step_key)}")
            approved = input("  Approve? (y/n): ").strip().lower() == "y"
            await client.resume(handle.root_workflow_id, handle.id, step_key, {"approved": approved})

    execution = await client.get_execution(handle.id)
    print(f"\nResult: {execution.get('result')}")

asyncio.run(main())

TypeScript

// main.ts
import { PolosClient } from "@polos/sdk";
import { codingAgent } from "./agents.js";
import * as readline from "node:readline/promises";

const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
const client = new PolosClient({ projectId: "your-project-id" });
const handle = await client.invoke(codingAgent.id, {
  input: "Create hello.js that prints 'Hello, world!' and run it.",
  streaming: true,
});

// Stream events — approve each exec/write/edit when the agent suspends
for await (const event of client.events.streamWorkflow(handle.rootWorkflowId, handle.id)) {
  if (event.eventType?.startsWith("suspend_")) {
    const stepKey = event.eventType.slice("suspend_".length);
    const context = (event.data as any)?._form?.context ?? {};
    console.log(`\n  Agent wants to: ${context.command ?? context.tool ?? stepKey}`);
    const answer = await rl.question("  Approve? (y/n): ");
    const approved = answer.trim().toLowerCase() === "y";
    await client.resume(handle.rootWorkflowId, handle.id, stepKey, { approved });
  }
}

const execution = await client.getExecution(handle.id);
console.log(`\nResult: ${typeof execution.result === "string" ? execution.result : JSON.stringify(execution.result)}`);
rl.close();

5. Run it

# Terminal 1: Start the worker
python worker.py    # or: npx tsx worker.ts

# Terminal 2: Invoke the agent
python main.py      # or: npx tsx main.ts

See the full example for Python or TypeScript with richer approval UIs and more.

6. See it in action

Open the Polos UI to see your agent's execution trace, tool calls, and reasoning.

📖 Full Quick Start Guide →

Architecture

Polos consists of three components:

Orchestrator: Written in Rust. Manages execution state, handles retries, and coordinates workers
Worker: Runs your agents and workflows, connects to the orchestrator
SDK: Python and TypeScript libraries for defining agents, workflows, and tools

Why Polos?

Feature	Description
🔒 Sandboxed Execution	Agents run in isolated Docker containers, E2B, or cloud VMs. Built-in tools for shell, files, and web search - full autonomy with zero risk.
📲 Agents That Reach You	Approval pages, Slack, SMS, email. Agents notify you when they need input. One tap from your phone. Done.
🧠 Durable State	Your agent survives crashes with call stack and local variables intact. Step 18 of 20 fails? Resume from step 18. No wasted LLM calls.
🚦 Global Concurrency	System-wide rate limiting with queues and concurrency keys. Prevent one rogue agent from exhausting your entire OpenAI quota.
🤝 Human-in-the-Loop	Native support for pausing execution. Wait hours or days for user approval and resume with full context. Paused agents consume zero compute.
📡 Agent Handoffs	Transactional memory for multi-agent systems. Pass reasoning history between specialized agents without context drift.
🔍 Decision-Level Observability	Trace the reasoning behind every tool call, not just raw logs. See why your agent chose Tool B over Tool A.
⚡ Production Ready	Automatic retries, exactly-once execution guarantees, OpenTelemetry tracing built-in.
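
To make the "Global Concurrency" row concrete, here is a minimal in-process sketch of a concurrency key: every task submitted under the same key shares one limit. Polos enforces this in the orchestrator across workers; the `KeyedLimiter` class and `fake_llm_call` below are hypothetical stand-ins built on `asyncio.Semaphore`, not SDK APIs.

```python
import asyncio
from collections import defaultdict


class KeyedLimiter:
    """Caps the number of in-flight tasks per concurrency key (illustrative)."""

    def __init__(self, limit):
        self._limit = limit
        self._semaphores = {}
        self._active = defaultdict(int)
        self.peak = defaultdict(int)  # highest concurrency observed per key

    async def run(self, key, coro_fn):
        # One semaphore per key, created lazily inside the running event loop.
        sem = self._semaphores.setdefault(key, asyncio.Semaphore(self._limit))
        async with sem:
            self._active[key] += 1
            self.peak[key] = max(self.peak[key], self._active[key])
            try:
                return await coro_fn()
            finally:
                self._active[key] -= 1


async def fake_llm_call(i):
    # Stand-in for a rate-limited provider call.
    await asyncio.sleep(0.01)
    return i * 2


async def demo():
    limiter = KeyedLimiter(limit=2)
    tasks = (limiter.run("openai", lambda i=i: fake_llm_call(i)) for i in range(6))
    results = await asyncio.gather(*tasks)
    return results, limiter.peak["openai"]
```

Running `asyncio.run(demo())` submits six calls at once, but no more than two are ever in flight under the `"openai"` key. Calls under a different key would get their own limit.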

Logic Belongs in Code, Not Configs

With Polos:

Python

@workflow
async def process_order(ctx: WorkflowContext, order: ProcessOrderInput):
    if order.amount > 1000:
        approved = await ctx.step.suspend("approval", data=order.model_dump())
        if not approved.data["ok"]:
            return {"status": "rejected"}

    await ctx.step.run("charge", charge_stripe, order)
    await ctx.step.run("notify", send_email, order)

TypeScript

const processOrder = defineWorkflow({ id: "process-order" }, async (ctx, order) => {
  if (order.amount > 1000) {
    const approved = await ctx.step.suspend("approval", { data: order });
    if (!approved.data.ok) {
      return { status: "rejected" };
    }
  }

  await ctx.step.run("charge", () => chargeStripe(order));
  await ctx.step.run("notify", () => sendEmail(order));
});

Other platforms:

dag = DAG(
    nodes=[
        Node("check_amount", CheckAmount),
        Node("approval", HumanApproval),
        Node("charge", ChargeStripe),
        Node("notify", SendEmail),
    ],
    edges=[
        ("check_amount", "approval", condition="amount > 1000"),
        ("check_amount", "charge", condition="amount <= 1000"),
        ("approval", "charge", condition="approved"),
        ("charge", "notify"),
    ]
)

No DAGs. No graph syntax. Just Python or TypeScript.

Examples

Agents

Example	Python	TypeScript	Description
Agent with tools	Python	TypeScript	Simple agent with tool calling
Structured Output	Python	TypeScript	Agent with structured model responses
Streaming	Python	TypeScript	Real-time streaming responses
Conversational Chat	Python	TypeScript	Multi-turn conversations with memory
Thinking Agent	Python	TypeScript	Chain-of-thought reasoning
Guardrails	Python	TypeScript	Input/output validation
Multi-Agent Coordination	Python	TypeScript	Workflow orchestrating multiple agents
Order Processing	Python	TypeScript	Human-in-the-loop fraud review
Sandbox Tools	Python	TypeScript	Code execution in an isolated Docker container
Exec Security	Python	TypeScript	Allowlist-based command approval
Web Search Agent	Python	TypeScript	Research agent with Tavily web search
Local Sandbox	Python	TypeScript	Sandbox tools running on the host machine
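
For readers wondering what "allowlist-based command approval" in the Exec Security row means in practice, the following sketch shows one plausible shape: commands matching a known-safe pattern run without a prompt, and everything else suspends for approval. The patterns, the `needs_approval` helper, and the conservative handling of shell operators are all assumptions for illustration and are not taken from the Polos examples.

```python
import fnmatch
import shlex

# Hypothetical allowlist; glob-style patterns matched against the whole command.
ALLOWLIST = ["ls", "cat *", "node *.js", "npm test"]

# Anything that chains, pipes, redirects, or substitutes is never auto-approved.
SHELL_OPERATORS = (";", "&", "|", ">", "<", "`", "$(")


def needs_approval(command, allowlist=ALLOWLIST):
    """Return True unless the entire command matches an allowlisted pattern."""
    if any(op in command for op in SHELL_OPERATORS):
        return True
    try:
        # Normalise whitespace and quoting before matching.
        normalized = " ".join(shlex.split(command))
    except ValueError:
        # Unbalanced quotes: treat as suspicious and ask a human.
        return True
    return not any(fnmatch.fnmatchcase(normalized, pat) for pat in allowlist)
```

Matching the whole command rather than only its first word, and rejecting shell operators up front, is what keeps `cat notes.txt; rm -rf /` from slipping through on the strength of its `cat` prefix.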

Workflows

Example	Python	TypeScript	Description
Workflow Basics	Python	TypeScript	Core workflow patterns
Suspend/Resume	Python	TypeScript	Human-in-the-loop approvals
State Persistence	Python	TypeScript	Durable state across executions
Error Handling	Python	TypeScript	Retry, fallback, compensation patterns
Queues & Concurrency	Python	TypeScript	Rate limiting and concurrency control
Parallel Execution	Python	TypeScript	Fan-out/fan-in patterns

Events & Scheduling

Example	Python	TypeScript	Description
Event-Triggered	Python	TypeScript	Pub/sub event-driven workflows
Scheduled Workflows	Python	TypeScript	Cron-based scheduling

Human-in-the-Loop

Example	Python	TypeScript	Description
Approval Page	Python	TypeScript	Web UI for workflow approval with suspend/resume

Under the Hood

Polos captures the result of every side effect - tool calls, API responses, time delays - as a durable log. If your process dies, Polos replays the workflow from the log, returning previously recorded results instead of re-executing them. Your agent's exact local variables and call stack are restored in milliseconds.

Completed steps are never re-executed - so you never pay for an LLM call twice.
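
The replay idea can be sketched in a few lines. This is a deliberately simplified model, assuming a JSON file as the log and synchronous steps; `DurableRun` and `demo` are hypothetical names, and the real orchestrator is written in Rust and does considerably more.

```python
import json
from pathlib import Path


class DurableRun:
    """Records each step's result to disk and reuses it on replay (illustrative)."""

    def __init__(self, log_path):
        self.log_path = Path(log_path)
        # Load results recorded by a previous, possibly crashed, run.
        self.log = json.loads(self.log_path.read_text()) if self.log_path.exists() else {}

    def step(self, key, fn):
        if key in self.log:
            return self.log[key]  # replay: return the recorded result
        result = fn()  # first execution: perform the side effect
        self.log[key] = result
        self.log_path.write_text(json.dumps(self.log))  # persist before moving on
        return result


def demo(log_path):
    calls = []

    def plan():
        calls.append("plan")
        return "draft"

    def flaky_publish():
        calls.append("publish")
        if calls.count("publish") == 1:
            raise RuntimeError("simulated crash")
        return "published"

    def workflow(run):
        return [run.step("plan", plan), run.step("publish", flaky_publish)]

    try:
        workflow(DurableRun(log_path))
    except RuntimeError:
        pass  # the "process" dies after "plan" was recorded
    # A fresh DurableRun stands in for a restarted process.
    return workflow(DurableRun(log_path)), calls
```

On the second pass the workflow function runs again from the top, but `plan` is served from the log and only `publish` executes. That is why the step key matters: it is how a replayed call finds its earlier result.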

Documentation

For detailed documentation, visit polos.dev/docs

Community

Join our community to get help, share ideas, and stay updated:

Contributing

We welcome contributions of every kind: bug reports, feature requests, documentation improvements, and code.

License

Polos is Apache 2.0 licensed.
