The stateful control plane for AI agents. Track, secure, and optimize every agent run with absolute visibility and control — not just individual LLM calls.

Works with multiple LLM SDKs (Anthropic, Gemini, OpenAI, Cohere, Mistral, LiteLLM, etc.) and agent CLIs like Claude Code, Gemini CLI, and OpenClaw.

Why StateLoom?

Standard AI gateways operate at the LLM call level: they see each API request in isolation. But modern AI agents run in loops, keep a dedicated session per task, make 50+ background calls, and frequently crash.

StateLoom is session-aware and built explicitly for agents. We group fragmented workflows into meaningful, stateful sessions. This allows you to resume crashed scripts without paying to repeat previously successful steps, enforce strict budgets across entire workflows, contain the blast radius of rogue agents, and gain absolute transparency into what your models are doing behind the scenes.

Most importantly, StateLoom is designed for absolute data sovereignty. Because it runs locally on your laptop or directly inside your enterprise network (VPC), it functions as an isolated, zero-trust control plane ensuring your data never passes through a third-party SaaS proxy.

Install

pip install stateloom

Requirements: Python 3.10+ (tested on 3.10, 3.11, 3.12, 3.13). See extras for optional dependencies.

Quick Start

pip install stateloom
stateloom start
# Dashboard is live at http://localhost:4782

Or use it in your code: wrap your regular calls in a stateloom.session() block, or call stateloom.chat(model="gemini-2.5-flash", messages=[...]) directly.

import anthropic
import stateloom

stateloom.init()  # detect and patch installed LLM clients
claude = anthropic.Anthropic()

with stateloom.session("customer-report", budget=2.0, durable=True) as s:
    research = claude.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Key trends in AI governance 2025"}],
    )

Providers

Auto-detects and patches installed LLM clients:

| Provider | Package | Auto-patched | Streaming |
| --- | --- | --- | --- |
| OpenAI | `openai` | Yes | `stream=True` |
| Anthropic | `anthropic` | Yes | `stream=True` |
| Google Gemini | `google-generativeai` or `google-genai` | Yes | `generate_content_stream()` |
| Cohere | `cohere` | Yes | `chat_stream()` |
| Mistral | `mistralai` | Yes | `chat.stream()` |
| LiteLLM | `litellm` | Yes | `stream=True` |
| Ollama (local) | — | Via `local_model=` | — |

For Individual Developers (Free & Open Source)

StateLoom gives solo developers and startups everything needed to build resilient agents locally, for free. Here is what you get with the StateLoom SDK.

Agent Cost Control

Session-scoped cost tracking — cost per agent run, not just per API call
Per-model cost breakdown — track cost and tokens per model within multi-model sessions
Budget enforcement — hard stop or warn when an agent run exceeds its spend limit

Agent Safety

PII detection — detect emails, credit cards, SSNs, API keys (audit/redact/block modes)
Guardrails — prompt injection detection (32 heuristic patterns + NLI classifier + Llama-Guard), jailbreak prevention, system prompt leak protection
Zero-trust security engine — CPython audit hooks (PEP 578) to intercept dangerous operations + in-memory secret vault

Performance & Caching

Exact-match & semantic caching — deduplication plus embedding-based similarity matching
Loop detection — catch agents spinning on the same query
Session timeouts & cancellation — max duration, idle timeout, and programmatic cancellation
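
One way to picture loop detection is as a counter over identical requests within a session. The sketch below is illustrative only and is not StateLoom's implementation: the `LoopDetector` class, the SHA-256 fingerprint, and the threshold of 3 repeats are all assumptions made for the example.

```python
import hashlib
import json
from collections import Counter


class LoopDetector:
    """Flags a session once the same request has been seen too many times."""

    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats  # hypothetical threshold
        self._seen = Counter()

    @staticmethod
    def _fingerprint(model: str, messages: list) -> str:
        # Canonical JSON so that key order does not change the hash.
        payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def record(self, model: str, messages: list) -> bool:
        """Record one call; return True when the call looks like a loop."""
        key = self._fingerprint(model, messages)
        self._seen[key] += 1
        return self._seen[key] >= self.max_repeats


detector = LoopDetector(max_repeats=3)
query = [{"role": "user", "content": "What is the status?"}]
flags = [detector.record("gpt-4o", query) for _ in range(3)]
print(flags)  # the third identical call is flagged
```

Exact-match fingerprints only catch verbatim repeats; catching near-duplicate queries would need a similarity measure instead.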

Local Models & Smart Routing

Local model support — run Ollama models locally with hardware-aware recommendations
Intelligent auto-routing — route simple requests to local models based on complexity scoring
Model testing — test candidates against your production model with automated quality scoring and migration readiness reports

Agent Reliability

Durable resumption — Temporal-like checkpointing: agent crashes mid-run, restart, resume from cache
Semantic retries — automatic retries for LLM output failures (bad JSON, hallucinated tool calls)
Time-travel debugging — replay a failed agent run from any step with network safety
VCR-cassette mock — record LLM calls once, replay forever for zero-cost testing
Named checkpoints — mark milestones within a session for the dashboard waterfall timeline
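
The idea behind semantic retries can be sketched as a loop that retries when the output is unusable rather than when the HTTP request fails. This is a minimal sketch under stated assumptions: `call_with_semantic_retry` and the `generate` callback are hypothetical names, and the only check made here is "does the reply parse as a JSON object".

```python
import json
from typing import Callable


def call_with_semantic_retry(generate: Callable[[int], str], max_attempts: int = 3) -> dict:
    """Call `generate` until its output parses as a JSON object."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        raw = generate(attempt)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc          # malformed output: try again
            continue
        if isinstance(parsed, dict):
            return parsed
        last_error = ValueError("expected a JSON object")
    raise RuntimeError(f"no valid JSON after {max_attempts} attempts") from last_error


# Stand-in for an LLM call: the first reply is truncated, the second is valid.
replies = ['{"answer": 4', '{"answer": 4}']
result = call_with_semantic_retry(lambda attempt: replies[attempt - 1])
print(result)
```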

Multi-Model & Experimentation

Multi-agent consensus — vote, debate, and self-consistency strategies with up to 3 models
A/B experiments — test model variants with built-in assignment, metrics, and backtesting
File-based prompt versioning — drop .md/.yaml/.txt files in a folder; auto-detect changes as new versions

Developer Experience

Unified chat API — provider-agnostic stateloom.chat() without importing any SDK
Session export/import — portable JSON bundles for sharing, archiving, and migration
LangChain / LangGraph integration — callback handlers for popular agent frameworks
Local dashboard & REST API — live session viewer, security controls, observability charts at localhost:4782

Agent CLI Integration

You already pay for Claude Pro or Gemini Ultra. Use your existing subscription through StateLoom — get cost tracking, PII scanning, budget enforcement, guardrails, and a session timeline for every agent run. No API key needed, no code changes.

stateloom start

export ANTHROPIC_BASE_URL=http://localhost:4782
claude "explain this codebase"
# All calls appear as one session in the dashboard

export CODE_ASSIST_ENDPOINT=http://localhost:4782/code-assist
gemini "refactor the auth module"

export OPENAI_BASE_URL=http://localhost:4782/v1
codex "add unit tests for the auth module"

All CLIs connect to the same StateLoom instance. Subscription users (Claude Max, Gemini Ultra, ChatGPT Pro) work transparently — OAuth tokens pass through to the upstream provider.

For Teams & Enterprise

StateLoom Enterprise Edition (EE) provides a single, centralized control plane to govern, secure, and optimize your entire AI workforce. Engineering and finance teams get visibility into hierarchical org-level and team-level token billing, paired with centralized Virtual Key management and strict budget enforcement to automate chargebacks and prevent runaway costs.

Teams can spin up agents in secure, isolated sandboxes within your own private ecosystem, safely run live A/B experiments, validate cheaper models via dark launching, orchestrate high-assurance multi-agent consensus, and auto-generate proprietary training datasets through our distillation flywheel.

Your CISO gets a unified zero-trust security perimeter: an in-memory Secret Vault, real-time prompt guardrails, declarative compliance profiles, dynamic force re-routing, and a global kill switch to instantly contain the blast radius of rogue agents. The goal throughout is to let you keep full control of your agents as you scale.

👉 Book an Enterprise Demo Today

Key Examples

Multi-Provider Session

import stateloom
import anthropic
import google.genai as genai

stateloom.init()
claude = anthropic.Anthropic()
gemini = genai.Client()

with stateloom.session("customer-report", budget=2.0, durable=True) as s:
    # Step 1: Research (Claude)
    research = claude.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Key trends in AI governance 2025"}],
    )

    # Step 2: Analyze (Gemini)
    analysis = gemini.models.generate_content(
        model="gemini-2.5-flash",
        contents=f"Analyze: {research.content[0].text}",
    )

    # Step 3: Synthesize (Claude)
    report = claude.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2048,
        messages=[{"role": "user", "content": f"Write report: {analysis.text}"}],
    )

    print(f"Total: ${s.total_cost:.2f} | {s.total_tokens} tokens | {s.call_count} calls")

# Mix providers freely — StateLoom tracks cost across all of them in one session.
# If this script crashes on Step 3, restarting it skips Steps 1 & 2 for free.
# Budget enforcement stops the whole run if it exceeds $2.
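
To make the session-level budget concrete, here is a minimal sketch of the accounting idea: cost accumulates across every call in a run, and the run stops once the total passes the limit. `BudgetedSession` and `BudgetExceeded` are hypothetical names for illustration, and the per-call costs are made up; StateLoom's own behaviour may differ in detail.

```python
class BudgetExceeded(Exception):
    """Raised when a session's accumulated cost passes its limit."""


class BudgetedSession:
    def __init__(self, name: str, budget: float) -> None:
        self.name = name
        self.budget = budget
        self.total_cost = 0.0
        self.call_count = 0

    def record_call(self, cost: float) -> None:
        # Cost accumulates across every call in the run, whatever the provider.
        self.total_cost += cost
        self.call_count += 1
        if self.total_cost > self.budget:
            raise BudgetExceeded(
                f"{self.name}: spent ${self.total_cost:.2f} of ${self.budget:.2f}"
            )


session = BudgetedSession("customer-report", budget=2.0)
session.record_call(0.75)       # research
session.record_call(0.75)       # analysis
try:
    session.record_call(0.75)   # synthesis pushes the run past $2
    stopped = False
except BudgetExceeded:
    stopped = True
```

A per-call limit would have allowed all three calls here; only the session-level total catches the overrun.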

PII Detection

import stateloom
from stateloom import PIIRule

stateloom.init(
    pii=True,
    pii_rules=[
        PIIRule(pattern="credit_card", mode="block"),
        PIIRule(pattern="email", mode="redact"),
    ],
)
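
The difference between the "block" and "redact" modes can be sketched with two regular expressions. This is a toy illustration, not how StateLoom detects PII: the patterns, the `apply_rules` helper, and the `PIIBlocked` exception are assumptions, and real detectors are considerably stricter.

```python
import re

# Illustrative patterns only.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


class PIIBlocked(Exception):
    """Raised when a rule in "block" mode matches."""


def apply_rules(text: str, rules: dict) -> str:
    """Apply {pattern_name: mode} rules, where mode is "block" or "redact"."""
    # Check blocking rules first so a blocked prompt is never partially processed.
    for name, mode in rules.items():
        if mode == "block" and PATTERNS[name].search(text):
            raise PIIBlocked(name)
    for name, mode in rules.items():
        if mode == "redact":
            text = PATTERNS[name].sub(f"[{name.upper()}]", text)
    return text


rules = {"credit_card": "block", "email": "redact"}
clean = apply_rules("Contact alice@example.com for the invoice.", rules)
print(clean)
```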

Guardrails

stateloom.init(
    guardrails_enabled=True,
    guardrails_mode="enforce",             # "audit" or "enforce"
    guardrails_nli_enabled=True,           # NLI classifier (~5-20ms, optional)
    guardrails_local_model_enabled=True,   # Llama-Guard via Ollama
)

# Runtime toggle (no restart needed)
stateloom.configure_guardrails(nli_enabled=True, nli_threshold=0.8)

Kill Switch

stateloom.kill_switch(active=True, message="Maintenance in progress")
stateloom.add_kill_switch_rule(model="gpt-4*", reason="Cost overrun")
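
The `"gpt-4*"` pattern above suggests glob-style matching on model names. Assuming that is the case, rule evaluation could look roughly like the sketch below, which uses the standard library's `fnmatch`. `KillRule` and `blocking_rule` are hypothetical names, not part of the StateLoom API.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class KillRule:
    model: str    # glob pattern, e.g. "gpt-4*"
    reason: str


def blocking_rule(model: str, rules: list) -> Optional[KillRule]:
    """Return the first rule whose pattern matches `model`, else None."""
    for rule in rules:
        if fnmatchcase(model, rule.model):
            return rule
    return None


rules = [KillRule(model="gpt-4*", reason="Cost overrun")]
hit = blocking_rule("gpt-4o", rules)
miss = blocking_rule("claude-sonnet-4-20250514", rules)
```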

Zero-Trust Security

stateloom.init(
    security_audit_hooks_enabled=True,   # CPython audit hooks (PEP 578)
    security_audit_hooks_mode="enforce", # "audit" (log) or "enforce" (block)
    security_audit_hooks_deny_events=["subprocess.Popen", "os.system"],
    security_secret_vault_enabled=True,  # Move API keys to protected vault
    security_secret_vault_scrub_environ=True,  # Scrub from os.environ
)

# Check status
status = stateloom.security_status()

# Store/retrieve secrets programmatically
stateloom.vault_store("CUSTOM_SECRET", "value")
secret = stateloom.vault_retrieve("CUSTOM_SECRET")
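
For readers unfamiliar with PEP 578, the sketch below shows the underlying CPython mechanism on its own, using `sys.addaudithook` (Python 3.8+). It is not StateLoom's hook: the `DENY_EVENTS` set, the `MODE` switch, and the `audit_log` list are assumptions made for the example.

```python
import os
import sys

DENY_EVENTS = {"os.system", "subprocess.Popen"}
MODE = {"value": "enforce"}   # "audit" only logs, "enforce" blocks
audit_log = []


def audit_hook(event: str, args: tuple) -> None:
    if event not in DENY_EVENTS:
        return
    audit_log.append(event)
    if MODE["value"] == "enforce":
        # Raising from the hook aborts the operation that fired the event.
        raise PermissionError(f"blocked by audit hook: {event}")


# Hooks cannot be removed once installed, hence the mutable MODE switch.
sys.addaudithook(audit_hook)

try:
    os.system("echo should-not-run")
    blocked = False
except PermissionError:
    blocked = True

MODE["value"] = "audit"   # switch back to log-only for the rest of the process
```

Because the event fires before the command runs, enforce mode prevents the shell from being spawned at all.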

Local Models & Auto-Routing

stateloom.init(local_model="llama3.2:3b", auto_route=True, default_model="gemini-2.5-flash")

# No explicit model — auto-router decides local vs cloud based on complexity
simple = stateloom.chat(messages=[{"role": "user", "content": "What is 2 + 2?"}])       # → local
hard = stateloom.chat(messages=[{"role": "user", "content": "Design a consensus algorithm"}])  # → cloud
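
How the auto-router scores complexity is not described here. Purely as an illustration of the concept, a crude scorer could combine prompt length with a few reasoning keywords; the keyword list, weights, and threshold below are invented for this sketch and should not be read as StateLoom's scoring.

```python
import re

# Hypothetical keyword list; a real scorer would use richer signals.
REASONING_HINTS = ("design", "prove", "analyze", "architecture", "algorithm", "compare")


def complexity_score(prompt: str) -> float:
    """Crude 0.0-1.0 score from prompt length and reasoning keywords."""
    words = re.findall(r"\w+", prompt.lower())
    length_part = min(len(words) / 200, 1.0) * 0.5
    hint_part = 0.5 if any(w in REASONING_HINTS for w in words) else 0.0
    return length_part + hint_part


def choose_model(prompt: str, local: str, cloud: str, threshold: float = 0.4) -> str:
    """Send low-scoring prompts to the local model, the rest to the cloud."""
    return cloud if complexity_score(prompt) >= threshold else local


simple = choose_model("What is 2 + 2?", "llama3.2:3b", "gemini-2.5-flash")
hard = choose_model("Design a consensus algorithm", "llama3.2:3b", "gemini-2.5-flash")
```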

Model Testing

stateloom.init(
    shadow=True,
    shadow_model="claude-haiku-4-5-20251001",  # Cloud-to-cloud candidate
    # shadow_model="llama3.2:8b",              # Or local model via Ollama
    shadow_sample_rate=0.5,                     # Test 50% of traffic
)
# Dashboard: localhost:4782 → Model Testing → see similarity scores
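
The two moving parts of shadow testing, sampling a share of traffic and scoring the candidate's answer against production, can be sketched as follows. This is an assumption-laden illustration: `should_shadow` and `similarity` are hypothetical helpers, and `difflib` character similarity is only a stand-in for whatever quality scoring StateLoom applies.

```python
import random
from difflib import SequenceMatcher


def should_shadow(sample_rate: float, rng: random.Random) -> bool:
    """Decide whether this request is also sent to the candidate model."""
    return rng.random() < sample_rate


def similarity(primary: str, candidate: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for semantic scoring."""
    return SequenceMatcher(None, primary, candidate).ratio()


rng = random.Random(0)   # seeded so the run is repeatable
sampled = sum(should_shadow(0.5, rng) for _ in range(1000))
score_same = similarity("Paris is the capital.", "Paris is the capital.")
score_diff = similarity("Paris is the capital.", "I do not know.")
```

With a sample rate of 0.5, roughly half of the 1000 simulated requests are mirrored to the candidate.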

Durable Resumption

# `client` is any auto-patched SDK client, for example openai.OpenAI()
with stateloom.session(session_id="task-123", durable=True) as s:
    res1 = client.chat.completions.create(...)  # Cache hit on restart
    res2 = client.chat.completions.create(...)  # Resumes here
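
The checkpointing idea behind durable sessions can be sketched in a few lines: persist each step's result under the session ID, and on restart serve finished steps from disk. The `DurableSession` class, the JSON file layout, and the step-plus-hash key below are assumptions for illustration, not StateLoom's storage format.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


class DurableSession:
    """Persists each step's result so a restarted run can skip finished steps."""

    def __init__(self, session_id: str, directory: Path) -> None:
        self.path = directory / f"{session_id}.json"
        self.cache = json.loads(self.path.read_text()) if self.path.exists() else {}
        self.step = 0

    def call(self, request: dict, execute: Callable[[dict], str]) -> str:
        # Key on step position plus request content, so a changed prompt re-runs.
        self.step += 1
        digest = hashlib.sha256(json.dumps(request, sort_keys=True).encode()).hexdigest()
        key = f"{self.step}:{digest}"
        if key not in self.cache:
            self.cache[key] = execute(request)
            self.path.write_text(json.dumps(self.cache))  # checkpoint after each step
        return self.cache[key]


executed = []


def fake_llm(request: dict) -> str:
    """Stand-in for a paid LLM call; records that it actually ran."""
    executed.append(request["prompt"])
    return request["prompt"].upper()


workdir = Path(tempfile.mkdtemp())
first = DurableSession("task-123", workdir)
first.call({"prompt": "step one"}, fake_llm)           # executed, then the "crash"

second = DurableSession("task-123", workdir)           # restart with the same ID
res1 = second.call({"prompt": "step one"}, fake_llm)   # served from the checkpoint
res2 = second.call({"prompt": "step two"}, fake_llm)   # first real call after restart
```

After the restart, `fake_llm` runs only for the second step, which is the behaviour the example above describes as a cache hit followed by resumption.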

Multi-Agent Consensus

# consensus() is awaited, so run this inside an async function (for example via asyncio.run)
result = await stateloom.consensus(
    prompt="What are the key risks of deploying LLMs in healthcare?",
    models=["gpt-4o", "claude-sonnet-4-20250514", "gemini-2.5-flash"],
    strategy="debate",   # or "vote", "self_consistency"
    rounds=3,
    budget=1.00,
)
print(result.answer)       # Final synthesized answer
print(result.confidence)   # 0.0-1.0 confidence score
print(result.cost)         # Total cost across all models

# Agent-guided consensus — all debaters use the agent's system prompt
result = await stateloom.consensus(
    agent="medical-advisor",
    prompt="Should we use AI for radiology screening?",
    models=["gpt-4o", "claude-sonnet-4-20250514"],
    strategy="debate",
)
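
Of the three strategies, "vote" is the simplest to picture. A minimal sketch, assuming each model returns a short answer and that confidence is simply the share of models agreeing with the winner; `majority_vote` is a hypothetical helper and the answers are made up.

```python
from collections import Counter


def majority_vote(answers: dict) -> tuple:
    """Return the most common answer and the share of models that gave it."""
    # Normalise lightly so "Yes" and "yes " count as the same vote.
    votes = Counter(a.strip().lower() for a in answers.values())
    winner, count = votes.most_common(1)[0]
    return winner, count / len(answers)


answers = {
    "gpt-4o": "Yes",
    "claude-sonnet-4-20250514": "yes",
    "gemini-2.5-flash": "No",
}
answer, confidence = majority_vote(answers)
```

Free-form answers rarely match string for string, which is one reason debate and synthesis strategies exist alongside plain voting.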

Managed Agents

agent = stateloom.create_agent(
    slug="legal-bot", team_id="team-1",
    model="gpt-4o",
    system_prompt="You are a legal assistant.",
)
# Call via: POST /v1/agents/legal-bot/chat/completions

Unified Chat API

stateloom.init(default_model="gpt-4o")
response = stateloom.chat(messages=[{"role": "user", "content": "What is the capital of France?"}])
print(response.content)  # "The capital of France is Paris."

Dashboard

Starts automatically at localhost:4782. Live session viewer, REST API, and WebSocket event streaming.

Overview — total cost, active sessions, cloud/local calls, PII detections, guardrail detections, cache hits
Sessions — list, detail with waterfall trace timeline, cost/token breakdown, child sessions
Experiments — A/B test management, metrics, leaderboard
Consensus — debate runs, round-by-round breakdown, per-model responses, confidence scores
Model Testing — candidate model evaluation, readiness scores, quality distribution, filtering funnel, migration recommendations
Safety — kill switch rules, blast radius status
Security — audit hooks, secret vault, event log, guardrails (config, detection stats, violation history)
Compliance — GDPR/HIPAA/CCPA profiles, audit trail
Observability — request rate, latency percentiles, cost breakdown charts

See full API endpoint reference for all REST endpoints.

Extras

pip install stateloom[langchain]    # LangChain callback handler
pip install stateloom[ner]          # GLiNER NER-based PII detection
pip install stateloom[semantic]     # Semantic caching (FAISS + sentence-transformers)
pip install stateloom[prompts]      # File-based prompt versioning (watchdog)
pip install stateloom[auth]         # OAuth2/OIDC authentication (pyjwt + argon2-cffi)
pip install stateloom[redis]        # Redis cache/queue backend
pip install stateloom[metrics]      # Prometheus metrics
pip install stateloom[tracing]      # OpenTelemetry distributed tracing
pip install stateloom[all]          # Everything

Configuration

stateloom.init(
    budget=10.0,              # Per-session budget (USD)
    pii=True,                 # Enable PII scanning
    guardrails_enabled=True,  # Prompt injection protection
    local_model="llama3.2",   # Enable local models + auto-routing
    shadow=True,              # Model testing
    circuit_breaker=True,     # Provider failover
    compliance="gdpr",        # GDPR/HIPAA/CCPA profiles
)

YAML config is also supported:

from stateloom.core.config import StateLoomConfig
config = StateLoomConfig.from_yaml("stateloom.yaml")

See full configuration reference for all options.

Error Handling

StateLoom is fail-open by default — observability middleware errors (cost tracking, latency, console output) never break your LLM calls. Security-critical middleware (PII blocking, budget hard-stop, guardrails in enforce mode) always fails closed.
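
The fail-open versus fail-closed split can be sketched as a pipeline that treats exceptions differently depending on the kind of middleware. This is an illustration of the policy only; `run_pipeline`, `cost_tracker`, and `pii_blocker` are hypothetical and do not mirror StateLoom's internal middleware classes.

```python
def run_pipeline(request: dict, middlewares: list) -> list:
    """Run (middleware, fail_closed) pairs; return names of swallowed failures."""
    swallowed = []
    for middleware, fail_closed in middlewares:
        try:
            middleware(request)
        except Exception:
            if fail_closed:
                raise   # security-critical: the request must not proceed
            swallowed.append(middleware.__name__)   # observability: note it, carry on
    return swallowed


def cost_tracker(request: dict) -> None:
    # Simulates a broken observability component.
    raise RuntimeError("pricing table unavailable")


def pii_blocker(request: dict) -> None:
    if "4111 1111 1111 1111" in request["prompt"]:
        raise PermissionError("credit card detected")


pipeline = [(cost_tracker, False), (pii_blocker, True)]
ok = run_pipeline({"prompt": "hello"}, pipeline)
```

The broken cost tracker does not stop the harmless request, while a prompt containing the card number would raise out of the pipeline.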

For production security environments, set security middleware to enforce mode so violations block requests rather than just logging them:

stateloom.init(
    pii=True,
    pii_rules=[PIIRule(pattern="credit_card", mode="block")],  # Fail closed: block on match
    guardrails_enabled=True,
    guardrails_mode="enforce",          # Fail closed: block prompt injection
    security_audit_hooks_enabled=True,
    security_audit_hooks_mode="enforce", # Fail closed: block dangerous operations
)

from stateloom import (
    StateLoomBudgetError,        # Budget exceeded
    StateLoomPIIBlockedError,    # PII block rule triggered
    StateLoomGuardrailError,     # Prompt injection / jailbreak
    StateLoomKillSwitchError,    # Kill switch active
    StateLoomRateLimitError,     # Rate limit exceeded
    StateLoomRetryError,         # All retry attempts exhausted
    StateLoomTimeoutError,       # Session timed out
    StateLoomSecurityError,      # Security policy blocked operation
)

See full error reference for all error types.

Documentation

Full Feature Reference — detailed docs for every feature with examples, config tables, and YAML snippets
API Reference — complete index of public functions, classes, and enums

Contributing

We welcome contributions! Here's how to get started:

# Clone the repo
git clone https://github.com/stateloom/stateloom.git
cd stateloom

# Install in development mode
pip install -e ".[all]"

# Run tests
pytest tests/ -v

# Lint and format
ruff check src/
ruff format src/

Guidelines:

All changes need tests. Run pytest tests/ -v before submitting.
Follow existing code patterns — use ruff for formatting and linting.
Security-critical middleware must fail closed. Observability middleware must fail open.
Events must be typed dataclasses inheriting from Event.
Use contextvars.ContextVar for async/thread-safe state, not globals.
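
As a reminder of why the last guideline matters, the sketch below shows two concurrent agent runs each seeing their own value of a `ContextVar`, where a module-level global would have been overwritten. The `current_session` variable and `agent_run` function are hypothetical names for illustration.

```python
import asyncio
import contextvars

current_session = contextvars.ContextVar("current_session", default="none")


async def agent_run(session_id: str) -> str:
    # Each asyncio task runs in its own copy of the context, so this set()
    # is invisible to the other concurrently running task.
    current_session.set(session_id)
    await asyncio.sleep(0)   # yield so the two runs interleave
    return current_session.get()


async def main() -> list:
    return await asyncio.gather(agent_run("session-a"), agent_run("session-b"))


seen = asyncio.run(main())
print(seen)
```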

Reporting issues: GitHub Issues

License

BSL 1.1
