
Architecture

What you'll learn: The three-package split, why the engine is in-process, and how data flows from your agent to the platform.


Three packages

Kyvvu is split into three packages with different licenses and deployment boundaries:

| Package | License | What it does |
| --- | --- | --- |
| kyvvu (SDK) | Apache 2.0 | Translates agent events into atomic Behaviors via templates. Manages agent registration, task lifecycle, and the @kv.step decorator. |
| kyvvu-engine | BSL 1.1 | Evaluates policies against Behaviors. Stateful: tracks per-task history. Zero I/O, pure CPU. |
| Kyvvu Platform API | Proprietary | Policy storage, incident management, behavioral trace ingestion, audit reporting, dashboard. Runs at platform.kyvvu.com. |

pip install kyvvu installs both the SDK and the engine. The platform API is a hosted service.

Why the engine is in-process

The engine runs inside your agent process, not as a remote service. This is a deliberate architectural choice.

Performance. Agent steps are frequent and small. A task can emit dozens of atomic Behaviors. If each one required a network round-trip to a policy service, you'd add milliseconds per step that compound into seconds per task. The engine evaluates policies in sub-millisecond time:

| Scenario | p99 latency |
| --- | --- |
| 100 policies, 50-step history | 296 µs |
| 10 policies, empty history | 34 µs |
| 0 policies (baseline) | 3 µs |

That is roughly four orders of magnitude faster than a typical LLM call, so the engine adds no perceptible latency.
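To make the compounding concrete, here is a back-of-envelope sketch. Only the 296 µs figure comes from the benchmark above; the step count and the remote round-trip time are illustrative assumptions, so substitute your own numbers.

```python
# Back-of-envelope comparison of per-task governance overhead.
# Only IN_PROCESS_P99_S comes from the benchmark above; the other two
# inputs are assumptions chosen for illustration.
STEPS_PER_TASK = 40            # assumed: "dozens" of Behaviors per task
REMOTE_ROUND_TRIP_S = 0.005    # assumed: 5 ms round-trip to a policy service
IN_PROCESS_P99_S = 296e-6      # benchmark: 100 policies, 50-step history


def per_task_overhead(steps: int, per_step_seconds: float) -> float:
    """Total evaluation overhead for one task, in seconds."""
    return steps * per_step_seconds


remote = per_task_overhead(STEPS_PER_TASK, REMOTE_ROUND_TRIP_S)
local = per_task_overhead(STEPS_PER_TASK, IN_PROCESS_P99_S)

print(f"remote policy service: {remote * 1000:.1f} ms per task")
print(f"in-process engine:     {local * 1000:.1f} ms per task")
```

The in-process figure uses the p99 of the heaviest benchmark scenario for every step, so it is a pessimistic estimate for the engine.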

Visibility. A gateway proxy only sees LLM calls. It misses tool invocations, resource reads, credential fetches, and decision gates between model calls, which is precisely where the governable behavior lives. The in-process engine sees everything the agent does.

Resilience. If the platform API is down, the engine continues evaluating with its cached policies. Agents don't stop working because a governance service is unreachable. For additional hardening:

  • Disk cache (KV_POLICY_CACHE_PATH): policies are written to disk after each successful fetch and loaded on cold start if the API is unreachable — surviving process restarts during outages.

  • Fail-closed mode (KV_POLICY_FAIL_MODE=closed): for high-risk agents, blocks all behaviors when no policies are available rather than failing open.

  • HMAC signing (KV_POLICY_HMAC_SECRET): verifies policy integrity to prevent tampering by compromised proxies or MITM within the internal network.
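The three hardening options can be pictured with a minimal sketch. This is not the SDK's implementation: the function names, the JSON cache format, and the choice of HMAC-SHA256 are all assumptions made for illustration. Only the behaviors described in the list above come from the documentation.

```python
import hashlib
import hmac
import json
from pathlib import Path
from typing import Callable, List, Optional


def verify_signature(payload: bytes, signature_hex: str, secret: str) -> bool:
    """Check an HMAC signature over a policy payload (SHA-256 is an assumption)."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature_hex)


def load_policies(
    fetch: Callable[[], List[dict]], cache_path: Optional[Path]
) -> Optional[List[dict]]:
    """Try the API, then the disk cache. None means no policies are available."""
    try:
        policies = fetch()
    except OSError:
        # API unreachable: fall back to the last successful fetch, if any.
        if cache_path is not None and cache_path.exists():
            return json.loads(cache_path.read_text())
        return None
    if cache_path is not None:
        # Written after each successful fetch so a restart can reuse it.
        cache_path.write_text(json.dumps(policies))
    return policies


def decision_without_policies(fail_mode: str) -> str:
    """Fail-closed blocks everything; any other mode fails open."""
    return "block" if fail_mode == "closed" else "allow"
```

In this sketch the caller would read `KV_POLICY_CACHE_PATH`, `KV_POLICY_FAIL_MODE`, and `KV_POLICY_HMAC_SECRET` from the environment and pass them in; how the real SDK wires them up is not shown here.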

See the Configuration Reference for details on these settings.

For the full argument, see The Hot Path Tax.

Policies on paths

The engine evaluates policies against the full ordered history of the current task, not just the current step. This is the "policies on paths" model, formalised in the paper Runtime Governance for AI Agents: Policies on Paths.

Example: the same step.exec call is allowed or blocked depending on what happened earlier. If a step.gate (human approval) precedes it in the task history, it passes. If not, it's blocked. The decision is path-dependent.
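The example above can be sketched in a few lines. The `Step` type and `evaluate_exec` function are hypothetical and exist only to illustrate path dependence; they are not the engine's API or its policy language.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    kind: str  # e.g. "step.exec" or "step.gate", as named in the text


def evaluate_exec(history: List[Step], current: Step) -> str:
    """Hypothetical rule: step.exec passes only if a step.gate precedes it."""
    if current.kind != "step.exec":
        return "allow"
    approved = any(step.kind == "step.gate" for step in history)
    return "allow" if approved else "block"


exec_step = Step("step.exec")
print(evaluate_exec([], exec_step))                   # block
print(evaluate_exec([Step("step.gate")], exec_step))  # allow
```

The same `exec_step` yields two different decisions; the only thing that changed is the task history passed alongside it.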

This is what makes Kyvvu different from point-in-time content filters or LLM guardrails. A content filter checks one input at a time. Kyvvu checks the step in context — what the agent has already done, what resources it accessed, what approvals it received.

Data flow

What happens synchronously (hot path)

  • Template matching (SDK maps framework event to Behavior)

  • Policy evaluation (engine runs rules against history)

  • Step recording (engine appends to in-memory history)

All sub-millisecond, all in-process, all zero I/O.

What happens asynchronously (background)

  • Policy fetch (TTL-based, default every 300 seconds)

  • Log flush (on end_task(), POST to log endpoint)

  • Incident webhook (on warn/block, POST to incident endpoint)

Network failures in background operations are logged and swallowed. The engine never blocks on I/O during evaluation.
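The split between the two paths can be sketched with a queue and a worker thread. This is an illustration under assumptions, not the SDK's code: `BackgroundSender` and `TaskRecorder` are hypothetical names, and the real engine may use a different concurrency mechanism.

```python
import queue
import threading
from typing import Callable, List


class BackgroundSender:
    """Hypothetical worker that performs network sends off the hot path."""

    def __init__(self, send: Callable[[dict], None]) -> None:
        self._send = send
        self._queue = queue.Queue()
        self.errors: List[str] = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, item: dict) -> None:
        self._queue.put(item)  # returns immediately; no I/O on this thread

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop the worker
                break
            try:
                self._send(item)
            except OSError as exc:
                # Network failure: record it and carry on.
                self.errors.append(str(exc))

    def close(self) -> None:
        self._queue.put(None)
        self._thread.join()


class TaskRecorder:
    """Hypothetical hot path: step recording is an in-memory append."""

    def __init__(self, sender: BackgroundSender) -> None:
        self.history: List[dict] = []
        self._sender = sender

    def record_step(self, behavior: dict) -> None:
        self.history.append(behavior)  # synchronous, in-process, no I/O

    def end_task(self) -> None:
        # Hand the log to the background worker instead of sending it inline.
        self._sender.submit({"steps": list(self.history)})
```

If `send` raises a connection error, `record_step` and `end_task` still return normally; the failure only shows up in `errors`.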

One engine per agent

Each KyvvuRunner (and each Kyvvu SDK instance) owns one PolicyEngine. Engines are per-agent and are not shared across agents. This is by design — policies are scoped to specific agents, and task histories must not leak between agents.

For LangGraph, the KyvvuLangChainHandler treats the entire graph execution as a single task — all nodes (LLM calls, tool calls) are flattened into steps within that task. Each graph invocation is one task, not one task per node.

For truly separate agents (e.g., independent microservices or processes), each agent gets its own Kyvvu instance. Cross-agent coordination happens through pre-fetched aggregate counts in EvalContext, not through shared engine state.
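A minimal sketch of that isolation follows. `EvalContext` is named in the text, but its fields are not documented here, so the `aggregate_counts` dictionary, the `AgentEngine` class, and the `limit` parameter are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EvalContext:
    """Assumed shape: aggregate counts fetched ahead of evaluation."""
    aggregate_counts: Dict[str, int] = field(default_factory=dict)


@dataclass
class AgentEngine:
    """Hypothetical per-agent engine that owns its own task history."""
    agent_id: str
    history: List[str] = field(default_factory=list)

    def evaluate(self, step: str, ctx: EvalContext, limit: int = 100) -> str:
        # The cross-agent signal comes from ctx, never from another
        # engine's history.
        over_limit = ctx.aggregate_counts.get(step, 0) >= limit
        self.history.append(step)  # this sketch records every evaluated step
        return "block" if over_limit else "allow"


agent_a = AgentEngine("agent-a")
agent_b = AgentEngine("agent-b")
ctx = EvalContext({"step.exec": 100})

print(agent_a.evaluate("step.exec", ctx))  # block
print(agent_b.history)                     # []
```

Agent B's history stays empty even though agent A has evaluated a step, and the only shared input is the read-only count carried in the context.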

