# Architecture

**What you'll learn:** The three-package split, why the engine is in-process, and how data flows from your agent to the platform.

***

## Three packages

Kyvvu is split into three packages with different licenses and deployment boundaries:

| Package            | License     | What it does                                                                                                                           |
| ------------------ | ----------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| `kyvvu` (SDK)      | Apache 2.0  | Translates agent events into atomic Behaviors via templates. Manages agent registration, task lifecycle, and the `@kv.step` decorator. |
| `kyvvu-engine`     | BSL 1.1     | Evaluates policies against Behaviors. Stateful: tracks per-task history. Zero I/O — pure CPU.                                          |
| Kyvvu Platform API | Proprietary | Policy storage, incident management, behavioral trace ingestion, audit reporting, dashboard. Runs at `platform.kyvvu.com`.             |

`pip install kyvvu` installs both the SDK and the engine. The platform API is a hosted service.

## Why the engine is in-process

The engine runs inside your agent process, not as a remote service. This is a deliberate architectural choice.

**Performance.** Agent steps are frequent and small. A task can emit dozens of atomic Behaviors. If each one required a network round-trip to a policy service, you'd add milliseconds per step that compound into seconds per task. The engine evaluates policies in sub-millisecond time:

| Scenario                      | p99 latency |
| ----------------------------- | ----------- |
| 100 policies, 50-step history | 296 µs      |
| 10 policies, empty history    | 34 µs       |
| 0 policies (baseline)         | 3 µs        |

Even the slowest scenario above is roughly three to four orders of magnitude faster than a typical LLM call, which takes on the order of a second, so the engine adds no perceptible latency.
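The compounding effect can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: the step count and the 20 ms round-trip time are assumptions chosen for the example, not measurements, and only the 296 microsecond figure comes from the table above.

```python
# Illustrative arithmetic only. STEPS_PER_TASK and NETWORK_RTT_S are assumptions.
STEPS_PER_TASK = 50       # assumed: "dozens" of Behaviors in one task
NETWORK_RTT_S = 0.020     # assumed: 20 ms round-trip to a remote policy service
ENGINE_P99_S = 296e-6     # from the table: 100 policies, 50-step history


def per_task_overhead(steps: int, per_step_seconds: float) -> float:
    """Total governance overhead for one task, in seconds."""
    return steps * per_step_seconds


remote = per_task_overhead(STEPS_PER_TASK, NETWORK_RTT_S)
in_process = per_task_overhead(STEPS_PER_TASK, ENGINE_P99_S)

print(f"remote policy service: {remote * 1000:.1f} ms per task")      # 1000.0 ms
print(f"in-process engine:     {in_process * 1000:.1f} ms per task")  # 14.8 ms
```

Under these assumptions the remote design costs about a second per task, while the in-process engine stays under 15 ms even at its p99.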

**Visibility.** A gateway proxy only sees LLM calls. It misses tool invocations, resource reads, credential fetches, and decision gates between model calls — precisely where the governable behaviour lives. The in-process engine sees everything the agent does.

**Resilience.** If the platform API is down, the engine continues evaluating with its cached policies. Agents don't stop working because a governance service is unreachable. For additional hardening:

* **Disk cache** (`KV_POLICY_CACHE_PATH`): policies are written to disk after each successful fetch and loaded on cold start if the API is unreachable — surviving process restarts during outages.
* **Fail-closed mode** (`KV_POLICY_FAIL_MODE=closed`): for high-risk agents, blocks all behaviors when no policies are available rather than failing open.
* **HMAC signing** (`KV_POLICY_HMAC_SECRET`): verifies policy integrity, so that tampering by a compromised proxy or a man-in-the-middle on the internal network is detected before the policies are used.

See the [Configuration Reference](/deployment/configuration.md) for details on these settings.
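To make the hardening options concrete, here is a minimal sketch of integrity checking and fail-mode behaviour using only the Python standard library. It is not the SDK's implementation: the function names are hypothetical, and HMAC-SHA256 over the raw payload with a hex digest is an assumed scheme, since this page does not specify the signature format.

```python
import hashlib
import hmac
import json


def sign_policies(payload: bytes, secret: bytes) -> str:
    """Hex HMAC-SHA256 of the policy payload (assumed scheme, not confirmed)."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_policies(payload: bytes, signature: str, secret: bytes) -> bool:
    """Constant-time comparison, so the check does not leak timing information."""
    return hmac.compare_digest(sign_policies(payload, secret), signature)


def decide_without_policies(fail_mode: str) -> str:
    """Decision when no policies could be loaded from the API or the disk cache."""
    return "block" if fail_mode == "closed" else "allow"


payload = json.dumps({"policies": [{"id": "example"}]}, sort_keys=True).encode()
signature = sign_policies(payload, b"demo-secret")

print(verify_policies(payload, signature, b"demo-secret"))         # True
print(verify_policies(payload + b" ", signature, b"demo-secret"))  # False
print(decide_without_policies("closed"))                           # block
```

A payload that fails verification would presumably be treated like an unavailable policy set, at which point the fail mode decides the outcome; that interaction is an inference from the settings above, not documented behaviour.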

For the full argument, see [The Hot Path Tax](https://kyvvu.com/blog/2026/04/29/hot-path-tax/).

## Policies on paths

The engine evaluates policies against the **full ordered history** of the current task, not just the current step. This is the "policies on paths" model, formalised in the paper [Runtime Governance for AI Agents: Policies on Paths](https://arxiv.org/abs/2603.16586).

Example: the same `step.exec` call is allowed or blocked depending on what happened earlier. If a `step.gate` (human approval) precedes it in the task history, it passes. If not, it's blocked. The decision is path-dependent.

This is what makes Kyvvu different from point-in-time content filters or LLM guardrails. A content filter checks one input at a time. Kyvvu checks the step in context — what the agent has already done, what resources it accessed, what approvals it received.
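The approval example above can be sketched as a toy rule in plain Python. This is not the Kyvvu policy format or engine API: `evaluate` is a hypothetical stand-in, and only the behaviour names `step.exec` and `step.gate` come from this page.

```python
def evaluate(history: list[str], behavior: str) -> str:
    """Toy path-dependent rule: step.exec needs an earlier step.gate in the task."""
    if behavior == "step.exec" and "step.gate" not in history:
        return "block"
    return "allow"


# Same call, different histories, different decisions.
print(evaluate([], "step.exec"))             # block
print(evaluate(["step.gate"], "step.exec"))  # allow
```

A point-in-time filter only ever sees the second argument. The decision here depends on the first.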

## Data flow

```
Agent code
  |
  |  @kv.step / LangChain handler
  v
SDK (template matching)
  |
  |  Behavior object
  v
Engine (PolicyEngine.evaluate)
  |
  |  Reads: cached policies + task history
  |  Returns: allow / warn / block
  v
Agent code (continues or catches KyvvuBlockedError)
  |
  |  On task completion: end_task()
  v
Engine (KyvvuRunner)
  |  Async: flush behavioral trace to log endpoint
  |  Async: fire incident webhook on violations
  v
Platform API
  |  Stores: logs, incidents, events
  v
Dashboard
```

### What happens synchronously (hot path)

* Template matching (SDK maps framework event to Behavior)
* Policy evaluation (engine runs rules against history)
* Step recording (engine appends to in-memory history)

All sub-millisecond, all in-process, all zero I/O.

### What happens asynchronously (background)

* Policy fetch (TTL-based, default every 300 seconds)
* Log flush (on `end_task()`, POST to log endpoint)
* Incident webhook (on `warn`/`block`, POST to incident endpoint)

Network failures in background operations are logged and swallowed. The engine never blocks on I/O during evaluation.
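The split between the hot path and background work can be illustrated with a small standard-library sketch. `ToyRunner` and its `send` callable are hypothetical stand-ins, not the real `KyvvuRunner`; the point is only that evaluation touches memory alone, while the flush runs on a worker thread that logs and swallows failures.

```python
import logging
import queue
import threading

logger = logging.getLogger("toy_runner")


class ToyRunner:
    """Hypothetical stand-in: synchronous steps, asynchronous trace flush."""

    def __init__(self, send) -> None:
        self._send = send              # hypothetical callable doing the network POST
        self._outbox = queue.Queue()   # traces waiting to be flushed
        self._history = []             # in-memory history of the current task
        threading.Thread(target=self._drain, daemon=True).start()

    def step(self, behavior: str) -> str:
        """Hot path: record the step in memory and decide. No I/O happens here."""
        self._history.append(behavior)
        return "allow"

    def end_task(self) -> None:
        """Hand the trace to the background worker and return immediately."""
        self._outbox.put(list(self._history))
        self._history.clear()

    def _drain(self) -> None:
        while True:
            trace = self._outbox.get()
            try:
                self._send(trace)
            except Exception:
                # Logged and swallowed: a failed flush never reaches the agent.
                logger.warning("trace flush failed", exc_info=True)
            finally:
                self._outbox.task_done()

    def wait_idle(self) -> None:
        """Block until queued traces are processed (for demos and tests only)."""
        self._outbox.join()
```

Whether a failed flush is retried or dropped is not stated on this page; the sketch simply drops it.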

## One engine per agent

Each `KyvvuRunner` (and each `Kyvvu` SDK instance) owns one `PolicyEngine`. Engines are per-agent and are not shared across agents. This is by design — policies are scoped to specific agents, and task histories must not leak between agents.

For LangGraph, the `KyvvuLangChainHandler` treats the entire graph execution as a single task — all nodes (LLM calls, tool calls) are flattened into steps within that task. Each graph invocation is one task, not one task per node.

For truly separate agents (e.g., independent microservices or processes), each agent gets its own `Kyvvu` instance. Cross-agent coordination happens through pre-fetched aggregate counts in `EvalContext`, not through shared engine state.
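As a rough illustration of this isolation model, the sketch below gives each agent its own engine and passes shared information in as read-only counts. Every name is hypothetical, including the `aggregate_counts` field and the limit parameter; the real `EvalContext` shape is not described on this page. The sketch also assumes a step is recorded before it is evaluated.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyEvalContext:
    """Stand-in for an evaluation context. The field name is hypothetical."""
    aggregate_counts: dict[str, int] = field(default_factory=dict)


class ToyEngine:
    """Stand-in for a per-agent engine that owns one private history."""

    def __init__(self, fleet_exec_limit: int = 100) -> None:
        self.history: list[str] = []
        self.fleet_exec_limit = fleet_exec_limit  # hypothetical policy parameter

    def evaluate(self, behavior: str, ctx: ToyEvalContext) -> str:
        self.history.append(behavior)  # assumption: record, then decide
        fleet_execs = ctx.aggregate_counts.get("step.exec", 0)
        if behavior == "step.exec" and fleet_execs >= self.fleet_exec_limit:
            return "block"
        return "allow"


billing_agent = ToyEngine()
support_agent = ToyEngine()
ctx = ToyEvalContext(aggregate_counts={"step.exec": 100})

print(billing_agent.evaluate("step.gate", ctx))  # allow
print(support_agent.evaluate("step.exec", ctx))  # block
print(billing_agent.history)                     # ['step.gate']
print(support_agent.history)                     # ['step.exec']
```

The two engines never see each other's history. The only cross-agent signal is the count supplied by the caller.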

***

## Next steps

* [Atomic Behaviours](/core-concepts/behaviours.md) — the 12 behaviour types the engine operates on
* [Policies and Rules](/core-concepts/policies.md) — how policies are structured and evaluated
* [Tasks and History](/core-concepts/tasks.md) — task lifecycle and path-dependent evaluation


