Runtime policy enforcement and audit infrastructure for AI agents.
Welcome to the Kyvvu platform documentation. Kyvvu provides the compliance plumbing that sits between your AI agents and your organization's policies: logging their execution, evaluating their behavior against configurable rules, generating incidents when things go sideways, and producing audit-ready reports, all without rewriting your agents from scratch.
Whether your policies come from internal governance requirements, industry standards, or the EU AI Act — Kyvvu enforces them at runtime, not after the fact.
What Kyvvu Does
Organizations deploying AI agents face a shared set of operational problems: documenting what agents exist and what they do, verifying that agents behave according to defined rules, tracking violations when they occur, and being able to demonstrate all of this to auditors, regulators, or management. Kyvvu operationalizes these requirements at runtime.
Concretely, Kyvvu does four things:
1. Agent Registration
Every AI agent registers itself with the platform, declaring its name, purpose, owner, risk classification, and deployment environment. Registration triggers policy evaluation — an agent that registers without a declared owner_id, for example, can immediately generate a compliance incident. Think of registration as the agent announcing "I exist, here's what I am" — and the platform deciding whether that's good enough.
2. Execution Logging
As your agent runs — making LLM calls, invoking tools, checking for PII, requesting human approval — each step is logged to an immutable, hash-chained audit trail. You can prove not just what your agent decided, but when, and in what order. Tampering is detectable.
3. Policy Evaluation
Policies are configurable rules evaluated at agent registration and/or during execution. A policy might say: "all high-risk agents must declare an owner", or "every LLM call must be preceded by a PII check". When a violation is detected, an incident is created automatically. Policies can be scoped by risk classification, by specific agent, or by node type.
4. Incident Management & Reporting
Incidents are policy violations with a full lifecycle: open → active → resolved (or ignored, if you're feeling optimistic about the risk). Incidents can trigger automated actions — Slack alerts, webhook calls, email to your security team. And at any point, you can generate a PDF or XML audit report covering any time window.
The API is the authoritative backend. It stores everything, evaluates all policies, and exposes a REST interface consumed by both agents and the dashboard. Authentication is split by role: agents authenticate with API keys, human users authenticate with JWT tokens obtained via email/password login.
The SDK is the thin Python layer you add to a Python-based agent. Three lines of setup, one decorator per step. It handles task management, step numbering, hash chain construction, and retry logic. The SDK also provides a LangChain callback handler for LangChain-based agents. Agents built in other languages or frameworks can integrate directly with the two agent-facing API endpoints.
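The exact SDK surface is not shown on this page, but the "one decorator per step" pattern described above can be illustrated with a self-contained sketch. The decorator name and the in-memory log are purely illustrative; the real SDK would send each step to the API instead:

```python
import functools

# Illustrative sketch only: a real SDK decorator would POST each step to
# /api/v1/logs with hash chaining and retries. Here we append to an
# in-memory list so the step-per-decorator shape is visible.
STEP_LOG = []

def track_step(node_type):
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            # Record a numbered, typed step before running the wrapped operation.
            STEP_LOG.append({"step": len(STEP_LOG) + 1, "node_type": node_type})
            return fn(*args, **kwargs)
        return inner
    return wrap

@track_step("LLM_CALL")
def classify_ticket(text):
    return "billing"

@track_step("TOOL_CALL")
def fetch_account(ticket_id):
    return {"id": ticket_id}

classify_ticket("refund please")
fetch_account(42)
# STEP_LOG now holds two ordered, typed step records
```

The point of the decorator approach is that step numbering and typing stay out of your business logic: instrumenting an existing function is one added line.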
The Dashboard is the operational interface for your compliance team. It is a single-page application that talks to the API via JWT-authenticated requests. No build step required — just serve the static files.
Key Concepts
Agents
An agent is any AI system registered with the platform. Agents self-register by calling POST /api/v1/agents with an agent_key (a stable identifier you choose) and a set of descriptive fields. The platform derives a unique agent_id from the hash of your agent_key and your API key — so re-registering the same agent updates it rather than creating a duplicate.
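The idempotency property described above can be sketched as follows. The exact hash recipe (algorithm, separator, truncation) is an assumption; what matters is that the same agent_key and API key always yield the same agent_id, so re-registration is an update rather than a duplicate:

```python
import hashlib

# Hedged sketch: the platform's actual derivation recipe is not documented
# here; SHA-256 over "agent_key:api_key" is an assumption for illustration.
def derive_agent_id(agent_key: str, api_key: str) -> str:
    return hashlib.sha256(f"{agent_key}:{api_key}".encode()).hexdigest()[:32]

a = derive_agent_id("support-triage", "sk-123")
b = derive_agent_id("support-triage", "sk-123")
assert a == b  # stable identity: re-registering updates, never duplicates
```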
Agents carry metadata relevant to governance and compliance: purpose, owner_id, maintainer_id, risk_classification, and environment. Missing or non-compliant values here are the most common source of first-run incidents.
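A registration call might look like the following. The payload field names and the /api/v1/agents path come from this page; the production base URL is taken from the Quick Links below, and the Authorization header format is an assumption:

```python
import json
import urllib.request

# Field names mirror the governance metadata described above.
payload = {
    "agent_key": "support-triage",
    "purpose": "Triage inbound support tickets",
    "owner_id": "team-support",
    "maintainer_id": "alice@example.com",
    "risk_classification": "high",
    "environment": "production",
}
req = urllib.request.Request(
    "https://dashboard.kyvvu.com/api/v1/agents",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <agent-api-key>",  # header format assumed
    },
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to actually send the registration
```

If owner_id were omitted here, an agent_registration policy requiring it would fire and open an incident the moment this request was processed.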
Tasks and Steps
A task is a single end-to-end execution of your agent — processing one support ticket, answering one question, completing one workflow run. Each task is identified by a task_id.
A step is an individual operation within a task, recorded by calling POST /api/v1/logs. Steps are numbered sequentially and typed with a node_type:
Node Type        Meaning
START_NODE       Beginning of a task
END_NODE         Completion of a task
LLM_CALL         A call to a language model
TOOL_CALL        An external tool or API invocation
MEMORY_CALL      Memory read or write
HUMAN_APPROVAL   A human-in-the-loop checkpoint
PII_CHECK        A check for personally identifiable information
DECISION         A conditional branch or routing decision
ERROR            An error-handling step
The sequence of steps forms an immutable, hash-chained log. Each step's hash incorporates the previous step's hash, making the chain tamper-evident and independently verifiable.
When using the Python SDK, task_id generation, step numbering, and hash chaining are handled automatically. When calling the API directly, these are the caller's responsibility.
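For direct API callers, the chaining responsibility can be sketched as below. The exact hash recipe (field order, encoding, genesis value) is an assumption; only the chaining idea, each step's hash incorporating the previous step's hash, comes from this page:

```python
import hashlib
import json

# Hedged sketch of manual hash chaining when calling POST /api/v1/logs
# directly. The real platform's recipe may differ.
def step_hash(prev_hash: str, step: dict) -> str:
    material = prev_hash + json.dumps(step, sort_keys=True)
    return hashlib.sha256(material.encode()).hexdigest()

prev = "0" * 64  # genesis value for the first step (assumed convention)
chain = []
for number, node_type in enumerate(["START_NODE", "LLM_CALL", "END_NODE"], start=1):
    step = {"task_id": "task-001", "step_number": number, "node_type": node_type}
    prev = step_hash(prev, step)
    chain.append({**step, "hash": prev})
    # each record here would be sent as the body of one POST /api/v1/logs call
```

Verification is the same loop in reverse: recompute each hash and compare. Editing any step changes its hash and every hash after it, which is what makes the chain tamper-evident.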
Policies
A policy is an instantiation of a rule applied to specific conditions. The rule defines what to check (e.g., "this field must not be empty"). The policy defines when and for whom (e.g., "for high-risk agents, at registration time").
Policies have three scopes:
agent_registration — evaluated when an agent registers or updates itself
step_execution — evaluated on each logged step during task execution
task_execution — evaluated at the END_NODE of a task (useful for "was X done at some point during this task?" checks)
Policies can be narrowed by risk_classification (e.g., apply only to high-risk agents) and by agent_id (apply only to a specific agent). Policies with neither filter apply universally. This lets you express a tiered policy structure — stricter rules for higher-risk systems — without maintaining separate policy sets for each.
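The scoping rules above reduce to a small matching check. The field names mirror this page; the matching logic itself is an illustration, not the platform's actual implementation:

```python
# Hedged sketch: a policy with neither filter applies universally; each
# filter, when present, must match the agent for the policy to apply.
def policy_applies(policy: dict, agent: dict) -> bool:
    if policy.get("risk_classification") and \
            policy["risk_classification"] != agent["risk_classification"]:
        return False
    if policy.get("agent_id") and policy["agent_id"] != agent["agent_id"]:
        return False
    return True

universal = {}                                    # no filters: applies to every agent
high_risk_only = {"risk_classification": "high"}  # tiered rule for riskier systems
agent = {"agent_id": "a1", "risk_classification": "low"}

policy_applies(universal, agent)       # True
policy_applies(high_risk_only, agent)  # False
```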
Pre-built policy templates let you apply a full governance framework in one operation.
Incidents
An incident is a policy violation. It is created automatically when a policy check fails. Incidents are permanent audit records — they cannot be deleted, only resolved or ignored. Each incident tracks which agent violated which policy, in which environment, at which step, and when.
Incident status follows a lifecycle: open → active → resolved (or ignored).
For agent_registration policies, re-registering a now-compliant agent will auto-advance the incident to resolved. For step_execution and task_execution incidents, manual review is required — the execution already happened and the record stands.
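The permanence rule, incidents can be resolved or ignored but never deleted or reopened into non-existence, can be expressed as a small transition table. The table below is inferred from this page for illustration; it is not the platform's actual state machine:

```python
# Hedged sketch: allowed incident status transitions. Terminal states have
# no outgoing transitions, reflecting that incidents are permanent records.
ALLOWED = {
    "open":   {"active", "resolved", "ignored"},
    "active": {"resolved", "ignored"},
}

def advance(status: str, new_status: str) -> str:
    if new_status not in ALLOWED.get(status, set()):
        raise ValueError(f"cannot move incident from {status} to {new_status}")
    return new_status

advance("open", "active")      # fine
# advance("resolved", "open")  # raises: the audit record cannot be unwound
```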
Environments
Agents and their incidents are scoped to an environment: development, qa, staging, production, or copilot. Compliance status is tracked independently per environment. Fixing a violation in development does not resolve the corresponding incident in production. This is by design — your staging environment is not your production environment, and your audit trail should reflect that.
Installing the API, SDK, and Dashboard
Quick Links
Live demo: https://demo.kyvvu.com
API (production): https://dashboard.kyvvu.com
Interactive API docs: https://dashboard.kyvvu.com/api/docs
GitHub: https://github.com/Kyvvu/platform
A Note on Philosophy
Kyvvu is infrastructure, not a checkbox. Policy tooling that only runs at reporting time is a liability — it tells you what went wrong after it already happened. Kyvvu evaluates policies at the moment an agent acts, creates incidents in real time, and maintains an audit trail that cannot be retroactively altered.
That said, compliance tooling should not get in the way of building good AI systems. The SDK requires minimal changes to existing agent code. The API is straightforward REST. The dashboard surfaces the information your governance team actually needs, without making engineers wade through it.
The goal is a system where policy enforcement is something your agents do automatically, not something you reconstruct from scattered logs after the auditor calls.