Changelog

Version history for the Governance SDK. Each release lists every shipped change grouped by category.

v0.6.0

April 16, 2026

Ed25519 + HMAC rotation + Article 12 evidence export + fresh-install fix. Major release implementing F2–F9 of the v0.6 PRD plus a polish pass (Track A finish, LLM-theater tripwire closure, compliance export button, fresh-install wiring). Upgrade path for v0.5.x deployments: install the [migrations] extra and run governance migrate — it now applies the base DDL and runs alembic upgrade head in one shot. See the full release notes on PyPI.

Added

  • Ed25519 agent identity with file, env, and ephemeral keystores. Per-row Ed25519 signature over the HMAC audit chain. Graceful degradation to signature_status='unsigned_local_failure' when a signer cannot load its private key — the host call path never raises
  • HMAC chain key rotation with dual-signed rotation marker rows, salted fingerprint construction, bounded LRU resolution cache, and a governance rotate-chain-key CLI command
  • Article 12 evidence export: POST /api/compliance/export packages the compliance report and chain verification into a single HMAC-signed JSON bundle (bundle_hash + bundle_signature) you can hand to a regulator. "Export Article 12 evidence" button in the v4 console with Windows-safe filename and screen-reader-announced download lifecycle
  • Wrapper coverage registry — opt-in registry of every wrapped LLM client at import time, surfaced via GET /api/coverage and the new /health/governance endpoint
  • GOVERNANCE_COMPLIANCE_RATE_LIMIT env var — overrides the default 1 req / 60s per-user ceiling on compliance endpoints
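
The evidence bundle above carries a bundle_hash and a bundle_signature. A minimal sketch of how such a bundle could be sealed and checked, assuming SHA-256 over sorted-key compact JSON and HMAC-SHA256 over the resulting hash. These notes do not state the SDK's actual canonicalization or key handling, and sign_bundle / verify_bundle are hypothetical names, not SDK functions.

```python
import hashlib
import hmac
import json


def sign_bundle(payload: dict, key: bytes) -> dict:
    """Seal a payload with a content hash and an HMAC over that hash (illustrative)."""
    # Assumption: the canonical form is sorted-key, compact JSON.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    bundle_hash = hashlib.sha256(canonical).hexdigest()
    bundle_signature = hmac.new(key, bundle_hash.encode(), hashlib.sha256).hexdigest()
    return {
        "payload": payload,
        "bundle_hash": bundle_hash,
        "bundle_signature": bundle_signature,
    }


def verify_bundle(bundle: dict, key: bytes) -> bool:
    """Recompute hash and signature from the payload and compare in constant time."""
    expected = sign_bundle(bundle["payload"], key)
    return hmac.compare_digest(
        expected["bundle_hash"], bundle["bundle_hash"]
    ) and hmac.compare_digest(expected["bundle_signature"], bundle["bundle_signature"])
```

A recipient holding the key can call verify_bundle to confirm that neither the payload nor the hash was altered after export.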

Changed

  • governance migrate now applies the base DDL and runs alembic upgrade head automatically — one command takes a new database all the way to HEAD. Closes a latent P0 where fresh v0.6 installs that ran DDL-only silently dropped audit rows with StoreUnavailableError against the missing signature_status column
  • kill → halt rename across SDK, console, and audit events. Backward-compat aliases preserved in v0.6, removed in v0.7. Historic agent.killed rows stay in the chain as-is; a SQL view governance_audit_events_halted unions both kinds
  • Append-only grants on governance_audit_events enforced at the role level (previously only at the row-trigger level) — closes a latent invariant-2 gap

Upgrade notes

  • pip install "code-atelier-governance[migrations]>=0.6.0"
  • governance migrate --database-url $DATABASE_URL (runs DDL + alembic; idempotent)
  • Three new columns on governance_audit_events: signature, signing_key_fingerprint, signature_status. Strict-schema SIEM / BI pipelines must widen their schemas to accept them
  • Queries filtering on kind='agent.killed' will silently stop matching new halt events. Either query the governance_audit_events_halted view or filter IN ('agent.killed', 'agent.halted')
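
The last upgrade note is easy to miss in dashboards and alerts. A small sketch of the difference between the legacy filter and the two-kind filter, using an in-memory SQLite table as a stand-in for the Postgres audit table. The table is reduced to two columns for illustration, and the third event kind is invented.

```python
import sqlite3

# Both kinds must be matched after the kill -> halt rename.
HALT_KINDS = ("agent.killed", "agent.halted")


def count_halt_events(conn: sqlite3.Connection) -> int:
    """Count halt events of either kind using a parameterized IN clause."""
    placeholders = ", ".join("?" for _ in HALT_KINDS)
    query = f"SELECT COUNT(*) FROM governance_audit_events WHERE kind IN ({placeholders})"
    return conn.execute(query, HALT_KINDS).fetchone()[0]


# Stand-in table: one historic row, one post-rename row, one unrelated row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE governance_audit_events (seq INTEGER PRIMARY KEY, kind TEXT)")
conn.executemany(
    "INSERT INTO governance_audit_events (kind) VALUES (?)",
    [("agent.killed",), ("agent.halted",), ("example.other",)],
)

legacy_only = conn.execute(
    "SELECT COUNT(*) FROM governance_audit_events WHERE kind = 'agent.killed'"
).fetchone()[0]
both_kinds = count_halt_events(conn)
```

Here the legacy filter sees only the historic row, while the two-kind filter sees both halt events.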

v0.5.4

April 14, 2026

Kill switch enforcement hotfix. Closes a v0.5.3 production bug where the console halt button updated presence metadata and emitted an audit event, but the SDK enforcement path never read the kill marker. Agents kept running after being "killed." v0.5.4 wires scope.check() through a new presence kill check that fail-closes on operator-killed agents.

Kill switch enforcement (closes v0.5.3 production bug)

  • PresenceModule.is_killed(agent_id) and PresenceModule.assert_alive(agent_id) — read from a 5-second TTL in-memory cache backed by the operator-written kill marker on the presence row
  • AgentKilledError — new exception raised by assert_alive() with full kill metadata (operator, timestamp, reason). Inherits from RuntimeError, not from GovernanceError — a bare except ScopeViolation will NOT catch a kill. Catch explicitly with except AgentKilledError if you need to handle the kill in application code
  • ScopeModule.set_presence_module(presence) — wired automatically by GovernanceSDK.__init__ when both enable_scope=True and enable_presence=True
  • scope.check() calls presence.assert_alive(agent_id) first — fires before PolicyNotRegistered and before the missing-tool-or-api ValueError. Defense-in-depth ordering verified by test
  • PresenceModule.force_refresh_killed_cache() — bypass the 5-second TTL on demand. Used by tests and by future push-based invalidation in v0.6
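
The exception hierarchy above has a practical consequence for application code. The sketch below uses stand-in classes defined locally rather than SDK imports. That ScopeViolation derives from GovernanceError is an assumption, and check() and run_tool() are hypothetical helpers.

```python
class GovernanceError(Exception):
    """Stand-in for the SDK's base error."""


class ScopeViolation(GovernanceError):
    """Stand-in for a scope denial (assumed to derive from GovernanceError)."""


class AgentKilledError(RuntimeError):
    """Stand-in: derives from RuntimeError, as the release note states."""


def check(agent_killed: bool) -> None:
    """Mimic the ordering described above: the kill check fires before scope logic."""
    if agent_killed:
        raise AgentKilledError("agent halted by operator")
    raise ScopeViolation("tool not allowed")


def run_tool(agent_killed: bool) -> str:
    try:
        check(agent_killed)
    except ScopeViolation:
        return "scope_denied"
    except AgentKilledError:
        # Without this explicit handler the kill would propagate to the caller.
        return "halted"
    return "ok"
```

Handlers written for ScopeViolation or GovernanceError alone leave a kill unhandled, which is the fail-closed behavior the note describes.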

Reliability — Invariant #1 preserved under DB outage

  • When the governance database is unreachable, the kill cache holds the last known state and a structlog warning is logged. Already-killed agents stay killed, live agents stay live, and the host application keeps running. The refresh path never raises out to the caller
  • A failed refresh bumps the cache timestamp so the SDK does not hammer the database during a sustained outage
  • Detection uses the operator-written kill marker on the presence row, NOT the status column. The status column is overloaded by stale-heartbeat detection, so gating on it would also block agents that simply went idle. The metadata marker is the unambiguous kill signal
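
The outage behavior above can be sketched as a TTL cache that swallows refresh errors and still advances its timestamp. This is an illustration of the pattern only: KillCache, fetch_killed, and the injectable clock are hypothetical, and the logging and locking the notes mention are omitted.

```python
import time
from typing import Callable, Set


class KillCache:
    """TTL cache that keeps the last known state when a refresh fails (illustrative)."""

    def __init__(
        self,
        fetch_killed: Callable[[], Set[str]],
        ttl: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_killed
        self._ttl = ttl
        self._clock = clock
        self._killed: Set[str] = set()
        self._refreshed_at = float("-inf")  # forces a refresh on first use

    def is_killed(self, agent_id: str) -> bool:
        now = self._clock()
        if now - self._refreshed_at >= self._ttl:
            try:
                self._killed = self._fetch()
            except Exception:
                # Store unreachable: keep the last known state, never raise to the caller.
                pass
            # Bump the timestamp on success and on failure, so a sustained
            # outage triggers at most one fetch attempt per TTL window.
            self._refreshed_at = now
        return agent_id in self._killed
```

Passing a fake clock, as the constructor allows, makes the TTL and outage paths testable without sleeping.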

Tests

  • 441 → 469 backend tests. New: tests/presence/test_kill_switch.py (28 tests across 7 categories: AgentKilledError construction, happy path, cache TTL behavior including 50-concurrent double-checked locking, Invariant #1 DB outage resilience, in-memory mode, edge cases (unicode IDs, 500-agent scaling, partial metadata, un-kill, idempotent re-kill), and ScopeModule integration with two ordering tests)
  • Wheel build → clean-venv install → smoke test was run against real Postgres before publish (editable installs hide packaging bugs). The full scripts/live_test.py integration harness remains the source of truth for release gating

Scope deferred to v0.6

  • Only scope.check() is patched in v0.5.4. cost.check_budget(), gates.check(), and the LLM wrappers (wrap_anthropic, wrap_openai) still need kill-check wiring — that lands in v0.6 as part of the broader enforcement-path expansion
  • Cache invalidation is TTL-only — worst-case 5-second delay between an operator clicking halt and SDK enforcement. Sub-second invalidation via Postgres LISTEN/NOTIFY is on the v0.6 roadmap
  • No restore endpoint yet. To restore a killed agent, clear the kill marker from the presence row directly. v0.6 ships an admin restore action

Migration

  • Drop-in upgrade from v0.5.3. No schema migration required: kill detection reads from an existing JSONB metadata field that the v0.5.3 console already writes
  • pip install --upgrade code-atelier-governance==0.5.4

v0.5.3

April 13, 2026

Enforcement integrity patch: scope check wired into wrappers, projected budget gate, streaming token tracking, startup wrapper-coverage warning, on-demand HMAC chain verification API, Article 12 report coverage caveat, and a written threat model. Every enforcement surface now gates actions at the wrapper layer.

Enforcement wiring

  • wrap_openai and wrap_anthropic now call sdk.scope.check() with a chat.completions.create / messages.create sentinel before the LLM call — scope enforcement is active at the wrapper layer, not only at the tool-execution layer
  • Projected budget gate — cost.check_or_raise() now projects forward using the call's declared max_tokens and denies the call before the stream opens if the projected total would exceed the cap
  • Streaming token tracking — streaming calls record actual usage from the final usage object; fall back to the projected max_tokens ceiling if the API does not return a usage payload
  • warn_on_no_wrappers — structlog warning at sdk.start() when no wrap_openai or wrap_anthropic has been registered. The only startup signal that enforcement is not covering your LLM calls. Default True
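
The projected budget gate and the streaming fallback above reduce to two small decisions. The sketch below illustrates that arithmetic under simplifying assumptions (a single per-token output price, spend tracked as a float). BudgetExceeded, check_projected, and tokens_to_record are hypothetical names and do not reflect the signature of cost.check_or_raise().

```python
from typing import Optional


class BudgetExceeded(Exception):
    """Stand-in for the SDK's budget denial; the real exception name is not given above."""


def check_projected(
    spent_usd: float, cap_usd: float, max_tokens: int, usd_per_output_token: float
) -> float:
    """Deny before the call if the worst-case spend would cross the cap."""
    projected = spent_usd + max_tokens * usd_per_output_token
    if projected > cap_usd:
        raise BudgetExceeded(
            f"projected {projected:.4f} USD exceeds cap {cap_usd:.4f} USD"
        )
    return projected


def tokens_to_record(reported_usage: Optional[int], max_tokens: int) -> int:
    """After a stream ends, prefer reported usage; fall back to the projected ceiling."""
    return max_tokens if reported_usage is None else reported_usage
```

Falling back to the ceiling over-counts rather than under-counts, which keeps the gate conservative when the API returns no usage payload.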

Audit — on-demand chain verification

  • await sdk.audit.verify_chain(session_id=None, from_seq=None, to_seq=None) — public API for on-demand HMAC chain verification. Supports partial-range verification. Returns True on a clean chain; raises ChainIntegrityError with the bad sequence number on tamper
  • verify_chain_on_read SDK config option — when True, verifies the chain on every get_events() call. Default False (O(n) in returned events). Use for compliance reporting or post-incident review
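
For readers unfamiliar with HMAC chains, the following self-contained sketch shows the general technique behind a range-limited verification that reports the first bad sequence number. It is not the SDK's implementation. The row layout, the MAC input, and the local ChainIntegrityError class are assumptions made for illustration.

```python
import hashlib
import hmac
import json
from typing import List, Optional


class ChainIntegrityError(Exception):
    """Local stand-in that carries the first bad sequence number."""

    def __init__(self, seq: int) -> None:
        super().__init__(f"chain broken at seq {seq}")
        self.seq = seq


def _mac(key: bytes, prev_hash: str, seq: int, payload: dict) -> str:
    body = json.dumps({"prev": prev_hash, "seq": seq, "payload": payload}, sort_keys=True)
    return hmac.new(key, body.encode(), hashlib.sha256).hexdigest()


def append(chain: List[dict], key: bytes, payload: dict) -> None:
    """Append a row whose MAC covers the previous row's hash."""
    prev_hash = chain[-1]["hash"] if chain else ""
    seq = len(chain)
    chain.append({"seq": seq, "prev_hash": prev_hash, "payload": payload,
                  "hash": _mac(key, prev_hash, seq, payload)})


def verify_chain(chain: List[dict], key: bytes,
                 from_seq: Optional[int] = None, to_seq: Optional[int] = None) -> bool:
    """Verify rows in [from_seq, to_seq]; raise on the first mismatch."""
    lo = 0 if from_seq is None else from_seq
    hi = len(chain) - 1 if to_seq is None else to_seq
    for i in range(lo, hi + 1):
        row = chain[i]
        linked = i == 0 or row["prev_hash"] == chain[i - 1]["hash"]
        expected = _mac(key, row["prev_hash"], row["seq"], row["payload"])
        if not linked or not hmac.compare_digest(expected, row["hash"]):
            raise ChainIntegrityError(row["seq"])
    return True
```

Verification cost grows linearly with the number of rows checked, which is consistent with the O(n) caveat on verify_chain_on_read.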

Compliance — Article 12 report integrity

  • ReportGenerator(audit_module=sdk.audit) — new constructor parameter for wiring the audit module to enable chain verification inside compliance reports
  • generate_article12(verify_chain=True) and generate_summary(verify_chain=True) — new keyword-only parameter runs HMAC chain verification and sets chain_integrity_status to "verified" or "failed" in the report. Raises ValueError immediately if verify_chain=True is requested without an audit_module — no silent degradation
  • ComplianceReport.coverage_caveat — new field that always prints the scoping language: "This report covers only actions routed through the SDK wrappers. Actions made via direct LLM client calls are not included."
  • ComplianceReport.coverage_pct — Pydantic-validated [0.0, 1.0] float or None. Reserved for v0.6 wrapper registry; always None in v0.5.x. The field is always present — its absence would imply 100% coverage

Config — opt-in module flags (CLAUDE.md invariant restored)

  • enable_loop (default True) — when False, sdk.loop is not constructed; accessing it raises AttributeError. Emits a structlog sdk.loop_disabled warning at init
  • enable_presence (default True) — when False, sdk.presence is not constructed. Emits a structlog sdk.presence_disabled warning at init
  • default_max_tokens — default ceiling used by the projected budget gate when the caller does not declare max_tokens on the API call. Suppresses the max_tokens_not_declared warning. Must be >= 1 (enforced by __post_init__)
  • All module toggles now documented in the README "Configuration Reference" section with module toggles, audit options, wrapper options, and four deployment-pattern examples

Docs — threat model + scoping language


Fixes

  • GovernanceSDK.close() now guards self.loop.close() and self.presence.close() with hasattr, so async with GovernanceSDK(enable_loop=False) no longer raises AttributeError on teardown
  • _run_chain_verification return type narrowed to ChainIntegrityStatus literal — mypy strict clean across 65 source files
  • scripts/live_test.py switched to AsyncOpenAI (the sync client crashed inside the running event loop)

Tests

  • 436 → 441 backend tests. New: tests/test_v053_edge_cases.py (11 edge cases), partial-range chain verification boundary test, strengthened streaming cost-tracked assertion, 3 tests for close() with disabled modules, 2 tests for generate_article12/generate_summary(verify_chain=True) without audit_module raising ValueError, full end-to-end wrap_anthropic + budget + audit flow
  • 26/26 live integration tests passing against real Postgres and real OpenAI

v0.5.2

April 12, 2026

Console gate workflow hardening: reviewer tracking, batch approve with per-item self-approval blocking, deny rationale in the HMAC chain, halt endpoint, and SSE-backed presence broadcast.

Console - Gate Workflow

  • POST /api/gates/{id}/deny records a rationale in audit metadata so denials are tamper-evident in the HMAC chain
  • POST /api/gates/{id}/claim and /escalate track reviewer assignment and hand-off
  • POST /api/gates/batch-approve with a 50-request hard cap and per-item self-approval enforcement (one blocked item does not abort the batch)
  • POST /api/agents/{id}/kill halt endpoint for operators
  • GET /api/events/stats aggregate counters for the console dashboard
  • SSE broadcast layer with session revocation on logout

Database

  • Migration a1b2c3d4e5f6_gates_reviewer_columns adds claimed_by, claimed_at, escalated_to, escalated_at to governance_gates_pending
  • New triggers.sql enforces append-only semantics on the audit log at the DB layer

Security

  • Deny rationale is stored in application/json responses only — the console never renders user-supplied text as HTML, locked by regression test
  • Batch approve has a belt-and-suspenders runtime length guard in addition to the Pydantic field constraint
  • Self-approval check called inside the batch loop — an operator cannot approve their own actions in bulk
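
The batch-approve rules above (a hard cap, plus a self-approval check applied per item so one blocked item does not abort the batch) can be sketched as follows. The dictionary keys and result strings are hypothetical and do not describe the endpoint's actual request or response schema.

```python
from typing import Dict, List

MAX_BATCH = 50  # hard cap stated in the release notes


def batch_approve(operator_id: str, gates: List[dict]) -> Dict[str, str]:
    """Approve each gate independently; block only the operator's own requests."""
    # Runtime length guard, in addition to any schema-level constraint.
    if len(gates) > MAX_BATCH:
        raise ValueError(f"batch of {len(gates)} exceeds the cap of {MAX_BATCH}")
    results: Dict[str, str] = {}
    for gate in gates:
        # The self-approval check runs inside the loop, once per item.
        if gate["session_user_id"] == operator_id:
            results[gate["id"]] = "blocked_self_approval"
            continue
        results[gate["id"]] = "approved"
    return results
```

Returning a per-item result map lets the caller see which items were blocked without failing the rest.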

Frontend (v4, opt-in via CONSOLE_UI_VERSION)

  • Parallel-route drill panel for the agents list
  • mapAgentStatus liveness heuristic with a 15-second idle threshold so dormant agents no longer render as pulsing green
  • Stream page pause indicator reflects paused state (disconnected > connecting > paused > connected precedence)
  • Two-pass error-message sanitizer strips Bearer, token, secret, authorization values before display
  • vitest harness with 34 frontend unit tests

Tests

  • 369 backend tests (up from 356), plus 34 vitest cases on the console frontend
  • New multi-agent live integration test (scripts/live_test_multi_agent.py) exercises 6 concurrent agents against real OpenAI, covering scope violation, budget exceeded, HITL contract, and loop detection
  • Posture endpoint status-literal contract locked textually — frontend and backend break loudly on drift

v0.5.1

April 12, 2026

Hotfix covering four findings from a DX audit: three activation-consistency bugs and one silent HITL failure on Postgres. No audit data was lost or corrupted.

Security

  • HITL gates silently broken on Postgres. ContractsModule._check_hitl_approved returned False unconditionally for PostgresGatesStore, causing HITL-gated actions to be blocked even after human approval (over-blocking, not bypass). Fix: new GatesStore.has_granted_approval() abstract method with strict agent_id + expiry filtering
  • ScopeModule.filter_tools silently returned the full tool list when no policy was registered, bypassing hidden_tools and contradicting the module's documented default-deny contract. Now raises PolicyNotRegistered; the LangChain handler catches it and fails closed

Breaking changes

  • enable_audit, enable_scope, enable_cost, enable_gates, enable_prompts flags are now honored. In v0.2–v0.5.0 these flags were accepted by GovernanceConfig but never read. Customers setting any of these flags expecting them to disable the corresponding module should review their deployment immediately
  • ScopeModule.filter_tools("unknown_agent", ...) now raises PolicyNotRegistered instead of returning the full tool list

New module - Routing (model selection policy)

  • Advisory sdk.routing.suggest() remaps the requested model based on remaining budget (cost_aware) or explicit rewrite rules
  • Off by default — both enable_routing=True AND a registered RoutingPolicy are required
  • Honors ScopePolicy.allowed_models as a hard constraint
  • Emits routing.policy_changed and routing.suggestion audit events
  • Wraps wrap_openai and wrap_anthropic transparently

Fixes

  • asyncio.run() no longer called from the sync registration path in scope, cost, and routing modules — removes a hidden sync-over-async deadlock risk (invariant #3)
  • Background policy-upsert tasks hold strong references via per-module _pending_upsert_tasks sets
  • ScopePolicy.allowed_models frozen set — a hard ceiling routing cannot exceed

Tests

  • 322 → 356 tests (+21 routing, +13 hotfix regression pins in tests/test_hotfix_v0_5_1.py)

v0.5.0

April 12, 2026

Self-approval prevention, chain fork detection, a sync facade for Flask/Django, and a shared-pool performance sweep that cut Postgres connection count from ~74 to ~15 per SDK.

Security

  • Self-approval prevention (fail-closed) — HITL gates compare the granting operator_id against the session's user_id; an agent cannot approve its own action. DDL adds an operator_id column to the gates table
  • Chain fork detection: audit.trace_session_chain raises ChainIntegrityError when two events share the same prev_hash, surfacing tamper attempts or concurrent-write corruption
  • New coverage: budget race at the cap boundary, SQL injection payloads on every user-controllable field, case-sensitivity scope bypass, production error leakage via sanitize_db_error, weak-secret entropy rejection, account enumeration parity
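
Fork detection, as described above, amounts to checking that no two events name the same predecessor. A minimal sketch of that check follows. The event dictionaries and the local ChainIntegrityError class are stand-ins, not the SDK's types.

```python
from typing import Dict, Iterable


class ChainIntegrityError(Exception):
    """Local stand-in for the SDK exception of the same name."""


def assert_no_fork(events: Iterable[dict]) -> None:
    """Raise if two events share the same prev_hash (a fork in the chain)."""
    seen: Dict[str, str] = {}  # prev_hash -> id of the first event that used it
    for event in events:
        prev = event["prev_hash"]
        if prev in seen:
            raise ChainIntegrityError(
                f"fork: events {seen[prev]} and {event['id']} share prev_hash {prev}"
            )
        seen[prev] = event["id"]
```

In a linear chain every prev_hash appears exactly once, so a repeat means either tampering or two writers appending to the same tail.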

New

  • GovernanceSDKSync — sync facade for Flask/Django and any non-async host. Runs an asyncio loop on a background thread and dispatches via run_coroutine_threadsafe. Matches the async SDK surface one-to-one
  • SSE endpoint GET /api/stream/events delivers live audit events to the console (session auth, keepalive frames)
  • Halt agent UI — renamed from "kill" because the SDK blocks gates, it does not terminate the host process
  • Multi-agent OpenAI integration test script exercising delegation workloads end to end
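
The sync facade pattern named above (an asyncio loop on a background thread, with calls dispatched through run_coroutine_threadsafe) can be shown with the standard library alone. SyncFacade and the check coroutine below are hypothetical illustrations of the pattern, not the GovernanceSDKSync class.

```python
import asyncio
import threading
from typing import Any, Coroutine, Optional


class SyncFacade:
    """Run coroutines from synchronous code via a loop on a background thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro: Coroutine[Any, Any, Any], timeout: Optional[float] = None) -> Any:
        # Schedule on the background loop and block the calling thread for the result.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self) -> None:
        # Stop the loop from its own thread, then wait for the thread to exit.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def check(agent_id: str) -> str:
    """Hypothetical async operation standing in for an SDK call."""
    await asyncio.sleep(0)
    return f"{agent_id}:allowed"
```

A Flask or Django view can then call facade.run(check("agent-1")) without owning an event loop itself.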

Performance

  • Shared engine pool — consolidated seven separate AsyncEngine instances into one shared pool per SDK. Dropped Postgres max connections per SDK from ~74 to ~15
  • Concurrent audit writes — pre-call audit log backgrounded, post-call audit + cost tracking run under asyncio.gather, saving 4–12 ms per LLM call on the critical path
  • Combined budget query — session + daily counter reads merged into one round-trip in PostgresCostStore, halving pre-call enforcement latency

Fixes

  • Streaming cost-tracking bypass now detected and logged (users must call sdk.cost.track() manually after consuming the stream)
  • Serverless cold-start policy preload in sdk.start() eliminates the 30-second gap where _policies was empty on first request (AWS Lambda)
  • JSONL audit fallback tolerates read-only filesystems and rotates at 50 MB
  • Session time budget uses Postgres-side elapsed computation to avoid mixed-clock skew
  • Sync wrapper coroutine-leak fix in the Anthropic/OpenAI integrations
  • Policy upsert SQL cast corrected (::jsonb → CAST AS jsonb) so scope and budget policies persist across restarts
  • Top-level __init__.py exports ScopePolicy, BudgetPolicy, AuditEvent
  • command_timeout=5 on the shared engine prevents pool exhaustion under slow-query storms

Tests

  • 258 → 322 tests (+64). New suites: streaming detection, JSONL fallback, cold start, sync wrapper, console endpoints, SSE, error handling, SQL injection, case-sensitivity bypass, chain fork detection, end-to-end enforcement. Test suite runs in ~5 s (was ~12 s)

v0.4.0

April 10, 2026

Behavioral contracts for pre/post conditions on tool calls, EU AI Act Article 12 compliance reports, and a native Anthropic SDK adapter with automatic cost tracking.

SDK - Behavioral Contracts (new module)

  • sdk.contracts module with declarative pre/post conditions on tool calls
  • PreConditions enforce that budget is available, scope is allowed, or HITL is approved before any tool fires
  • PostConditions verify audit was logged after tool execution
  • Context manager API: async with sdk.contracts.enforce(agent_id, session_id, tool):
  • 18 test cases

SDK - EU AI Act Article 12 Compliance Reports (new module)

  • Automated compliance evidence reports mapping audit data to the seven sections of Article 12 (binding August 2, 2026)
  • Three-state status model: compliant, partial, non_compliant
  • Privacy-preserving by default - hashes, not raw content
  • Immutable frozen reports for internal compliance review
  • CLI: governance compliance report
  • 10 test cases

Integrations - Anthropic SDK Adapter (new)

  • wrap_anthropic() patches message calls with automatic token tracking
  • USD cost estimation using Anthropic pricing
  • Audit event generation and budget enforcement
  • Async and sync support
  • Idempotent - safe to patch multiple times
  • Install via pip install "code-atelier-governance[anthropic]"
  • 11 test cases

v0.3.0

April 10, 2026

Loop/anomaly detection, agent presence tracking, policy hot-reload, per-model cost aggregation, console GUI overhaul with Code Atelier branding, and security hardening.

SDK — Loop Detection (new module)

  • sdk.loop module with LoopPolicy and LoopDetected
  • await sdk.loop.record_call(agent_id, session_id, tool_name) — records tool call and checks for loops
  • await sdk.loop.check(agent_id, session_id) — read-only loop status check
  • Sliding-window detection on (session_id, tool_name) pairs
  • Tool names normalized to lowercase (case-insensitive detection)
  • Configurable action: raise (kill the loop) or log (observe only)
  • Emits loop.detected audit event on detection
  • DDL: governance_loop_tracking table
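
A sliding-window detector keyed on (session_id, tool_name) with lowercase normalization and a raise-or-log action, as listed above, might look like the sketch below. The thresholds, the in-memory storage, and the LoopDetector class are assumptions for illustration. The SDK's LoopPolicy fields are not documented in these notes.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class LoopDetected(Exception):
    """Local stand-in for the SDK exception of the same name."""


class LoopDetector:
    def __init__(self, max_calls: int = 5, window_seconds: float = 60.0,
                 action: str = "raise",
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_calls = max_calls
        self._window = window_seconds
        self._action = action
        self._clock = clock
        self._calls: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def record_call(self, session_id: str, tool_name: str) -> bool:
        """Record a call; return True (or raise) when the window is over the limit."""
        key = (session_id, tool_name.lower())  # case-insensitive tool names
        now = self._clock()
        window = self._calls[key]
        window.append(now)
        # Drop timestamps that have aged out of the sliding window.
        while window and now - window[0] > self._window:
            window.popleft()
        looping = len(window) > self._max_calls
        if looping and self._action == "raise":
            raise LoopDetected(f"{key[1]} called {len(window)} times in {self._window}s")
        return looping
```

With action="log" the detector only reports the condition, which matches the observe-only mode described in the list.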

SDK — Agent Presence (new module)

  • sdk.presence module with heartbeat-based lifecycle
  • await sdk.presence.heartbeat(agent_id) — mark agent as Live
  • await sdk.presence.mark_idle(agent_id) — transition to Idle
  • await sdk.presence.close_agent(agent_id) — remove from presence table
  • await sdk.presence.list_agents() — all agents with status
  • await sdk.presence.check_stale(timeout_seconds=300) — mark unresponsive agents
  • Three states: Live, Idle, Unresponsive
  • DDL: governance_agent_presence table

SDK — Policy Hot-Reload

  • GovernanceSDK(hot_reload=True, hot_reload_interval=30) — opt-in policy polling
  • Polls governance_policies table at configurable interval
  • Atomically replaces scope and cost policies in memory
  • In-process asyncio task — no background worker

SDK — Cost Module

  • track_usage() now accepts model= for per-model cost tracking
  • await sdk.cost.model_breakdown(agent_id) — per-model cost aggregation

Console

  • Code Atelier brand alignment (violet accent, Inter/JetBrains Mono fonts)
  • Login page with RBAC authentication
  • User management page (admin only)
  • Loading, empty, and error states on all pages
  • NavBar with admin-only tabs
  • GET /api/agents/presence endpoint — agent status
  • GET /api/cost/models endpoint — per-model cost breakdown

Security

  • Login rate limiting: 5 attempts per IP per 60 seconds
  • Gate audit events now included in HMAC chain
  • Metadata size cap: 64KB maximum
  • Request model size constraints

CLI

  • governance migrate now includes loop and presence DDL

v0.2.2

April 10, 2026

Session time limits, built-in model pricing for 24 models, hidden tool policies, console RBAC authentication, and LangChain hidden tool filtering.

SDK — Cost Module

  • per_session_seconds field on BudgetPolicy — session time limits enforced in check_or_raise()
  • Built-in pricing table for 24 models (OpenAI, Anthropic, Google, Meta, Mistral)
  • await sdk.cost.track_usage(agent_id, session_id, model="gpt-4o", input_tokens=N, output_tokens=N) — auto-computes USD
  • Prefix matching for versioned model names (gpt-4o-2024-05-13 matches gpt-4o)
  • Unknown models return $0 (non-breaking)
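
Prefix matching with a zero-cost fallback, as described above, could be sketched like this. The model names and prices are placeholders chosen for easy arithmetic, not entries from the SDK's pricing table, and preferring the longest matching prefix is a design choice of this sketch rather than documented behavior.

```python
from typing import Dict, Optional, Tuple

# Placeholder table: (USD per 1K input tokens, USD per 1K output tokens).
PRICES: Dict[str, Tuple[float, float]] = {
    "example-model": (1.0, 2.0),
    "example-model-mini": (0.25, 0.5),
}


def resolve_model(model: str) -> Optional[str]:
    """Map a versioned model name to a table key, preferring the longest prefix."""
    if model in PRICES:
        return model
    matches = [name for name in PRICES if model.startswith(name)]
    return max(matches, key=len) if matches else None


def estimate_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost, or 0.0 for models missing from the table."""
    key = resolve_model(model)
    if key is None:
        return 0.0  # unknown model: non-breaking zero
    in_price, out_price = PRICES[key]
    return input_tokens / 1000 * in_price + output_tokens / 1000 * out_price
```

Choosing the longest prefix keeps a versioned "mini" name from being billed at the base model's rate.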

SDK — Scope Module

  • hidden_tools field on ScopePolicy — tools removed from LLM context entirely
  • sdk.scope.filter_tools(agent_id, tool_list) — removes hidden tools before passing to LLM

Console

  • RBAC authentication with PBKDF2-HMAC-SHA256 password hashing (600k iterations, OWASP recommendation)
  • Postgres-backed sessions with httpOnly cookies (8h TTL, revocation)
  • Two roles: viewer (read-only) and admin (full access + user management)
  • API endpoints: POST /api/auth/login, POST /api/auth/logout, GET /api/auth/me
  • Admin endpoints: GET/POST /api/auth/users, PATCH /api/auth/users/{id}, DELETE /api/auth/sessions/{id}
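
PBKDF2-HMAC-SHA256 at 600,000 iterations is available directly from the Python standard library. The sketch below shows the general hashing and verification steps. The 16-byte salt and the function names are assumptions, and the console's actual storage format is not described in these notes.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 600_000  # iteration count stated in the release notes


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a 32-byte digest; generate a random salt when none is supplied."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

The salt is stored alongside the digest so that verification can repeat the same derivation.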

Integrations

  • LangChain handler: hidden tool filtering in on_llm_start (enforcement mode)
  • OpenAI wrapper: auto USD via built-in pricing table
  • OpenAI wrapper: double-wrap sentinel warns and returns if client already wrapped

CLI

  • governance console add-user / list-users / disable-user / reset-password commands
  • governance migrate now includes console DDL (auth tables)

v0.2.1

April 10, 2026

Enforcement mode for LangChain, fail-closed cost gates by default, durable JSONL fallback, and a major Console upgrade with Grant/Deny buttons, event detail panel, and inline violation details.

SDK

  • LangChain handler enforce=True mode — scope enforcement through callbacks
  • wrap_openai double-wrap sentinel — calling it twice is now idempotent
  • model as first-class field on AuditEvent (included in the HMAC chain)
  • Policies persisted to governance_policies Postgres table
  • JSONL durable fallback — audit events survive process crashes
  • Weak secret rejection (minimum 8 unique bytes)
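
One plausible reading of "minimum 8 unique bytes" is a count of distinct byte values in the secret. The sketch below implements that interpretation. Both the reading and the validate_secret name are assumptions, since the notes do not spell out the exact rule.

```python
MIN_UNIQUE_BYTES = 8  # threshold stated in the release notes


def validate_secret(secret: bytes) -> bytes:
    """Reject secrets with too few distinct byte values (assumed interpretation)."""
    unique = len(set(secret))
    if unique < MIN_UNIQUE_BYTES:
        raise ValueError(
            f"secret has {unique} unique bytes; at least {MIN_UNIQUE_BYTES} required"
        )
    return secret
```

A check like this catches placeholder secrets such as a single repeated character, though it is not a substitute for generating keys from a cryptographic random source.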

Console

  • Grant/Deny buttons on pending approvals
  • Event detail slide-out panel (full HMAC, prev_hash links, verify button)
  • Inline violation details on posture cards
  • Real Live/Idle status badge (no longer hardcoded)
  • Recursive metadata redaction (nested PII stripped)
  • Events page pagination (25/50/100 per page)
  • CORS wildcard rejection
  • Posture query performance bounds (LIMIT on all sub-queries)

CLI

  • governance migrate --dry-run flag
  • Improved error messages for invalid UUIDs

Security

  • Fail-closed cost gate by default
  • Console auth warning for dev mode

Docs

  • Fixed "5 lines" hero claim
  • Blue to violet design token consistency on sub-pages
  • Quickstart split into SDK-only + Console sections
  • HMAC diagram field-coverage clarification

v0.2.0

April 10, 2026

Initial public release of the Governance SDK with all four enforcement modules and the read-only Governance Console.

SDK

  • LangChain BaseCallbackHandler adapter
  • OpenAI SDK wrap_openai adapter
  • OpenTelemetry GenAI exporter
  • Multi-process correctness via Postgres advisory locks
  • HMAC-chained tamper-evident audit trail
  • Scope enforcement (tool/API whitelisting)
  • Spend limits with fail-closed budget gates
  • Human-in-the-loop signed approval tokens

Console

  • Initial release of the Governance Console (read-only dashboard)

CLI

  • governance CLI with migrate, verify, tail, and budget commands