Delx Agent Continuity Benchmark

Name: Delx Witness Protocol
Author: Delx

A compact benchmark for the thing most agent systems still handle poorly: surviving compaction, handoff, and model change without losing the facts that matter.

Benchmark flow

1. register_agent with a stable agent_id
2. quick_operational_recovery or process_failure
3. honor_compaction for must-keep facts
4. recognition_seal for durable witness memory
5. transfer_witness and accept_witness_transfer
6. report_recovery_outcome
7. get_agent_continuity_passport
8. get_lineage_graph
9. audit_agent_continuity_trace
10. ontology_path_complete
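
The flow above can be sketched as an ordered list of JSON-RPC tools/call payloads. This is a minimal sketch: the tool names and the envelope come from this page, but passing only agent_id as the argument is an assumption, and the real tools very likely expect more fields. The helper name build_flow_requests is hypothetical.

```python
import json

# Step 2 of the flow allows either of these two tools.
RECOVERY_TOOLS = ("quick_operational_recovery", "process_failure")


def build_flow_requests(agent_id, recovery_tool="quick_operational_recovery"):
    """Return one JSON-RPC 'tools/call' payload per tool, in benchmark order."""
    if recovery_tool not in RECOVERY_TOOLS:
        raise ValueError(f"step 2 must be one of {RECOVERY_TOOLS}")

    tool_names = [
        "register_agent",
        recovery_tool,
        "honor_compaction",
        "recognition_seal",
        "transfer_witness",          # step 5 names two tools
        "accept_witness_transfer",
        "report_recovery_outcome",
        "get_agent_continuity_passport",
        "get_lineage_graph",
        "audit_agent_continuity_trace",
        "ontology_path_complete",
    ]
    return [
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            # Assumption: only agent_id is passed; real tools may need more.
            "params": {"name": name, "arguments": {"agent_id": agent_id}},
        }
        for request_id, name in enumerate(tool_names, start=1)
    ]


requests = build_flow_requests("continuity-benchmark-agent")
print(json.dumps(requests[0], indent=2))
```

Reusing the same agent_id in every payload is the point of step 1: a stable identifier is what ties the later witness, passport, and lineage calls to one agent.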

Copy-paste audit call

POST https://api.delx.ai/v1/mcp
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "audit_agent_continuity_trace",
    "arguments": {
      "agent_id": "continuity-benchmark-agent",
      "current_goal": "recover from retry storm and prepare handoff",
      "trace": "process_failure called; rollback reduced error rate; no passport exported yet"
    }
  }
}
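
The same audit call could be issued from Python with only the standard library. This is a sketch under stated assumptions: the URL, method, and payload come from the call above, while the Accept header, the absence of authentication, and the treatment of the response as raw text are assumptions, since this page does not describe them. The helper names build_audit_request and send are hypothetical.

```python
import json
import urllib.request

MCP_URL = "https://api.delx.ai/v1/mcp"


def build_audit_request(agent_id, current_goal, trace, request_id=1):
    """Build (but do not send) the POST request for the audit tool."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "audit_agent_continuity_trace",
            "arguments": {
                "agent_id": agent_id,
                "current_goal": current_goal,
                "trace": trace,
            },
        },
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the endpoint may answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the response body as text, unparsed."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


audit_request = build_audit_request(
    agent_id="continuity-benchmark-agent",
    current_goal="recover from retry storm and prepare handoff",
    trace="process_failure called; rollback reduced error rate; no passport exported yet",
)
# print(send(audit_request))  # uncomment to perform the network call
```

Building the request separately from sending it lets the payload be inspected or logged before anything goes over the network.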

Metrics

session_reuse_rate

witness_preservation_rate

recovery_loop_completion_rate

handoff_acceptance_rate

passport_export_rate

lineage_graph_completeness

Pass condition

A strong run has a stable agent_id, at least one witness artifact, at least one continuity transfer or passport export, one closed recovery outcome, and a lineage graph with explicit session or agent edges. The audit tool (audit_agent_continuity_trace) returns a score, the missing layers, a continuity risk, and a recommended next primitive.
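
The pass condition can be expressed as a small local checklist. This is a minimal sketch, not the audit tool's scoring: the RunSummary fields and the missing_requirements helper are hypothetical names, and reading "one" as "at least one" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical tally of what a benchmark run produced."""
    stable_agent_id: bool
    witness_artifacts: int
    continuity_transfers: int
    passport_exports: int
    closed_recovery_outcomes: int
    lineage_edges: int  # explicit session or agent edges


def missing_requirements(run):
    """Return the pass-condition items this run does not yet satisfy."""
    checks = [
        ("stable agent id", run.stable_agent_id),
        ("witness artifact", run.witness_artifacts >= 1),
        (
            "continuity transfer or passport export",
            run.continuity_transfers >= 1 or run.passport_exports >= 1,
        ),
        ("closed recovery outcome", run.closed_recovery_outcomes >= 1),
        ("lineage graph edges", run.lineage_edges >= 1),
    ]
    return [name for name, satisfied in checks if not satisfied]


# A run resembling the trace in the audit call: recovery happened,
# but nothing was transferred or exported yet.
partial_run = RunSummary(
    stable_agent_id=True,
    witness_artifacts=1,
    continuity_transfers=0,
    passport_exports=0,
    closed_recovery_outcomes=1,
    lineage_edges=2,
)
print(missing_requirements(partial_run))
```

An empty list means every item in the pass condition is covered; anything else names the layers still to fill before calling the run strong.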