Agentic RL Control Center

Track policy, reward, and safety signals

Data catalog

Schemas and sample exports for observations, actions, and rewards. Use them for offline RL or external audits.

Observation schema
{
  "taskId": "string",
  "stepId": "string",
  "type": "reason|plan|tool|observe|critique|revise",
  "payload": {
    "input": "object",
    "output": "object"
  },
  "rewardDelta": "number",
  "safetyFlags": ["string"]
}
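
A record can be checked against the observation schema before it enters an offline RL dataset or an audit bundle. The following is a minimal sketch, not the control center's own validator. It assumes every field in the schema is required, which the schema itself does not state, and the function name `validate_observation` is hypothetical.

```python
# Step types listed in the observation schema's "type" field.
STEP_TYPES = {"reason", "plan", "tool", "observe", "critique", "revise"}


def validate_observation(record: dict) -> list:
    """Return a list of problems in one observation record (empty if none).

    Assumption: all schema fields are required. The schema does not say so.
    """
    problems = []
    for key in ("taskId", "stepId"):
        if not isinstance(record.get(key), str):
            problems.append(f"{key} must be a string")

    step_type = record.get("type")
    if not isinstance(step_type, str) or step_type not in STEP_TYPES:
        problems.append(f"type must be one of {sorted(STEP_TYPES)}")

    payload = record.get("payload")
    if not isinstance(payload, dict):
        problems.append("payload must be an object")
    else:
        for key in ("input", "output"):
            if not isinstance(payload.get(key), dict):
                problems.append(f"payload.{key} must be an object")

    delta = record.get("rewardDelta")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(delta, bool) or not isinstance(delta, (int, float)):
        problems.append("rewardDelta must be a number")

    flags = record.get("safetyFlags")
    if not isinstance(flags, list) or not all(isinstance(f, str) for f in flags):
        problems.append("safetyFlags must be a list of strings")
    return problems


# Illustrative record; all values here are made up for the example.
sample = {
    "taskId": "task-1",
    "stepId": "step-1",
    "type": "plan",
    "payload": {"input": {}, "output": {}},
    "rewardDelta": 0.5,
    "safetyFlags": [],
}
problems = validate_observation(sample)
```

Returning a list of problems rather than raising on the first one lets an audit report every issue in a record at once.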

Reward schema
{
  "runId": "string",
  "episodeReturn": "number",
  "terms": [
    { "name": "Safety", "delta": "number" },
    { "name": "Outcome", "delta": "number" }
  ]
}
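
Reward records in this shape can be aggregated per term to see how much each term contributes across runs. The sketch below is illustrative only. The helper names are hypothetical, and `unexplained_return` rests on the assumption that `episodeReturn` is meant to equal the sum of the term deltas, which the schema does not confirm.

```python
from collections import defaultdict


def term_totals(reward_records):
    """Sum each reward term's delta across a list of reward records."""
    totals = defaultdict(float)
    for rec in reward_records:
        for term in rec["terms"]:
            totals[term["name"]] += term["delta"]
    return dict(totals)


def unexplained_return(rec):
    """Difference between episodeReturn and the summed term deltas.

    Assumption: episodeReturn should equal the sum of the terms, so a
    nonzero result would point to a term missing from the export.
    """
    return rec["episodeReturn"] - sum(t["delta"] for t in rec["terms"])


# Illustrative records; run IDs and numbers are made up for the example.
runs = [
    {"runId": "run-a", "episodeReturn": 0.5,
     "terms": [{"name": "Safety", "delta": -0.5},
               {"name": "Outcome", "delta": 1.0}]},
    {"runId": "run-b", "episodeReturn": 1.75,
     "terms": [{"name": "Safety", "delta": -0.25},
               {"name": "Outcome", "delta": 2.0}]},
]
totals = term_totals(runs)
```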

Sample exports

JSONL trace and Parquet-ready schema for offline RL.

Download JSONL

24 runs available. Select cohorts from these tasks: Chronic care, Discharge follow-up call, Utilization management.
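
A downloaded JSONL trace can be flattened into fixed columns before it is written to Parquet. The sketch below assumes each line of the export is one record in the observation schema, which this page does not confirm. The function name and the `input_json` / `output_json` column names are hypothetical. The free-form `payload` objects are serialized to JSON strings so that every row has the same columns.

```python
import json
from typing import Iterable, Iterator


def flatten_trace(lines: Iterable[str]) -> Iterator[dict]:
    """Turn JSONL lines into flat rows with a fixed set of columns.

    Assumption: each non-blank line is one observation record.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rec = json.loads(line)
        payload = rec.get("payload") or {}
        yield {
            "taskId": rec.get("taskId"),
            "stepId": rec.get("stepId"),
            "type": rec.get("type"),
            # Free-form objects are stored as JSON text to keep columns fixed.
            "input_json": json.dumps(payload.get("input"), sort_keys=True),
            "output_json": json.dumps(payload.get("output"), sort_keys=True),
            "rewardDelta": rec.get("rewardDelta"),
            "safetyFlags": rec.get("safetyFlags") or [],
        }


# Illustrative trace; the values are made up for the example.
trace_lines = [
    '{"taskId": "task-1", "stepId": "step-1", "type": "tool", '
    '"payload": {"input": {"q": "x"}, "output": {}}, '
    '"rewardDelta": 0.25, "safetyFlags": ["phi"]}',
    "",
]
rows = list(flatten_trace(trace_lines))
```

With a real file, the same function accepts an open file handle, for example `flatten_trace(open(path, encoding="utf-8"))`. The resulting rows could then be handed to a Parquet writer such as pyarrow or pandas, which is left out here to keep the sketch dependency-free.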