Benchmarks
Track dataset governance metadata and longitudinal performance across policies.
Post-discharge QA
Internal
Last refresh 9/27/2025 · Over-represents cardiology cases; flagged for balancing next refresh.
Synthetic cohort of discharge follow-ups with annotated safety signals.
Snapshot 1
Accuracy 82% · Recall 77% · Avg reward 1.80
Snapshot 2
Accuracy 86% · Recall 79% · Avg reward 2.10
Legacy prior-auth
Internal
Last refresh 8/3/2025 · Contains older prompts; ensure compatibility before reuse.
Archived encounters for offline RL benchmarking.
Snapshot 1
Accuracy 76% · Recall 71% · Avg reward 1.20
Snapshot 2
Accuracy 80% · Recall 73% · Avg reward 1.35
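One way to read the longitudinal side of these cards is to compute the change between the first and last snapshot of each benchmark. The sketch below is illustrative only: the `Snapshot` class, the `BENCHMARKS` mapping, and the `snapshot_delta` helper are hypothetical names, not part of any tool described on this page, and the values are copied from the cards above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One benchmark snapshot (hypothetical structure for illustration)."""
    accuracy: float    # percent
    recall: float      # percent
    avg_reward: float  # unitless average reward


# Values copied from the benchmark cards; the layout is an assumption.
BENCHMARKS = {
    "Post-discharge QA": [Snapshot(82, 77, 1.80), Snapshot(86, 79, 2.10)],
    "Legacy prior-auth": [Snapshot(76, 71, 1.20), Snapshot(80, 73, 1.35)],
}


def snapshot_delta(first: Snapshot, last: Snapshot) -> dict:
    """Return last-minus-first changes, rounded to two decimals."""
    return {
        "accuracy_pts": round(last.accuracy - first.accuracy, 2),
        "recall_pts": round(last.recall - first.recall, 2),
        "avg_reward": round(last.avg_reward - first.avg_reward, 2),
    }


if __name__ == "__main__":
    for name, snaps in BENCHMARKS.items():
        print(name, snapshot_delta(snaps[0], snaps[-1]))
```

Accuracy and recall changes are expressed in percentage points rather than relative percent, which keeps the two benchmarks directly comparable.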