Shadow: Parallel Truth Evaluation

Compare legacy and candidate implementations side by side before rollout. Never trust a refactor until the new code proves itself.

Run           Legacy vs Candidate                        Samples  Match  Critical  Latency       Verdict
shd_a14e226d  isc-scoring vs isc-scoring-v2              50       96%    0%        2ms → 3ms     SAFE
shd_b29f334e  file-scoring vs file-scoring-weighted      120      72%    8%        15ms → 12ms   HOLD
shd_c33d445f  assembly-pipeline vs assembly-pipeline-v3  200      45%    22%       45ms → 120ms  UNSAFE

How Shadow Works

1. Run both paths

Execute legacy and candidate on identical inputs, each with a per-sample timeout.
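
As a rough illustration of running both paths, the sketch below feeds each sample to a legacy and a candidate callable and stops waiting after a per-sample timeout. The names `run_shadow`, `run_one`, and `timed_call`, the result layout, and the use of a thread pool are all assumptions made for this example; they are not taken from the tool itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def timed_call(fn, sample):
    """Call fn(sample) and return (output, elapsed milliseconds)."""
    start = time.perf_counter()
    output = fn(sample)
    return output, (time.perf_counter() - start) * 1000.0


def run_one(pool, fn, sample, timeout_s):
    """Run one path on one sample; report a timeout instead of raising."""
    future = pool.submit(timed_call, fn, sample)
    try:
        output, elapsed_ms = future.result(timeout=timeout_s)
        return {"output": output, "ms": elapsed_ms, "timed_out": False}
    except FutureTimeout:
        # A worker thread cannot be killed; we only stop waiting for it.
        return {"output": None, "ms": None, "timed_out": True}


def run_shadow(legacy, candidate, samples, timeout_s=1.0):
    """Run both (hypothetical) callables on the same samples, in order."""
    results = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        for sample in samples:
            results.append({
                "sample": sample,
                "legacy": run_one(pool, legacy, sample, timeout_s),
                "candidate": run_one(pool, candidate, sample, timeout_s),
            })
    return results
```

A thread-based timeout only abandons the wait, so a stuck call keeps running in the background until it returns. A real runner that needs hard cancellation would more likely isolate each call in a subprocess.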

2. Compare deterministically

5 comparators: output, sufficiency, ranking, normalization, latency.
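
The page names the five comparators but does not define them, so the sketch below is only one plausible reading. Every rule here is an assumption: the result shape (`items`, `ms`), exact equality for output, item count for sufficiency, top-k order for ranking, case and whitespace folding for normalization, and a latency ratio of 2.0. What the sketch does show is the deterministic part: each comparator is a pure function, applied in a fixed order.

```python
def cmp_output(legacy, candidate):
    # Assumed rule: outputs must be exactly equal.
    return legacy["items"] == candidate["items"]


def cmp_sufficiency(legacy, candidate):
    # Assumed rule: candidate returns at least as many items as legacy.
    return len(candidate["items"]) >= len(legacy["items"])


def cmp_ranking(legacy, candidate, top_k=3):
    # Assumed rule: the first top_k items appear in the same order.
    return legacy["items"][:top_k] == candidate["items"][:top_k]


def cmp_normalization(legacy, candidate):
    # Assumed rule: equal after trimming whitespace and lowercasing.
    def norm(items):
        return sorted(s.strip().lower() for s in items)
    return norm(legacy["items"]) == norm(candidate["items"])


def cmp_latency(legacy, candidate, max_ratio=2.0):
    # Assumed rule: candidate is at most max_ratio times slower.
    return candidate["ms"] <= legacy["ms"] * max_ratio


# Fixed order, no randomness: the same inputs always give the same report.
COMPARATORS = {
    "output": cmp_output,
    "sufficiency": cmp_sufficiency,
    "ranking": cmp_ranking,
    "normalization": cmp_normalization,
    "latency": cmp_latency,
}


def compare(legacy, candidate):
    """Return a {comparator_name: passed} report for one sample."""
    return {name: fn(legacy, candidate) for name, fn in COMPARATORS.items()}
```

For example, `compare({"items": ["a", "b"], "ms": 2.0}, {"items": ["A ", "b"], "ms": 3.0})` fails `output` and `ranking` but passes `normalization`, which separates a formatting difference from a substantive one.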

3. Gate the rollout

CI-grade thresholds. Exit 0 = safe. Exit 10 = blocked.
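
A gate of this kind could be sketched as follows. Only the two exit codes (0 and 10) come from the text above. The function name `gate` and the threshold values (`min_match=0.95`, `max_critical=0.0`) are hypothetical, and this sketch blocks any run that is not safe.

```python
EXIT_SAFE = 0      # from the text: exit 0 = safe
EXIT_BLOCKED = 10  # from the text: exit 10 = blocked


def gate(match_rate, critical_rate, min_match=0.95, max_critical=0.0):
    """Map aggregate rates (fractions in 0..1) to a process exit code.

    The default thresholds are illustrative assumptions, not the tool's.
    """
    safe = match_rate >= min_match and critical_rate <= max_critical
    return EXIT_SAFE if safe else EXIT_BLOCKED


# In a CI step the code would typically be passed to the shell, e.g.:
#   import sys
#   sys.exit(gate(match_rate=0.96, critical_rate=0.0))
```

Using a distinct non-zero code such as 10, rather than 1, lets a pipeline tell a blocked rollout apart from a crash of the gate script itself.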