What we publish before a model goes live
This page summarizes the 2026-06-11 model evidence pack in customer language: the test data, the gate result, the limits, and what it does not claim.
Failures stay visible
The evidence pack includes failed model runs, not only the best-looking candidate.
No silent promotion
The current long candidate is in SHADOW because it failed the live gate.
Simulation is labeled
Backtests and paper records are not mixed with live outcomes.
- Evidence date: 2026-06-11
- Universe: 45 crypto assets
- Holdout data: 2021-06 to 2026-06
- Raw bars: 1,714,025 bars
- Candidate trials: 21 cumulative model trials
- Production promotions: 0
What was tested
A long-side distribution forecaster was evaluated with purged walk-forward validation. The early 2021-06 to 2022-12 period was kept out of model selection and used as a true untouched holdout.
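To picture what purged walk-forward validation means, here is a minimal sketch. It assumes time-ordered samples addressed by integer index and an expanding training window. The function name, the fold count, and the purge length are illustrative choices, not values from the evidence pack.

```python
from typing import Iterator, Tuple


def purged_walk_forward(
    n_samples: int, n_folds: int, purge: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges for an expanding-window walk-forward.

    Every test block lies after its training window in time. The last
    `purge` samples before each test block are left out of training so that
    labels overlapping the test period cannot leak into the fit.
    """
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        test_start = k * fold_size
        # The final fold absorbs any remainder at the end of the series.
        test_end = n_samples if k == n_folds else test_start + fold_size
        train_end = max(0, test_start - purge)
        yield range(0, train_end), range(test_start, test_end)


# Hypothetical sizes, chosen only to make the output easy to read.
splits = list(purged_walk_forward(n_samples=100, n_folds=4, purge=5))
for train, test in splits:
    print(f"train [0, {train.stop})  test [{test.start}, {test.stop})")
```

Each model in this scheme sees only data that ends before its test block begins, minus the purged gap.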
What the gate said
The strongest candidate reached a probability of backtest overfitting (PBO) of 0.029, but its deflated Sharpe ratio (DSR) was 0.774, below the 0.90 live threshold. That means selection risk looked controlled, but regime consistency was not strong enough for production promotion.
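The gate decision can be sketched as a simple threshold check. This is an illustration under stated assumptions, not the production gate: the 0.90 DSR threshold and the SHADOW state come from this page, while the `GateResult` type, the `ELIGIBLE` label, and the choice of `>=` as the comparison are hypothetical. PBO is reported here without a stated limit, so the sketch checks DSR only.

```python
from dataclasses import dataclass

DSR_LIVE_THRESHOLD = 0.90  # live threshold stated on this page


@dataclass(frozen=True)
class GateResult:
    passed: bool
    state: str   # "SHADOW" is from the page; "ELIGIBLE" is a hypothetical label
    reason: str


def live_gate(dsr: float, threshold: float = DSR_LIVE_THRESHOLD) -> GateResult:
    """Compare a candidate's DSR with the live threshold (assumed inclusive)."""
    if dsr >= threshold:
        return GateResult(True, "ELIGIBLE", f"DSR {dsr:.3f} >= {threshold:.2f}")
    return GateResult(False, "SHADOW", f"DSR {dsr:.3f} < {threshold:.2f}")


result = live_gate(0.774)  # the candidate's reported DSR
print(result.state, "-", result.reason)
```

With the reported DSR of 0.774 the check fails, which matches the candidate's SHADOW status.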
What the results looked like
Across the five-year evaluation, the candidate produced 18,790 simulated out-of-sample trades with a 57.4% win rate. The edge was regime-dependent: bull periods carried the result, bear periods were near flat, and sideways (chop) periods lost money on average.
What this does not claim
The report is not a live performance claim and not an investment recommendation. Terminal simulations are optimistic because they cannot know the intrabar order of protective and target levels, funding costs are not fully modeled, and slippage can be worse in thin assets.
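The intrabar ordering problem is easiest to see on a single price bar. The sketch below assumes a long position with one protective level (`stop`) and one target level (`target`), and only flags the ambiguous case. It does not reproduce how the simulator resolves that case, and the function name and labels are illustrative.

```python
def classify_long_bar(high: float, low: float, stop: float, target: float) -> str:
    """Classify one bar for a long position using only its high and low.

    When the bar's range covers both levels, the bar alone cannot say which
    level was touched first, so the outcome is reported as "ambiguous".
    """
    hit_stop = low <= stop
    hit_target = high >= target
    if hit_stop and hit_target:
        return "ambiguous"
    if hit_target:
        return "target"
    if hit_stop:
        return "stop"
    return "open"


# Hypothetical prices: the bar spans both the stop (95) and the target (105).
outcome = classify_long_bar(high=106.0, low=94.0, stop=95.0, target=105.0)
print(outcome)
```

A simulation that always counts such bars as target hits will look better than live trading would, which is one reason the results here are labeled as optimistic.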
Champion's 5-year, period-by-period results
Each row is the out-of-sample result of a model trained only on data from before that period. In simulation the edge is positive but regime-dependent: bull periods carry the result.
| Regime | Trades | Win rate | Avg net/trade |
|---|---|---|---|
| Bull (5 periods) | 5,166 | 69.4% | +2.88% |
| Bear (2 periods) | 9,819 | 52.2% | ≈ flat |
| Sideways / chop | 1,236 | 41.7% | −0.55% |
| Mixed (recent) | 2,569 | 60.8% | +0.99% |
| Total | 18,790 | 57.4% | — |
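The Total row can be re-derived from the regime rows as a trade-weighted average. The short check below uses only the figures in the table; the variable names are illustrative.

```python
# (regime, trades, win rate) copied from the table above
rows = [
    ("Bull", 5166, 0.694),
    ("Bear", 9819, 0.522),
    ("Sideways / chop", 1236, 0.417),
    ("Mixed (recent)", 2569, 0.608),
]

total_trades = sum(trades for _, trades, _ in rows)
# Weight each regime's win rate by its trade count before averaging.
total_wins = sum(trades * win_rate for _, trades, win_rate in rows)
total_win_rate = total_wins / total_trades

print(total_trades, f"{total_win_rate:.1%}")  # 18790 57.4%
```

Weighting by trade count matters because the bear periods hold more than half of all trades, so they pull the overall win rate toward their own 52.2%.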
How well do the forecast ranges hold?
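When the model says “price stays within this band with 80% probability,” price landed inside that band in about 86 of 100 cases across the last 120 days of real data (1,795 independent checks). Both bands cover more outcomes than their nominal targets, by roughly 6 and 8 percentage points, so the ranges are somewhat wider than they need to be, which errs on the cautious side.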
- q10–q90 band hit: 85.9% (target 80%)
- q25–q75 band hit: 57.6% (target 50%)
- Median forecast error: 3.3%
- Independent checks: 1,795
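A band hit rate of this kind is a count of how often the realized price fell inside the forecast band. The sketch below shows the calculation on toy numbers that are not from the evidence pack, then computes the gap between the published hit rates and their targets. The function name and the inclusive band edges are assumptions.

```python
from typing import Sequence


def band_coverage(
    lower: Sequence[float], upper: Sequence[float], realized: Sequence[float]
) -> float:
    """Fraction of checks where the realized value fell inside [lower, upper]."""
    if not realized or not (len(lower) == len(upper) == len(realized)):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(lo <= x <= hi for lo, hi, x in zip(lower, upper, realized))
    return hits / len(realized)


# Toy values for illustration only; not data from the evidence pack.
lower = [95.0, 98.0, 101.0, 99.0, 100.0]
upper = [105.0, 108.0, 111.0, 109.0, 110.0]
realized = [100.0, 109.5, 104.0, 99.0, 103.0]
coverage = band_coverage(lower, upper, realized)  # 4 of 5 checks inside

# Over-coverage implied by the published figures, in percentage points.
gap_80 = 85.9 - 80.0
gap_50 = 57.6 - 50.0
print(f"{coverage:.0%}  +{gap_80:.1f} pts  +{gap_50:.1f} pts")
```

A positive gap means the band caught more outcomes than its label promises, so the band is wider than strictly necessary.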
Controls that matter
- Every trained configuration is counted before DSR is calculated.
- Failed gates physically block promotion into production serving.
- Raw evidence files are covered by a SHA-256 manifest.
- Model registry state showed 14 bundles and 0 promotions at the evidence snapshot.
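A SHA-256 manifest lets anyone confirm that evidence files have not changed since the snapshot. The sketch below assumes a manifest with one `<hex digest>  <relative path>` entry per line, similar to `sha256sum` output. The actual manifest format is not described on this page, and the file name `bars.csv` is hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large evidence files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse '<hex digest>  <relative path>' lines (an assumed format)."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        entries[name.strip()] = expected.lower()
    return entries


def verify(root: Path, manifest_text: str) -> List[str]:
    """Return the names of files that are missing or whose hash differs."""
    failures = []
    for name, expected in parse_manifest(manifest_text).items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(name)
    return failures


# Demonstration with a throwaway file: verify, change one byte, verify again.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "bars.csv").write_bytes(b"ts,close\n1,100\n")
    manifest = f"{sha256_of(root / 'bars.csv')}  bars.csv\n"
    clean = verify(root, manifest)
    (root / "bars.csv").write_bytes(b"ts,close\n1,101\n")
    tampered = verify(root, manifest)

print(clean, tampered)
```

An empty list means every listed file matches its recorded hash. Any name in the list points to a file that was altered or removed after the manifest was written.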
finsail llc · 1209 Mountain Road PL NE STE N, Albuquerque, NM 87110 · support@finsail.ai
