ARGUS · The crown jewel

Accuracy scorecard

Rolling metrics across all hypotheses: hit rate, Brier score, calibration curve, and a breakdown by sector, region, and type. A public version is available at argus.northflow.no.

30-day hit rate
71.2% (+2.80%)
38 of 47 resolved
90-day hit rate
68.4% (+1.40%)
122 of 138 resolved
180-day hit rate
66.1%
265 of 287 resolved
Brier score
0.187
Lower is better — perfect = 0
Calibration curve · predicted vs realized

A perfectly calibrated forecaster's curve sits on the diagonal: predictions made at 0.75 confidence come true 75% of the time. Ours tracks the diagonal closely (R² = 0.94).
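As a rough illustration of how such a curve can be built, the sketch below groups predictions into confidence buckets, computes the realized hit rate in each, and scores the fit against the diagonal. The function names, the ten equal-width buckets, and the R² definition (realized hit rates against the y = x line) are assumptions made for this example; the scorecard's actual bucketing and fit statistic may differ.

```python
from collections import defaultdict


def calibration_curve(confidences, outcomes, n_buckets=10):
    """Return (mean_confidence, realized_hit_rate, count) per non-empty bucket.

    n_buckets=10 is an assumed default, not the scorecard's documented setting.
    """
    buckets = defaultdict(list)
    for p, y in zip(confidences, outcomes):
        # A confidence of exactly 1.0 is placed in the top bucket.
        idx = min(int(p * n_buckets), n_buckets - 1)
        buckets[idx].append((p, y))

    curve = []
    for idx in sorted(buckets):
        items = buckets[idx]
        mean_p = sum(p for p, _ in items) / len(items)
        hit_rate = sum(y for _, y in items) / len(items)
        curve.append((mean_p, hit_rate, len(items)))
    return curve


def r_squared_vs_diagonal(curve):
    """R^2 of realized hit rates against the y = x line (assumed definition)."""
    realized = [hit for _, hit, _ in curve]
    mean_realized = sum(realized) / len(realized)
    ss_res = sum((hit - p) ** 2 for p, hit, _ in curve)
    ss_tot = sum((hit - mean_realized) ** 2 for hit in realized)
    # With a single bucket there is no variance to explain.
    return 1.0 - ss_res / ss_tot if ss_tot else float("nan")
```

A point on the diagonal is a bucket whose mean confidence equals its realized hit rate, for example four predictions at 0.75 of which three resolved as hits.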

Hit rate by period
By sector
  • Energy: 74.0%
  • Financials: 69.0%
  • Industrials: 72.0%
  • Information Technology: 66.0%
  • Health Care: 71.0%
  • Materials: 62.0%
By region
  • NORDIC: 73.0%
  • EUROPE: 70.0%
  • ASIA_PACIFIC: 66.0%
  • AMERICAS: 68.0%
Methodology

Every hypothesis is published with a SHA-256 hash before resolution. The scorecard recomputes nightly across all resolved hypotheses. Brier score is the mean squared error between predicted probability (confidence) and realized outcome (0 or 1). Calibration measures whether predicted probabilities match realized hit rates within each confidence bucket. To avoid survivorship bias, expired hypotheses count as misses unless they were directionally correct.
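The two mechanical pieces of this methodology, the pre-resolution hash and the Brier score, can be sketched in a few lines. This is a minimal illustration rather than the ARGUS implementation: the function names are hypothetical, and hashing a canonical JSON encoding of the hypothesis record is an assumption, since the page does not say which bytes are hashed.

```python
import hashlib
import json


def commitment_hash(hypothesis: dict) -> str:
    """SHA-256 hex digest of a hypothesis record.

    Assumption: the record is serialized as canonical JSON (sorted keys,
    no extra whitespace) so the same content always yields the same hash.
    """
    canonical = json.dumps(hypothesis, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def brier_score(confidences, outcomes):
    """Mean squared error between predicted probability and outcome (0 or 1)."""
    if not confidences or len(confidences) != len(outcomes):
        raise ValueError("need equal-length, non-empty inputs")
    squared_errors = [(p - y) ** 2 for p, y in zip(confidences, outcomes)]
    return sum(squared_errors) / len(squared_errors)
```

Publishing the digest before resolution lets anyone later confirm that the hypothesis text was not altered after the fact. On the scoring side, a forecaster who always predicts 0.5 gets a Brier score of 0.25 regardless of outcomes, which is a useful baseline when reading the figure above.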