Accuracy scorecard

Rolling metrics across all hypotheses: hit rate, Brier score, calibration curve, and breakdowns by sector, region, and type. A public version is available at argus.northflow.no.

30-day hit rate

71.2% (+2.80%)

38 of 47 resolved

90-day hit rate

68.4% (+1.40%)

122 of 138 resolved

180-day hit rate

66.1%

265 of 287 resolved

Brier score

0.187

Lower is better; a perfect score is 0.

Calibration curve · predicted vs realized

A perfectly calibrated forecaster sits on the diagonal: predictions made at 0.75 confidence come true 75% of the time. Our curve tracks the diagonal closely (R² = 0.94).
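To make the calibration idea concrete, here is a minimal sketch of how predicted confidence can be compared with realized hit rate per bucket. The function name `calibration_table`, the ten equal-width buckets, and the sample data are assumptions for illustration; the scorecard's actual bucketing and its R² computation are not described here and are not reproduced.

```python
from collections import defaultdict


def calibration_table(forecasts, n_buckets=10):
    """Group (confidence, outcome) pairs into equal-width confidence buckets.

    Returns a list of (mean_predicted, realized_rate, count) tuples,
    one per non-empty bucket, ordered by confidence.
    """
    buckets = defaultdict(list)
    for confidence, outcome in forecasts:
        # Clamp the index so that confidence == 1.0 lands in the top bucket.
        index = min(int(confidence * n_buckets), n_buckets - 1)
        buckets[index].append((confidence, outcome))

    table = []
    for index in sorted(buckets):
        rows = buckets[index]
        mean_predicted = sum(c for c, _ in rows) / len(rows)
        realized_rate = sum(o for _, o in rows) / len(rows)
        table.append((mean_predicted, realized_rate, len(rows)))
    return table


# Hypothetical data: four forecasts at 0.75 confidence (three hits)
# and two forecasts at 0.25 confidence (no hits).
sample = [(0.75, 1), (0.75, 1), (0.75, 1), (0.75, 0), (0.25, 0), (0.25, 0)]
for predicted, realized, count in calibration_table(sample):
    print(f"predicted {predicted:.2f}  realized {realized:.2f}  n={count}")
```

In this toy sample the 0.75 bucket realizes exactly 0.75, so that point would sit on the diagonal; the 0.25 bucket realizes 0.00 and would sit below it.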

Hit rate by period

By sector

Energy: 74.0%
Financials: 69.0%
Industrials: 72.0%
Information Technology: 66.0%
Health Care: 71.0%
Materials: 62.0%

By region

NORDIC: 73.0%
EUROPE: 70.0%
ASIA_PACIFIC: 66.0%
AMERICAS: 68.0%

Methodology

Every hypothesis is published with a SHA-256 hash before resolution, so any later change to its text is detectable. The scorecard is recomputed nightly across all resolved hypotheses. The Brier score is the mean squared error between the predicted probability (confidence) and the realized outcome (0 or 1). Calibration measures whether predicted probabilities match realized hit rates within each confidence bucket. To avoid survivorship bias, expired hypotheses count as misses unless they were directionally correct.
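The two mechanical pieces of the methodology, the pre-resolution hash and the Brier score, can be sketched as follows. This is an illustration under stated assumptions, not the scorecard's actual code: the record fields, the canonical JSON encoding, and the function names `commitment_hash` and `brier_score` are all hypothetical. Only the use of SHA-256 and the mean-squared-error definition come from the text above.

```python
import hashlib
import json


def commitment_hash(hypothesis: dict) -> str:
    """SHA-256 hex digest of a canonical JSON encoding of the hypothesis."""
    # Sorting keys and fixing separators makes the encoding deterministic,
    # so the same record always produces the same hash. (Assumed encoding.)
    canonical = json.dumps(hypothesis, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def brier_score(forecasts):
    """Mean squared error between confidence and the 0/1 outcome."""
    if not forecasts:
        raise ValueError("need at least one resolved forecast")
    return sum((c - o) ** 2 for c, o in forecasts) / len(forecasts)


# Hypothetical hypothesis record; the field names are illustrative only.
hypothesis = {"id": "H-001", "claim": "example claim", "confidence": 0.75}
published = commitment_hash(hypothesis)

# Anyone holding the original record can later confirm it is unchanged,
# and any edit to the record yields a different hash.
assert commitment_hash(hypothesis) == published
assert commitment_hash({**hypothesis, "confidence": 0.9}) != published

# Hypothetical resolved forecasts as (confidence, outcome) pairs.
resolved = [(0.75, 1), (0.75, 0), (0.5, 1), (0.5, 0)]
print(brier_score(resolved))  # prints 0.28125
```

A forecaster who always answers 0.5 scores 0.25, which is a useful baseline when reading a Brier score: values below 0.25 indicate the confidences carry information.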