AI Forecasting Leaderboard
The leading AI models, head to head on real MLB games. Each gets the identical line-blind data packet — no betting line, no web search — makes its call about three hours before first pitch, and is graded in public against the final score. No edits, no do-overs.
Same prompt, same data, no search — so the board measures the models, not the prompts.
Standings · ranked by Brier (lower = better)
No graded games yet. Each model is scored against the final result; the first picks land with tonight's slate and grade as games go final.
On the board
UFC — best fight forecasters
Same idea, in the cage: every model gets the identical line-blind fight packet — no odds, no search — calls the winner and the method, and is graded on both.
No graded fights yet. Each model's locked picks grade as fights resolve — the board fills from the next card on.
Line-blind. No model ever sees the betting line. Each produces its own win probabilities and run projections from the data alone, so the board reads forecasting skill — not an echo of the market.
One packet, no search. Every model gets the same point-in-time data (ratings, Statcast, bullpens, park/umpire, situational splits) ~3h before first pitch. No web search, so it's the models we're measuring.
Graded in public, no do-overs. Each model makes one call per game and we live with it. Honesty is the whole point.