From Forecast Tournaments to Forecast-Powered Information Markets
- mokwa3
- Feb 20
- 10 min read
Six years ago, when we started Bakboka, we were convinced that expert forecasting could revolutionize decision-making. We believed that well-structured forecast tournaments would help reduce uncertainty and guide better policy choices. Over time, we came to a difficult realization: forecasting tournaments, despite their theoretical promise, have largely failed to influence real-world decisions.
Instead, what decision-makers actually need is not just probabilities, but reliable information. And yet, most current systems for evaluating expertise and sorting valuable insights are fundamentally flawed.
Through trial and error, we developed a different approach: leveraging forecast tournaments not as a decision-making tool, but as a mechanism to establish proof-of-expertise, which then serves as the foundation for an information market.
This shift is not just useful today—it is absolutely necessary in an AI-driven future. As AI dramatically lowers the cost of forecasting and content production, obtaining new, truthful information becomes more important than ever. An information market provides a mechanism to sort good information from bad, making the information most pertinent to the questions at hand more easily accessible.
This article will explore why forecast tournaments seemed so promising, and why they failed as a decision-making tool (Part 1). It will then look at how we repurposed forecasting tournaments to sort information by establishing proof-of-expertise (Part 2), and why this solution matters even more in an AI-driven world (Part 3).
PART 1: Why Forecast Tournaments Failed to Influence Decisions
I. Why forecast tournaments seemed so promising
When we first turned to expert forecasting, we saw it as the perfect antidote to traditional geopolitical analysis, which suffers from several key flaws:
A. Bias accumulation
When a single individual or a small team filters and interprets information, their cognitive biases (confirmation bias, groupthink, overconfidence) inevitably shape the conclusions.
Even the most well-intentioned analysts are vulnerable to subjective framing. Paradoxically, it can be even worse when multiple analysts are involved.
A hierarchy often emerges, where junior analysts align their views with senior members rather than independently challenging assumptions. This reinforces institutional biases and discourages dissenting opinions.
B. Disconnection from decision-making
Analysts tend to frame their analysis around questions they feel most confident about, rather than the ones decision-makers actually need answered to make choices.
As a result, reports often end up being theoretically interesting but operationally irrelevant.
C. Authority based on credentials, not performance
Influence is often granted based on educational background or reputation rather than demonstrable expertise.
We know from experience that some questions are better answered by street smarts than by a PhD, and that those who present their ideas persuasively are more likely to be considered experts, regardless of whether their arguments are empirically sound.
Forecast tournaments seemed like the perfect solution to all those issues: they decentralized expertise, forced forecasters to confront uncertainty, and created a measurable way to evaluate accuracy. However, despite these theoretical advantages, we encountered a fundamental problem: Decision-makers don’t use probabilities to make decisions.
II. Why forecast tournaments failed
A. Decisions Are Multi-Factorial: No Single Forecast Captures the Full Picture
Every major decision is influenced by multiple interdependent factors, often spanning political, economic, social, and security domains. The challenge is that:
Forecasts require specific, isolated questions – A forecasting tournament might ask, “Will Country X experience a coup in the next six months?” But a policymaker deciding on intervention or evacuation must consider a web of interrelated risks—economic stability, diplomatic fallout, military posture, and internal political dynamics.
Mapping the full decision space into forecast questions is impractical – In theory, one could break down a decision into dozens of individual forecastable components, but this quickly becomes unwieldy. Running a large enough tournament to cover all relevant sub-questions in real time is often infeasible due to cost, time, and complexity.
Decisions often require qualitative judgment – Some factors are difficult to quantify probabilistically, yet are crucial for decision-making. For example, a leader’s psychological disposition or internal factional pressure may significantly impact events but can’t always be meaningfully forecasted (they theoretically could be, but the setup would be counterproductive).
B. Forecasts Lack Direct Policy Relevance: Numbers Alone Don’t Dictate Action
A forecast tournament might accurately predict that there is a 65% chance of a major protest movement emerging in Country Y within the next six months. But what does that actually mean for policymakers?
Decision-making isn’t just about probabilities—it’s about thresholds for action. A 65% chance of a protest might be high, but is it enough to justify reallocating resources, issuing travel advisories, or engaging in diplomacy? Many decisions hinge on non-probabilistic considerations, such as strategic priorities, institutional constraints, or political optics. While we tried to push for actions pre-mapped to specific probability thresholds, this could rarely be executed as planned.
Forecasts provide directions but rarely prescribe concrete actions. If a forecast suggests a 1% increase in the probability of violent conflict, should policymakers evacuate personnel? Increase information gathering? Strengthen local partnerships? The forecast alone doesn’t dictate the right move. One could organize counterfactual forecasts in order to capture that, but again, in reality, the energy required to do so would make it impractical to run.
Risk appetite and uncertainty tolerance vary. Different decision-makers interpret probabilities through their own lenses. A 30% risk of a banking crisis might terrify one policymaker but seem acceptable to another, depending on their objectives and level of risk tolerance. While there could be a lot of value in forcing institutions to agree in advance, synchronizing actions pre-mapped to probabilities across dozens of markets is impractical: the organizational cost outweighs the benefits.
C. Decision-Makers Want to Build Their Own Mental Models: Context Matters More Than Numbers
Forecasting tournaments assume that decision-makers primarily need more accurate probabilities. In reality, decision-makers want to build a map of the situation that allows them to i) act however events unfold or, more often, ii) change the probability of an event.
Raw probabilities mean little without explanatory depth. A forecast stating there’s a 70% chance of Country X defaulting on its debt provides no context on why that is likely. Decision-makers don’t just want the likelihood of an event; they want to understand the drivers, dynamics, and potential interventions.
Explanatory depth and information allow the construction of mental models. Leaders don’t simply act based on the likelihood of a single event—they use information to build a coherent understanding of the system they are navigating. This includes stakeholder interests, causal mechanisms, and second-order effects.
Decision-makers often think in contingencies: If Scenario A unfolds, we need Plan X. If Scenario B unfolds, we pivot to Plan Y. Probabilities alone don’t structure the full set of contingencies they need to prepare for. Sure, probabilities are nice to have, but information and understanding allow a more modular and flexible approach to contingencies.
PART 2: An Information Market Powered by Proof-of-Expertise
We have seen in Part 1 that decision-makers were reluctant to use forecasts as a decision tool for one main reason: they value tools that increase insight more than tools that assign probabilities to events.
We witnessed this with the use of our platform: while decision-makers often ignored the numerical probabilities in our forecast tournaments, they were highly engaged with the rationales behind them.
These rationales were valuable for three reasons:
They structured complex information into digestible arguments.
They exposed different perspectives and causal reasoning.
They surfaced asymmetric information—pieces of insight that weren’t available elsewhere.
In other words, the true impact of forecast tournaments wasn’t in the forecasts themselves, but in their ability to surface, structure, and sort useful information. This insight led us to split our forecast tournament into two competitions: one that values accuracy, and another that values information. Instead of trying to force forecasts into decision-making, we pivoted our approach:
Forecasting should not be used to drive decisions directly—it should be used to determine expertise, which then powers an information marketplace.
How It Works: A Dual-Competition System for Sorting Information and Expertise
To solve the core problem of decision-relevant information getting lost in noise, we designed a system that combines two interlinked competitions:
1. An Information Market – where participants share insights, analyses, and context-rich explanations.
2. A Forecast Tournament – where participants prove their expertise by making accurate probabilistic predictions.
This system ensures that the best information surfaces to the top, while also providing a mechanism for continuously validating expertise in an objective way.
A. The Information Market: A Competitive Arena for Insights
At the core of decision-making is not just having information, but knowing what information to trust. The Information Market is where forecasters and analysts compete to provide the most valuable insights, structured in ways that are actionable and digestible for decision-makers.
What is the Information Market made of?
Forecast rationales that explain why certain outcomes are more or less likely.
Structured analyses that break down stakeholder interests, causal mechanisms, and second-order effects.
Unique or asymmetric information that is not widely available elsewhere.
How is information ranked?
Instead of traditional upvotes/downvotes (which can be manipulated by popularity or groupthink), the weight of each vote is determined by the forecaster’s accuracy in the Forecast Tournament.
This means that those with a demonstrated ability to predict accurately have more influence over which insights are promoted.
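To make the weighting concrete, here is a minimal sketch of accuracy-weighted ranking in Python; the function names and the simple additive scheme are our own illustrative assumptions, not the production algorithm.

```python
from collections import defaultdict

def rank_insights(votes, voter_weights):
    """Rank insights by accuracy-weighted votes.

    votes: iterable of (voter_id, insight_id, direction), direction is +1 or -1.
    voter_weights: dict of voter_id -> weight earned in the forecast tournament.
    """
    scores = defaultdict(float)
    for voter_id, insight_id, direction in votes:
        # A vote counts only as much as the voter's demonstrated forecasting accuracy.
        scores[insight_id] += direction * voter_weights.get(voter_id, 0.0)
    # Insights backed by accurate forecasters surface first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```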
What does it solve?
It becomes drastically easier to distinguish useful information from the abundance of available content.
It allows us to escape the true/untrue dichotomy by replacing it with useful/not useful.
Traditional models of information curation—media, think tanks, academic credentials—are based on reputation rather than verifiable expertise.
The Information Market ensures that the best-ranked content is curated by those who have proven they understand reality better than others.
B. The Forecast Tournament as a Proof-of-Expertise
Unlike traditional analysis, where expertise is assumed based on credentials, the Forecast Tournament continuously tests who actually understands events best.
How does it work?
Participants make probabilistic predictions on a series of carefully designed geopolitical, economic, and strategic questions. These questions are chosen by the community to cover a thematic space and to represent a provable area of expertise.
Over time, their forecasts are evaluated using objective scoring methods (like Brier scores or Area Under the Curve). With enough data, skill can be differentiated from luck and real expertise can be identified.
Accuracy is not just rewarded—it becomes the currency of influence in the Information Market.
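As a reference point, the Brier score for binary questions is simply the mean squared error between stated probabilities and resolved outcomes; the snippet below is a generic illustration, not the platform's scoring code.

```python
def brier_score(predictions, outcomes):
    """Mean squared error between forecast probabilities and outcomes (0 or 1).
    0.0 is perfect; always answering 50% scores 0.25."""
    assert len(predictions) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# Example: forecasts of 0.8, 0.3, 0.9 on questions that resolved 1, 0, 1:
# (0.04 + 0.09 + 0.01) / 3 ≈ 0.047, well below the 0.25 chance baseline.
```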
What makes this different from other forecasting models?
Many forecasting projects stop at prediction, but this system goes further by using forecasts to determine who gets to shape the flow of information.
The combination of prediction and information-sharing creates a self-reinforcing ecosystem where accurate forecasters have more influence over information, and high-quality information leads to better forecasts.
How Does Forecasting Accuracy Translate Into Information Influence?
A key innovation of this system is that forecasting performance isn’t just measured—it determines who gets to curate information.
Proven forecasters gain more influence over what information is prioritized. If a participant has consistently high forecasting accuracy, their votes on information rankings carry more weight. If someone has a poor forecasting record, their ability to influence what gets surfaced is reduced.
This ensures that the best-ranked information is curated by those with a track record of being right.
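One simple way such a translation could work, sketched under our own assumptions (the baseline, the minimum-question threshold, and the linear mapping below are illustrative, not our exact formula):

```python
def vote_weight(mean_brier, n_resolved, min_questions=20, chance_baseline=0.25):
    """Turn a forecaster's track record into a curation weight.

    Forecasters with too few resolved questions get no influence, since skill
    cannot yet be separated from luck. Scoring at or below the 50/50 chance
    baseline (Brier 0.25) also earns zero weight.
    """
    if n_resolved < min_questions:
        return 0.0
    # 1.0 for a perfect record, 0.0 at or below chance.
    return max(0.0, chance_baseline - mean_brier) / chance_baseline
```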
What does it solve?
It eliminates reliance on proxies for expertise. No more assuming that a PhD, a government title, or media prominence equals good analysis.
It prevents misinformation from dominating. Unlike social media-style ranking, where viral but inaccurate claims can spread easily, this system ensures only those with a history of getting things right shape the narrative.
We transform availability bias into a strength – Instead of allowing decision-makers to be influenced by the loudest voices or the most sensational narratives, we ensure that the most visible and frequently encountered insights come from those with proven forecasting accuracy. This means that decision-makers are effectively “swimming” in the mental models of the most skilled experts, reinforcing useful, reality-based thinking rather than noise-driven intuition.
PART 3: Proof-of-Expertise in AI-driven forecasting
This information market coupled with a forecast tournament isn’t just about improving human forecasting markets today—we think it becomes critical as AIs become better forecasters, as it offers several advantages:
High-Quality Training Data – Pre-training large models on static datasets is reaching its limits. As AI-generated content floods the digital space, access to high-quality, verifiable information will become the key differentiator. A forecasting-based information market ensures that only insights curated by proven experts are surfaced, providing a continuous stream of high-quality, structured, and validated data that can serve as a superior training source for future AI models.
Verifiability: The Key to AI Training and Reinforcement Learning – For reinforcement learning to be effective, AI needs access to verifiable tasks. Prediction markets and forecast tournaments provide this by offering clear, measurable outcomes—a forecast is either accurate or inaccurate over time. Similarly, in the information market, we can assess whether specific insights contributed to better forecasts. AI forecast tournaments can thus become a playground where RL can happen and where AI can test both its accuracy and its ability to articulate ideas.
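To illustrate why resolution makes forecasting a natural RL task, the reward signal can be as simple as the sketch below; this is our own toy formulation, not a description of any existing training pipeline.

```python
def forecast_reward(predicted_prob, outcome):
    """Reward for one resolved binary question: 1 minus the Brier loss.

    outcome is 0 or 1 and comes from the real world once the question resolves,
    so the signal is verifiable and cannot be gamed by the model being trained.
    """
    return 1.0 - (predicted_prob - outcome) ** 2
```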
Forecasting AI needs a system to get real-world data – AI models, no matter how powerful, remain confined to digital spaces unless they are continuously fed real-world data. As AI becomes better at reasoning, the next bottleneck will not be computation but access to high-quality, predictive information: AI’s forecasting power will be limited by the quality of the information it can reach, so the real competitive edge will lie not just in processing power but in access to a tailored flow of information. This is where an incentivized information market comes into play. By linking prediction markets to an information market, contributors are motivated to bring forward the most relevant, highest-value information—data that actually improves predictive performance. By incentivizing the information market to bring information that helps forecasters forecast, you drive the collection of new information, unlike a social network such as X or Facebook.
Improving Explainability Through Trackable Information Streams – One of the biggest challenges in AI-driven decision support is understanding where conclusions come from. In the Information Competition x Forecast Tournament system we developed, every winning forecast is linked to a trackable stream of the information used to generate it. This provides a verifiable, demonstrable audit trail of which insights influenced which predictions, offering a new layer of transparency and explainability. It can also unlock a new financial incentive system, where information is rewarded according to its weight in successful forecasts. While we run something similar for our human system, this can be better automated in future AI forecasting tournaments, especially if the trend set by DeepSeek R1 of exposing reasoning steps is confirmed.
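A minimal sketch of how such an audit trail could distribute credit, assuming each winning forecast records the information items cited in its rationale together with a weight for each (the data shapes and the proportional split are assumptions for illustration):

```python
def attribute_credit(forecast_payout, cited_items):
    """Split a winning forecast's payout across the information it relied on.

    cited_items: dict of information_item_id -> weight of that item in the rationale.
    Returns information_item_id -> share of the payout, proportional to weight.
    """
    total_weight = sum(cited_items.values())
    if total_weight <= 0:
        return {}
    return {item_id: forecast_payout * w / total_weight
            for item_id, w in cited_items.items()}
```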
There are many things we left out that we will try to expand on in later pieces, such as:
The mechanism of the incentives, and how such a system can generate and distribute revenues
How AI forecasters can help alleviate some of the issues of human forecasting tournaments (especially in the full mapping of the decision space), and how an Information Market will still make sense in that case
What should the governance of a full-blown forecasting and information tournament ideally look like?
What impact could it have on public policy and citizen engagement?
Can it be effective in alleviating “superhuman persuasion” and AI-driven disinformation campaigns?