src.pm_rank.model

Model subpackage for pm_rank.

Classes

GeneralizedBT

Generalized Bradley-Terry model for ranking forecasters in prediction markets.

BrierScoringRule

Brier scoring rule for evaluating probabilistic forecasts.

SphericalScoringRule

Spherical scoring rule for evaluating probabilistic forecasts.

LogScoringRule

Logarithmic scoring rule for evaluating probabilistic forecasts.

AverageReturn

Average Return Model for ranking forecasters based on their expected market returns.

Functions

spearman_correlation(...) → float

Compute the Spearman correlation between two rankings.

kendall_correlation(...) → float

Compute the Kendall correlation between two rankings.

Package Contents

class src.pm_rank.model.GeneralizedBT(method: Literal['MM', 'Elo'] = 'MM', num_iter: int = 100, threshold: float = 0.001, verbose: bool = False)

Bases: object

Generalized Bradley-Terry model for ranking forecasters in prediction markets.

This class implements a generalization of the traditional Bradley-Terry model to handle prediction market scenarios. Each event outcome is treated as a contest between two “pseudo-teams”: a winning team (the realized outcome) and a losing team (all other outcomes). Each forecaster contributes fractions of their capability proportional to their predicted probabilities.

The model estimates skill parameters for each forecaster using an iterative Majorization-Minimization (MM) algorithm, which provides convergence guarantees and intuitive comparative scores similar to Elo ratings.
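
As a concrete illustration, the per-event contest can be sketched as follows. This is one plausible reading of the description above, not necessarily pm_rank's internal implementation:

    import numpy as np

    def pseudo_team_win_prob(skills: np.ndarray, probs_on_winner: np.ndarray) -> float:
        """Probability that the winning pseudo-team beats the losing one.

        skills          -- skill parameter for each forecaster (assumed form)
        probs_on_winner -- each forecaster's predicted probability of the
                           realized outcome
        """
        # Each team's strength is the skill-weighted probability mass its
        # members place on that side of the contest.
        s_win = float(np.sum(skills * probs_on_winner))
        s_lose = float(np.sum(skills * (1.0 - probs_on_winner)))
        return s_win / (s_win + s_lose)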

Parameters:
  • method – Optimization method to use: “MM” (Majorization-Minimization) or “Elo” (default: “MM”).

  • num_iter – Maximum number of iterations for the MM algorithm (default: 100).

  • threshold – Convergence threshold for parameter updates (default: 1e-3).

  • verbose – Whether to enable verbose logging (default: False).

method = 'MM'
num_iter = 100
threshold = 0.001
verbose = False
logger
fit(problems: List[pm_rank.data.base.ForecastProblem], include_scores: bool = True) → Tuple[Dict[str, Any], Dict[str, int]] | Dict[str, int]

Fit the generalized Bradley-Terry model to the given problems.

This method estimates skill parameters for each forecaster using the MM algorithm and returns rankings based on these parameters. The skill parameters represent the relative predictive ability of each forecaster.

Parameters:
  • problems – List of ForecastProblem instances to evaluate.

  • include_scores – Whether to include scores in the results (default: True).

Returns:

Ranking results, either as a tuple of (scores, rankings) or just rankings.
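
A minimal usage sketch, assuming the installed package is importable as pm_rank and that problems is a list of ForecastProblem instances loaded elsewhere:

    from pm_rank.model import GeneralizedBT

    model = GeneralizedBT(method="MM", num_iter=200, threshold=1e-4)
    # With include_scores=True, fit returns both skill scores and rankings.
    scores, rankings = model.fit(problems, include_scores=True)
    print(rankings)  # e.g. {"forecaster_a": 1, "forecaster_b": 2, ...}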

class src.pm_rank.model.BrierScoringRule(negate: bool = True, verbose: bool = False)

Bases: ScoringRule

Brier scoring rule for evaluating probabilistic forecasts.

The Brier score is a quadratic proper scoring rule that measures the squared difference between predicted probabilities and actual outcomes. It is widely used in prediction markets and provides a good balance between rewarding accuracy and calibration.

Parameters:
  • negate – Whether to negate the scores so that higher values are better (default: True).

  • verbose – Whether to enable verbose logging (default: False).

negate = True
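
For reference, the standard multi-outcome Brier computation reads as below. This is an illustrative sketch; the class's internal vectorization and scaling may differ:

    import numpy as np

    def brier_score(probs: np.ndarray, outcome_idx: int, negate: bool = True) -> float:
        # Squared distance between the forecast vector and the one-hot
        # encoding of the realized outcome.
        outcome = np.zeros_like(probs)
        outcome[outcome_idx] = 1.0
        score = float(np.sum((probs - outcome) ** 2))
        # With negate=True (the class default), higher values are better.
        return -score if negate else score
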
class src.pm_rank.model.SphericalScoringRule(verbose: bool = False)

Bases: ScoringRule

Spherical scoring rule for evaluating probabilistic forecasts.

The spherical scoring rule normalizes probability vectors to unit vectors and measures the cosine similarity with the actual outcome. This rule is less sensitive to extreme probability values compared to the logarithmic rule.

Parameters:

verbose – Whether to enable verbose logging (default: False).

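The rule reduces to the probability assigned to the realized outcome divided by the L2 norm of the forecast vector. A minimal sketch, assuming the standard spherical rule:

    import numpy as np

    def spherical_score(probs: np.ndarray, outcome_idx: int) -> float:
        # Cosine similarity between the normalized forecast vector and the
        # one-hot outcome vector reduces to p[i] / ||p||_2.
        return float(probs[outcome_idx] / np.linalg.norm(probs))
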
class src.pm_rank.model.LogScoringRule(clip_prob: float = 0.01, verbose: bool = False)

Bases: ScoringRule

Logarithmic scoring rule for evaluating probabilistic forecasts.

The logarithmic scoring rule is a proper scoring rule that rewards forecasters based on the logarithm of their predicted probability for the actual outcome. This rule heavily penalizes overconfident predictions and rewards well-calibrated forecasts.

Parameters:
  • clip_prob – Minimum probability value to prevent log(0) (default: 0.01).

  • verbose – Whether to enable verbose logging (default: False).

clip_prob = 0.01
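
A minimal sketch of the clipped logarithmic score, assuming the standard form; the class may clip or aggregate slightly differently:

    import numpy as np

    def log_score(probs: np.ndarray, outcome_idx: int, clip_prob: float = 0.01) -> float:
        # Clip the probability of the realized outcome away from zero so an
        # overconfident wrong forecast incurs a large but finite penalty.
        p = max(float(probs[outcome_idx]), clip_prob)
        return float(np.log(p))  # closer to 0 is better
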
class src.pm_rank.model.AverageReturn(num_money_per_round: int = 1, risk_aversion: float = 0.0, use_approximate: bool = False, verbose: bool = False)

Average Return Model for ranking forecasters based on their expected market returns.

This class implements a ranking algorithm that evaluates forecasters based on how much money they could earn from prediction markets using different risk aversion strategies. The model calculates expected returns for each forecaster and ranks them accordingly.

Parameters:
  • num_money_per_round – Amount of money to bet per round (default: 1).

  • risk_aversion – Risk aversion parameter between 0 and 1 (default: 0.0).

  • use_approximate – Whether to use the approximate CRRA betting strategy (default: False).

  • verbose – Whether to enable verbose logging (default: False).

Raises:

AssertionError – If risk_aversion is not between 0 and 1.
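
The risk_aversion parameter suggests a CRRA (constant relative risk aversion) utility; a common parameterization is sketched below. Whether AverageReturn uses exactly this form is an assumption:

    import math

    def crra_utility(wealth: float, gamma: float) -> float:
        # gamma = 0 is risk-neutral; gamma -> 1 approaches log utility
        # (Kelly-style betting). Assumed parameterization only.
        if gamma == 1.0:
            return math.log(wealth)
        return wealth ** (1.0 - gamma) / (1.0 - gamma)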

num_money_per_round = 1
risk_aversion = 0.0
use_approximate = False
verbose = False
logger
fit(problems: List[pm_rank.data.base.ForecastProblem], include_scores: bool = True, include_per_problem_info: bool = False) → Tuple[Dict[str, Any], Dict[str, int]] | Dict[str, int]

Fit the average return model to the given problems.

This method processes all problems at once and returns the final rankings based on average returns across all problems.

Parameters:
  • problems – List of ForecastProblem instances to process.

  • include_scores – Whether to include scores in the results (default: True).

  • include_per_problem_info – Whether to include per-problem info in the results (default: False).

Returns:

Ranking results, either as a tuple of (scores, rankings) or just rankings. If include_per_problem_info is True, returns a tuple of (scores, rankings, per_problem_info).
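
A usage sketch, assuming problems is a list of ForecastProblem instances loaded elsewhere:

    from pm_rank.model import AverageReturn

    model = AverageReturn(num_money_per_round=1, risk_aversion=0.5)
    scores, rankings = model.fit(problems, include_scores=True)
    # Request the per-problem breakdown as well:
    scores, rankings, per_problem_info = model.fit(
        problems, include_scores=True, include_per_problem_info=True
    )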

fit_stream(problem_iter: Iterator[List[pm_rank.data.base.ForecastProblem]], include_scores: bool = True) → Dict[int, Tuple[Dict[str, Any], Dict[str, int]] | Dict[str, int]]

Fit the model to streaming problems and return incremental results.

This method processes problems as they arrive and returns rankings after each batch, allowing for incremental analysis of forecaster performance.

Parameters:
  • problem_iter – Iterator over batches of ForecastProblem instances.

  • include_scores – Whether to include scores in the results (default: True).

Returns:

Mapping of batch indices to ranking results.
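
A streaming sketch; batch_iter is assumed to yield lists of ForecastProblem instances (for example, from a data loader's streaming interface):

    model = AverageReturn(risk_aversion=0.0)
    results = model.fit_stream(batch_iter, include_scores=False)
    for batch_idx, rankings in results.items():
        # Rankings reflect all problems seen up to and including this batch.
        print(batch_idx, rankings)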

fit_stream_with_timestamp(problem_time_iter: Iterator[Tuple[str, List[pm_rank.data.base.ForecastProblem]]], include_scores: bool = True) → collections.OrderedDict

Fit the model to streaming problems with timestamps and return incremental results.

This method processes problems with associated timestamps and returns rankings after each batch, maintaining chronological order.

Parameters:
  • problem_time_iter – Iterator over (timestamp, problems) tuples.

  • include_scores – Whether to include scores in the results (default: True).

Returns:

Chronologically ordered mapping of timestamps to ranking results.
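
The timestamped variant works the same way; the (timestamp, problems) tuples below are hypothetical:

    results = model.fit_stream_with_timestamp(
        iter([("2024-01-01", batch_1), ("2024-01-02", batch_2)])
    )
    for timestamp, rankings in results.items():  # chronological order
        print(timestamp, rankings)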

fit_by_category(problems: List[pm_rank.data.base.ForecastProblem], include_scores: bool = True, stream_with_timestamp: bool = False, stream_increment_by: Literal['day', 'week', 'month'] = 'day', min_bucket_size: int = 1) → Tuple[Dict[str, Any], Dict[str, int]] | Dict[str, int]

Fit the average return model to the given problems by category.

This method groups the given problems by category and returns rankings based on average returns within each category; results can optionally be streamed over time in day, week, or month increments.

Parameters:
  • problems – List of ForecastProblem instances to process.

  • include_scores – Whether to include scores in the results (default: True).

  • stream_with_timestamp – Whether to stream problems with timestamps (default: False).

  • stream_increment_by – The increment by which to stream problems (default: “day”).

  • min_bucket_size – The minimum number of problems required for a bucket to be included (default: 1).

Returns:

Ranking results, either as a tuple of (scores, rankings) or just rankings.
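
A usage sketch; the keyword values are illustrative:

    results = model.fit_by_category(
        problems,
        include_scores=False,
        stream_with_timestamp=True,
        stream_increment_by="week",
        min_bucket_size=5,
    )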

src.pm_rank.model.spearman_correlation(rank_dict_a: Dict[str, int], rank_dict_b: Dict[str, int]) → float

Compute the Spearman correlation between two rankings. Reference: https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient

src.pm_rank.model.kendall_correlation(rank_dict_a: Dict[str, int], rank_dict_b: Dict[str, int]) → float

Compute the Kendall correlation between two rankings. Reference: https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient
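
A quick sketch comparing two hypothetical rankings (the expected values follow from the standard formulas referenced above):

    from pm_rank.model import spearman_correlation, kendall_correlation

    rank_a = {"alice": 1, "bob": 2, "carol": 3}
    rank_b = {"alice": 2, "bob": 1, "carol": 3}
    print(spearman_correlation(rank_a, rank_b))  # 0.5
    print(kendall_correlation(rank_a, rank_b))   # ~0.33 (1 discordant pair of 3)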