src.pm_rank.model.irt

Pyro-based Item Response Theory (IRT) Models for Ranking Forecasters.

This module implements Item Response Theory models using Pyro for probabilistic programming. IRT models are used to estimate latent abilities of forecasters and difficulty/discrimination parameters of prediction problems based on their performance patterns.

The module provides two inference methods:

  • SVI (Stochastic Variational Inference): Fast approximate inference using variational methods

  • NUTS (No-U-Turn Sampler): Asymptotically exact inference using Hamiltonian Monte Carlo sampling

Key Concepts:

  • Item Response Theory: A psychometric framework that models the relationship between a person’s latent ability and their probability of answering items correctly.

  • Forecaster Ability (θ): Latent parameter representing each forecaster’s skill level.

  • Problem Difficulty (b): Parameter representing how difficult each prediction problem is.

  • Problem Discrimination (a): Parameter representing how well each problem distinguishes between forecasters of different abilities.

  • Category Parameters (p): Parameters for the discretized scoring bins used in the model.

Reference: https://en.wikipedia.org/wiki/Item_response_theory
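
The docstring does not write out the likelihood explicitly; as an illustrative sketch only (the exact parameterization lives in the Pyro model code), a graded-response-style 2PL form consistent with the parameters above is:

    P(y_{ij} \ge k \mid \theta_i, a_j, b_j, p) = \sigma\big( a_j(\theta_i - b_j) - p_k \big)

where σ is the logistic function, θ_i is forecaster i's ability, a_j and b_j are problem j's discrimination and difficulty, and p_k is the threshold associated with score bin k.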

Classes

IRTModel

Item Response Theory model for ranking forecasters using Pyro.

MCMCConfig

Configuration for MCMC (Markov Chain Monte Carlo) inference using NUTS sampler.

SVIConfig

Configuration for SVI (Stochastic Variational Inference) optimization.

IRTObs

An internal helper class that transforms data from the ForecastProblem level into the internal format used by the IRT model.

Package Contents

class src.pm_rank.model.irt.IRTModel(n_bins: int = 6, use_empirical_quantiles: bool = False, verbose: bool = False)

Bases: object

Item Response Theory model for ranking forecasters using Pyro.

This class implements an IRT model that estimates latent abilities of forecasters and difficulty/discrimination parameters of prediction problems. The model uses discretized scoring bins and supports both SVI and MCMC inference methods.

The IRT model assumes that the probability of a forecaster achieving a certain score on a problem depends on their latent ability (θ), the problem’s difficulty (b), the problem’s discrimination (a), and category parameters (p) for the scoring bins.

Parameters:
  • n_bins – Number of bins for discretizing continuous scores (default: 6).

  • use_empirical_quantiles – Whether to use empirical quantiles for binning instead of uniform bins (default: False).

  • verbose – Whether to enable verbose logging (default: False).

n_bins = 6
use_empirical_quantiles = False
irt_obs = None
method = None
verbose = False
logger
fit(problems: List[pm_rank.data.base.ForecastProblem], include_scores: bool = True, method: Literal['SVI', 'NUTS'] = 'SVI', config: MCMCConfig | SVIConfig | None = None) → Tuple[Dict[str, Any], Dict[str, int]] | Dict[str, int]

Fit the IRT model to the given problems and return rankings.

This method fits the IRT model using either SVI or MCMC inference, depending on the specified method. The model estimates latent abilities for each forecaster and difficulty/discrimination parameters for each problem.

Parameters:
  • problems – List of ForecastProblem instances to fit the model to.

  • include_scores – Whether to include scores in the results (default: True).

  • method – Inference method to use (“SVI” for fast approximate inference or “NUTS” for asymptotically exact MCMC inference) (default: “SVI”).

  • config – Configuration object for the chosen inference method. Must be MCMCConfig for “NUTS” or SVIConfig for “SVI”.

Returns:

If include_scores=True, returns a tuple of (scores_dict, rankings_dict). If include_scores=False, returns only rankings_dict. scores_dict maps forecaster IDs to their estimated abilities. rankings_dict maps forecaster IDs to their ranks (1-based).

Raises:

AssertionError if method is invalid or config is not provided.
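
A minimal usage sketch. Note the documented class paths start with src.pm_rank, but the example assumes the installed package imports as pm_rank (as the ForecastProblem reference above suggests), and that problems has been loaded elsewhere:

    from pm_rank.model.irt import IRTModel, SVIConfig

    # Hedged sketch: `problems` (a List[ForecastProblem]) is assumed to be
    # prepared elsewhere; constructing it is outside this module.
    model = IRTModel(n_bins=6, use_empirical_quantiles=False, verbose=True)
    svi_config = SVIConfig(optimizer="Adam", num_steps=1000, learning_rate=0.01, device="cpu")

    scores, rankings = model.fit(problems, include_scores=True, method="SVI", config=svi_config)

    # scores maps forecaster IDs to estimated abilities (theta);
    # rankings maps forecaster IDs to their 1-based ranks.
    forecasters_ranked_first = [f for f, r in rankings.items() if r == 1]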

get_problem_level_parameters() → Tuple[Dict[str, float], Dict[str, float]]

Get problem difficulty and discrimination parameters.

Returns the estimated difficulty and discrimination parameters for each problem after the model has been fitted. These parameters provide insights into how challenging each problem is and how well it distinguishes between forecasters of different abilities.

Returns:

A tuple of (difficulties_dict, discriminations_dict). difficulties_dict maps problem IDs to their difficulty parameters (b). discriminations_dict maps problem IDs to their discrimination parameters (a).

Raises:

AssertionError if the model has not been fitted yet.
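
Continuing the sketch above, problem-level parameters can be inspected once fit() has run:

    difficulties, discriminations = model.get_problem_level_parameters()

    # Larger b -> the problem was harder overall; larger a -> the problem
    # separates strong and weak forecasters more sharply.
    hardest_problem = max(difficulties, key=difficulties.get)
    most_informative = max(discriminations, key=discriminations.get)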

class src.pm_rank.model.irt.MCMCConfig

Bases: pydantic.BaseModel

Configuration for MCMC (Markov Chain Monte Carlo) inference using NUTS sampler.

This configuration class defines parameters for running Hamiltonian Monte Carlo sampling with the No-U-Turn Sampler (NUTS) algorithm, which provides asymptotically exact posterior inference for the IRT model.

Parameters:
  • total_samples – The total number of samples to draw from the posterior distribution (default: 1000).

  • warmup_steps – The number of warmup steps to run before sampling (default: 100).

  • num_workers – The number of workers to use for parallelization. A customized multiprocessing approach is used because Pyro’s default parallel-chain implementation can be very slow, which is why this parameter is not named num_chains (default: 1).

  • device – The device to use for the MCMC engine (“cpu” or “cuda”) (default: “cpu”).

  • save_result – Whether to save the result to a file (default: False).

total_samples: int
warmup_steps: int
num_workers: int
device: Literal['cpu', 'cuda']
save_result: bool
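
A hedged sketch of configuring NUTS-based inference with the fields documented above (values are illustrative, not recommendations; import path and `problems` as in the earlier fit() sketch):

    from pm_rank.model.irt import IRTModel, MCMCConfig

    mcmc_config = MCMCConfig(
        total_samples=1000,   # posterior samples to draw
        warmup_steps=100,     # NUTS warmup/adaptation steps before sampling
        num_workers=1,        # custom multiprocessing; deliberately not named num_chains
        device="cpu",
        save_result=False,
    )

    # `problems` is assumed to be a List[ForecastProblem] prepared elsewhere.
    rankings = IRTModel(n_bins=6).fit(problems, include_scores=False, method="NUTS", config=mcmc_config)
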
class src.pm_rank.model.irt.SVIConfig

Bases: pydantic.BaseModel

Configuration for SVI (Stochastic Variational Inference) optimization.

This configuration class defines parameters for running variational inference using stochastic gradient descent, which provides fast approximate posterior inference for the IRT model.

Parameters:
  • optimizer – The optimizer to use for the SVI engine (“Adam” or “SGD”) (default: “Adam”).

  • num_steps – The number of steps to run for the SVI engine (default: 1000).

  • learning_rate – The learning rate to use for the SVI engine (default: 0.01).

  • device – The device to use for the SVI engine (“cpu” or “cuda”) (default: “cpu”).

optimizer: Literal['Adam', 'SGD']
num_steps: int
learning_rate: float
device: Literal['cpu', 'cuda']
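
The corresponding SVIConfig sketch, mirroring the field list above (values again illustrative):

    from pm_rank.model.irt import SVIConfig

    svi_config = SVIConfig(
        optimizer="Adam",    # or "SGD"
        num_steps=1000,      # number of SVI optimization steps
        learning_rate=0.01,
        device="cpu",        # "cuda" to run on a GPU
    )
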
class src.pm_rank.model.irt.IRTObs

An internal helper class that transforms data from the ForecastProblem level into the internal format used by the IRT model.

The forecaster_id_to_idx and problem_id_to_idx dictionaries map forecaster and problem IDs to integer indices. This is needed because the Pyro model indexes its latent parameters by position rather than by arbitrary ID strings.

Parameters:
  • forecaster_ids – A tensor of shape (k,) with the forecaster ids

  • problem_ids – A tensor of shape (k,) with the problem ids

  • forecaster_id_to_idx – A dictionary with the forecaster ids as keys and the indices as values

  • problem_id_to_idx – A dictionary with the problem ids as keys and the indices as values

  • scores – A tensor of shape (k,) with the raw scores of the forecasts (computed from scoring rules)

  • discretized_scores – A tensor of shape (k,) with the discretized scores of the forecasts

  • anchor_points – A tensor of shape (n_bins,) with the anchor points of the discretized scores

forecaster_ids: torch.Tensor
problem_ids: torch.Tensor
forecaster_id_to_idx: Dict[str, int]
problem_id_to_idx: Dict[str, int]
scores: torch.Tensor
discretized_scores: torch.Tensor
anchor_points: torch.Tensor
property forecaster_idx_to_id: Dict[int, str]
property problem_idx_to_id: Dict[int, str]
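
IRTObs is internal, so user code rarely builds one directly; the hypothetical snippet below only illustrates how the per-observation tensors and the ID-to-index maps are meant to line up (all values are made up for illustration):

    import torch

    # Two forecasters, two problems, three observations (k = 3).
    forecaster_id_to_idx = {"alice": 0, "bob": 1}
    problem_id_to_idx = {"q1": 0, "q2": 1}

    forecaster_ids = torch.tensor([0, 0, 1])       # alice, alice, bob
    problem_ids = torch.tensor([0, 1, 1])          # q1, q2, q2
    discretized_scores = torch.tensor([5, 2, 3])   # score-bin index per observation

    # Observation i pairs forecaster forecaster_ids[i] with problem problem_ids[i],
    # so per-forecaster and per-problem parameters can be gathered per observation
    # (e.g. theta[forecaster_ids] and b[problem_ids]) inside the Pyro model.

    # The idx-to-id properties are simply the inverse maps:
    forecaster_idx_to_id = {idx: fid for fid, idx in forecaster_id_to_idx.items()}
    assert forecaster_idx_to_id[1] == "bob"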