Rescaling
- class irtorch.rescale.Scale(invertible: bool = False)
Bases: ABC
Abstract base class for Item Response Theory model scale transformations. All scale transformations should inherit from this class.
Note that you can make custom transformations by inheriting from this class. A class instance can then be supplied to irtorch.models.BaseIRTModel.add_scale_transformation() to apply the transformation to the latent variables of the model.
- abstract inverse(transformed_theta: Tensor) → Tensor
Puts the scores back to the original theta scale.
- Parameters
transformed_theta (torch.Tensor) – A 2D tensor containing transformed theta scores. Each column represents one latent variable.
- Returns
A 2D tensor containing theta scores on the original scale.
- Return type
torch.Tensor
- abstract jacobian(theta: Tensor) → Tensor
Computes the Jacobian matrix of the scale transformations for each row in the input theta scores.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
- Returns
A torch tensor with the Jacobians for each theta score. Dimensions are (theta rows, latent variables, latent variables) where the last two are the Jacobians for each row.
- Return type
torch.Tensor
- abstract transform(theta: Tensor) → Tensor
Transforms the input theta scores into the new scale.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
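The three abstract methods above define the contract a custom transformation has to fulfil. As a rough, library-independent sketch of that contract, the example below uses plain Python lists instead of torch tensors. `ScaleSketch` and `AffineScale` are hypothetical names introduced only for illustration; a real custom transformation would inherit from irtorch.rescale.Scale and operate on 2D tensors.

```python
from abc import ABC, abstractmethod


class ScaleSketch(ABC):
    """Hypothetical stand-in mirroring the three abstract methods listed above."""

    @abstractmethod
    def transform(self, theta): ...

    @abstractmethod
    def inverse(self, transformed_theta): ...

    @abstractmethod
    def jacobian(self, theta): ...


class AffineScale(ScaleSketch):
    """Maps each latent variable to slope * theta + intercept."""

    def __init__(self, slope=10.0, intercept=50.0):
        if slope == 0:
            raise ValueError("slope must be non-zero for the map to be invertible")
        self.slope = slope
        self.intercept = intercept

    def transform(self, theta):
        return [[self.slope * t + self.intercept for t in row] for row in theta]

    def inverse(self, transformed_theta):
        return [[(z - self.intercept) / self.slope for z in row] for row in transformed_theta]

    def jacobian(self, theta):
        # One (latent variables x latent variables) matrix per input row. It is
        # diagonal because each latent variable is transformed independently.
        n = len(theta[0]) if theta else 0
        return [
            [[self.slope if i == j else 0.0 for j in range(n)] for i in range(n)]
            for _ in theta
        ]


theta = [[-1.0, 0.5], [0.0, 2.0]]  # rows are respondents, columns latent variables
scale = AffineScale()
rescaled = scale.transform(theta)
restored = scale.inverse(rescaled)
jac = scale.jacobian(theta)
```

The nested-list Jacobian has the same (theta rows, latent variables, latent variables) layout that the jacobian() documentation describes.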
- class irtorch.rescale.Bit(model: BaseIRTModel, population_theta: torch.Tensor | None = None, start_theta: torch.Tensor | None = None, items: list[int] | None = None, grid_points: int = 4000, mc_start_theta_approx: bool = False, **kwargs)
Bases: Scale
Bit scale transformation, as introduced by Wallmark and Wiberg [16].
- Parameters
model (BaseIRTModel) – The IRT model to use for bit scale computation.
population_theta (torch.Tensor, optional) – Theta scores from the population. Usually the training data. Used to find good starting values for the grid of theta scores, which are then used for the bit transformation. Recommended for models whose theta distributions commonly contain values far from 0. (default is None)
start_theta (torch.Tensor, optional) – The starting theta scores for the bit scale computation. If None, the minimum theta scores are used. (default is None)
items (list[int], optional) – The item indices for the items to use to compute the bit scores. (default is None and uses all items)
grid_points (int, optional) – The number of grid points to use for computing bit scores. More points lead to more accurate results. (default is 4000)
mc_start_theta_approx (bool, optional) – For multiple choice models. Whether to approximate the starting theta scores using simulated random guesses. If True, runs bit_score_starting_theta_mc(). (default is False)
**kwargs – Additional keyword arguments for the starting theta approximation method. See bit_score_starting_theta_mc().
Notes
First, item bit scores for each item \(j\) are computed from \(\mathbf{\theta}\) scores as follows:
\[B_j(\mathbf{\theta}) = \int_{t=\mathbf{\theta}^{(0)}}^{\mathbf{\theta}} \left|\frac{dH_j(t)}{dt}\right| dt,\]
where
- \(\mathbf{\theta}^{(0)}\) is the minimum \(\mathbf{\theta}\), and
- \(H_j(\mathbf{\theta})\) is the entropy of item \(j\) as a function of \(\mathbf{\theta}\).
The total bit scores \(B(\mathbf{\theta})\) are then the sum of the item scores:
\[B(\mathbf{\theta}) = \sum_{j=1}^{J} B_j(\mathbf{\theta}).\]
Examples
>>> import irtorch
>>> from irtorch.models import MonotoneNN
>>> from irtorch.estimation_algorithms import MML
>>> from irtorch.rescale import Bit
>>> data, mc_correct = irtorch.load_dataset.swedish_sat_quantitative()
>>> model = MonotoneNN(data, mc_correct=mc_correct)
>>> model.fit(train_data=data, algorithm=MML())
>>> thetas = model.latent_scores(data)
>>> # Initialize the scale transformation
>>> # mc_start_theta_approx sets the starting theta to the approximate score of a randomly guessing respondent
>>> bit = Bit(model, population_theta=thetas, mc_start_theta_approx=True)
>>> # Supply the new scale to the model
>>> model.add_scale_transformation(bit)
>>> # Estimate thetas on the transformed scale
>>> rescaled_thetas = model.latent_scores(data)
>>> # Or alternatively by directly converting the old ones
>>> rescaled_thetas = model.transform_theta(thetas)
>>> # Plot the differences
>>> model.plot.latent_score_distribution(thetas).show()
>>> model.plot.latent_score_distribution(rescaled_thetas).show()
>>> # Plot an item on the bit transformed scale
>>> model.plot.item_probabilities(1).show()
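The bit score formula in the Notes can be approximated numerically on a grid, since the integral of \(|dH_j/dt|\) is the total variation of the entropy curve. The sketch below is a plain-Python illustration under several assumptions that are not taken from the library: a single latent variable, dichotomous items with logistic response curves, base-2 entropy, and hypothetical item parameters. It is not the implementation used by Bit.

```python
import math


def item_entropy(theta, a=1.0, b=0.0):
    """Entropy in bits of a dichotomous item with a logistic response curve."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)


def bit_score(theta, start_theta, items, grid_points=4000):
    """Approximate B(theta) = sum_j integral |dH_j/dt| dt on an even grid."""
    step = (theta - start_theta) / grid_points
    grid = [start_theta + i * step for i in range(grid_points + 1)]
    total = 0.0
    for a, b in items:
        entropies = [item_entropy(t, a, b) for t in grid]
        # The integral of |dH/dt| equals the total variation of H along the grid.
        total += sum(abs(h1 - h0) for h0, h1 in zip(entropies, entropies[1:]))
    return total


items = [(1.0, 0.0), (1.5, 0.5)]  # hypothetical (discrimination, difficulty) pairs
low = bit_score(-1.0, start_theta=-6.0, items=items)
high = bit_score(2.0, start_theta=-6.0, items=items)
single = bit_score(6.0, start_theta=-6.0, items=[(1.0, 0.0)])
```

Because the absolute value accumulates entropy changes in both directions, the score keeps increasing with theta even after an item's entropy has peaked and started to fall, which is why `high` exceeds `low` here.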
- bit_score_starting_theta_mc(theta_estimation: str = 'ML', ml_map_device: str = 'cpu', lbfgs_learning_rate: float = 0.25, items: list[int] | None = None, guessing_probabilities: list[float] | None = None, guessing_iterations: int = 10000)
For multiple choice models, approximate the starting theta score \(\mathbf{\theta}^{(0)}\) from which to compute bit scores. See notes under bit_scores() for the bit score formula.
- Parameters
theta_estimation (str, optional) – Method used to obtain the theta scores. Can be 'NN', 'ML', 'EAP' or 'MAP' for neural network, maximum likelihood, expected a posteriori or maximum a posteriori respectively. (default is 'ML')
ml_map_device (str, optional) – For ML and MAP. The device to use for computation. Can be 'cpu' or 'cuda'. (default is 'cpu')
lbfgs_learning_rate (float, optional) – For ML and MAP. The learning rate to use for the LBFGS optimizer. (default is 0.25)
items (list[int], optional) – The item indices for the items to use to compute the bit scores. (default is None and uses all items)
guessing_probabilities (list[float], optional) – The guessing probability for each item. The same length as the number of items. Guessing is not supported for polytomously scored items and the probabilities for them will be ignored. (default is None and uses no guessing or, for multiple choice models, 1 over the number of item categories)
guessing_iterations (int, optional) – The number of iterations to use for approximating a minimum theta when guessing is incorporated. (default is 10000)
- Returns
A tensor with all the starting theta values.
- Return type
torch.Tensor
- inverse(transformed_theta)
Puts the scores back to the original theta scale.
- Parameters
transformed_theta (torch.Tensor) – A 2D tensor containing transformed theta scores. Each column represents one latent variable.
- Returns
A 2D tensor containing theta scores on the original scale.
- Return type
torch.Tensor
- jacobian(theta: Tensor) → Tensor
Computes the gradients of the bit scores with respect to the input theta scores.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
- Returns
A torch tensor with the gradients for each theta score. Dimensions are (theta rows, latent variables, latent variables) where the last two are the Jacobians.
- Return type
torch.Tensor
- set_start_theta(start_theta: Tensor)
Sets the starting theta scores for the bit scale computation.
- Parameters
start_theta (torch.Tensor) – The starting theta scores for the bit scale computation.
- transform(theta: Tensor) → Tensor
Transforms \(\mathbf{\theta}\) scores into bit scores \(B(\mathbf{\theta})\).
- Parameters
theta (torch.Tensor) – A 2D tensor. Columns are latent variables and rows are respondents.
- Returns
A 2D tensor with bit score scale scores for each respondent across the rows together with another tensor with start_theta.
- Return type
torch.Tensor
- transform_to_1D(theta: Tensor) → Tensor
Transforms \(\mathbf{\theta}\) scores of a multi-dimensional model into one-dimensional bit scores \(B(\mathbf{\theta})\). Equivalent to transform() for one-dimensional models.
- Parameters
theta (torch.Tensor) – A 2D tensor. Columns are latent variables and rows are respondents.
- Returns
A 2D tensor with bit score scale scores for each respondent across the rows together with another tensor with start_theta.
- Return type
torch.Tensor
- class irtorch.rescale.Flow(latent_variables: int)
Bases: Scale
Normalizing flow transformation of IRT theta scales using rational quadratic splines as per Durkan et al. [7]. Supports gradient computation and the transformation is invertible.
- Parameters
latent_variables (int) – The number of latent variables.
Examples
>>> import irtorch
>>> from irtorch.models import GradedResponse
>>> from irtorch.estimation_algorithms import MML
>>> from irtorch.rescale import Flow
>>> data = irtorch.load_dataset.swedish_national_mathematics_1()
>>> model = GradedResponse(data)
>>> model.fit(train_data=data, algorithm=MML())
>>> thetas = model.latent_scores(data)
>>> # Initialize and fit the flow scale transformation. Supply it to the model.
>>> flow = Flow(1)
>>> flow.fit(thetas)
>>> model.add_scale_transformation(flow)
>>> # Estimate thetas on the transformed scale
>>> rescaled_thetas = model.latent_scores(data)
>>> # Or alternatively by directly converting the old ones
>>> rescaled_thetas = model.transform_theta(thetas)
>>> # Plot the differences
>>> model.plot.latent_score_distribution(thetas).show()
>>> model.plot.latent_score_distribution(rescaled_thetas).show()
>>> # Put the thetas back to the original scale
>>> original_thetas = model.inverse_transform_theta(rescaled_thetas)
>>> # Plot an item on the flow transformed scale
>>> model.plot.item_probabilities(1).show()
- fit(theta: Tensor, transformation: irtorch.torch_modules.rational_quadratic_spline.RationalQuadraticSpline | None = None, distribution: torch.distributions.distribution.Distribution | None = None, batch_size: int | None = None, learning_rate: float = 0.01, learning_rate_updates_before_stopping: int = 2, evaluation_interval_size: int = 50, max_epochs: int = 1500, device: str = 'cpu', **kwargs)
Fits the normalizing flow to the data. Typically used from within an IRT model instance. Use batch_size if the data is too large to fit in memory.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores of the population. Usually the training data. Each column represents one latent variable.
transformation (RationalQuadraticSpline, optional) – The transformation to apply to the data.
distribution (Distribution, optional) – The distribution to apply to the latent variables. If None, a standard normal distribution is used.
batch_size (int, optional) – The batch size for the data loader. (default is None and uses the full dataset)
learning_rate (float, optional) – The learning rate for the optimizer. (default is 0.01)
learning_rate_updates_before_stopping (int, optional) – The number of learning rate updates before stopping the training. (default is 2)
evaluation_interval_size (int, optional) – The number of iterations between each model evaluation during training. (default is 50)
max_epochs (int, optional) – The maximum number of epochs to train the flow. (default is 1500)
device (str, optional) – The device to use for the computation. (default is 'cpu')
**kwargs – Additional keyword arguments for the irtorch.torch_modules.RationalQuadraticSpline constructor. By default, the spline is set to have 50 bins and the input bounds are set to -5.5 and 5.5 and output bounds to -3.0 and 3.0.
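A normalizing flow is typically fitted by maximizing the likelihood of the data under the change of variables formula, \(\log p(\theta) = \log p_Z(g(\theta)) + \log |g'(\theta)|\). The sketch below illustrates that objective in plain Python with an affine map standing in for the rational quadratic spline and a standard normal target. The function name `flow_nll` and the simulated theta scores are assumptions made for illustration, and the closed-form fit replaces the gradient-based training that fit() performs.

```python
import math
import random


def flow_nll(theta, shift, scale):
    """Mean negative log-likelihood of theta under z = (theta - shift) / scale, z ~ N(0, 1)."""
    log_det = -math.log(scale)  # log |dz/dtheta| for the affine map
    total = 0.0
    for t in theta:
        z = (t - shift) / scale
        log_base = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
        total += log_base + log_det
    return -total / len(theta)


random.seed(0)
theta = [random.gauss(1.5, 2.0) for _ in range(2000)]  # simulated theta scores

# For an affine flow the likelihood is maximized in closed form by the
# sample mean and the (biased) sample standard deviation.
shift = sum(theta) / len(theta)
scale = math.sqrt(sum((t - shift) ** 2 for t in theta) / len(theta))

fitted_nll = flow_nll(theta, shift, scale)
identity_nll = flow_nll(theta, 0.0, 1.0)
z = [(t - shift) / scale for t in theta]  # scores on the transformed scale
```

A spline-based flow optimizes the same kind of objective but can also remove skewness and other non-normal features that an affine map leaves untouched.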
- inverse(transformed_theta: Tensor) → Tensor
Puts the scores back to the original theta scale.
- Parameters
transformed_theta (torch.Tensor) – A 2D tensor containing transformed theta scores. Each column represents one latent variable.
- Returns
A 2D tensor containing theta scores on the original scale.
- Return type
torch.Tensor
- jacobian(theta: Tensor) → Tensor
Computes the Jacobian of scale scores for each \(j\) with respect to the input theta scores.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
- Returns
A tensor with the Jacobian for each input row. Dimensions are (theta rows, latent variables, latent variables) where the last two are the Jacobians.
- Return type
torch.Tensor
- transform(theta: Tensor) → Tensor
Transforms the input theta scores into the new scale.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
- class irtorch.rescale.LinkCommonItems(model_from: BaseIRTModel, model_to: BaseIRTModel, model_from_common_item_indices: list[int], model_to_common_item_indices: list[int], method: str = 'spline', inverted: bool = False, **kwargs)
Bases: Scale
Link theta scales from two different IRT models to the same scale using common (anchor) items. Either rational quadratic splines [7] or monotonic neural networks [12] can be used to link the scales. Currently only supports unidimensional models.
- Parameters
model_from (BaseIRTModel) – The IRT model whose scale is to be transformed.
model_to (BaseIRTModel) – The scale of model_from will be linked to the scale of model_to.
model_from_common_item_indices (list[int]) – The indices of the items in model_from that are also in model_to (first item is index 0).
model_to_common_item_indices (list[int]) – The indices of the items in model_to that are also in model_from (first item is index 0).
method (str, optional) – The method to use for linking the scales. Either 'spline' or 'neuralnet'. Default is 'spline'. Note that the spline uses a fixed range of -5.5 to 5.5 for input values and a learned output range with initial values of -5.5 to 5.5. If latent scores outside this range are common for your models, you may need to adjust the bounds. See irtorch.torch_modules.RationalQuadraticSpline for more information.
inverted (bool, optional) – Set to True if the theta scale of one model is inverted. Default is False.
**kwargs – Additional keyword arguments for the irtorch.torch_modules.RationalQuadraticSpline constructor when method is 'spline'. When method is 'neuralnet', the number of neurons in the hidden layer can be set with the neurons argument. Note that the number of neurons must be divisible by 3. Default is 9. By default, the spline is set to have 50 bins and the input bounds are set to -5.5 and 5.5 and output bounds to -3.0 and 3.0.
Notes
We have two models fitted using data from two different populations, P and Q. We also have some items in common between the models for the purpose of linking. Let \(\theta_P\) and \(\theta_Q\) be points from the latent trait scales from the models fitted to P and Q respectively. Our goal is to find a linking function \(g\left(\theta_P\right)\) which takes a \(\theta_P\) and outputs the equivalent \(\theta_Q\). This is done by finding the \(g\left(\theta_P\right)\) that minimizes the KL divergence between the transformed and linked item curves:
\[\int \sum_{j \in \text{common}} \sum_x D_{KL}\left[P_P\left(X_j=x|\theta_P\right) \Vert P_Q \left(X_j=x| \theta_Q = g\left(\theta_P \right) \right)\right] f(\theta_P) d \theta_P\]
where
- \(P_P\left(X_j=x|\theta_P\right)\) is the probability for a score \(x\) on item \(j\) from the model fitted to population P given \(\theta_P\).
- \(P_Q \left(X_j=x| \theta_Q = g\left(\theta_P \right)\right)\) is the probability for a score \(x\) on item \(j\) from the model fitted to population Q given \(\theta_Q = g\left(\theta_P \right)\).
- \(f(\theta_P)\) is the density of the latent trait distribution in population P.
The sums are over the common items and their possible responses.
Examples
>>> import irtorch
>>> from irtorch.rescale import LinkCommonItems
>>> from irtorch.models import ThreeParameterLogistic
>>> from irtorch.estimation_algorithms import JML, MML
>>> data = irtorch.load_dataset.swedish_sat_binary()[:, :80]
>>> # As an illustration, we split the dataset into two parts and use 20 common items.
>>> # In practice, we would of course use different datasets for each model.
>>> data1 = data[:2500, :50]
>>> data2 = data[2500:, 30:]
>>> model1 = ThreeParameterLogistic(items=50)
>>> model2 = ThreeParameterLogistic(items=50)
>>> model1.fit(train_data=data1, algorithm=MML())
>>> model2.fit(train_data=data2, algorithm=JML())
>>> # Link the scale of model 2 to the model 1 scale using common items.
>>> link = LinkCommonItems(model2, model1, list(range(20)), list(range(30, 50)))
>>> link.fit(theta_from=model2.latent_scores(data2), learning_rate=0.01, max_epochs=1000)
>>> model2.add_scale_transformation(link)
>>> # Plot the transformation
>>> model2.plot.scale_transformations(input_theta_range=(-5, 5)).show()
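The linking objective in the Notes can be illustrated with a small plain-Python sketch. It rests on assumptions that are not part of the library: dichotomous two-parameter logistic items, a linear linking function \(g(\theta_P) = A\theta_P + B\) in place of a spline or neural network, a standard normal \(f(\theta_P)\) evaluated on a grid, and a brute-force search over candidate values instead of gradient-based training. All item parameters are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """Two-parameter logistic item response function."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def kl_bernoulli(p, q):
    """KL divergence between two Bernoulli distributions with success rates p and q."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def linking_loss(slope, intercept, items_p, items_q, grid):
    """Density-weighted KL between P curves and Q curves evaluated at g(theta_P)."""
    loss = 0.0
    for theta_p in grid:
        weight = math.exp(-0.5 * theta_p**2)  # unnormalized N(0, 1) density f(theta_P)
        theta_q = slope * theta_p + intercept  # linear stand-in for g
        for (a_p, b_p), (a_q, b_q) in zip(items_p, items_q):
            loss += weight * kl_bernoulli(
                p_correct(theta_p, a_p, b_p), p_correct(theta_q, a_q, b_q)
            )
    return loss


# Hypothetical common items on the P scale, and the same items expressed on a
# Q scale that satisfies theta_Q = 1.25 * theta_P + 0.5.
true_slope, true_intercept = 1.25, 0.5
items_p = [(1.0, -0.5), (1.5, 0.0), (0.8, 1.0)]
items_q = [(a / true_slope, true_slope * b + true_intercept) for a, b in items_p]

grid = [-4.0 + 0.1 * i for i in range(81)]
candidates = [(0.5 + 0.25 * i, -1.0 + 0.25 * j) for i in range(7) for j in range(9)]
best_slope, best_intercept = min(
    candidates, key=lambda c: linking_loss(c[0], c[1], items_p, items_q, grid)
)
```

The search recovers the slope and intercept used to construct the Q-scale items, because the loss is essentially zero only when the linked curves coincide with the P curves for every common item.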
- fit(theta_from: Tensor, batch_size: int | None = None, learning_rate: float = 0.01, learning_rate_updates_before_stopping: int = 1, evaluation_interval_size: int = 50, max_epochs: int = 1000, device: str = 'cpu')
Fits the linking transformation to the data. Typically used from within an IRT model instance. Use batch_size if the data is too large to fit in memory.
- Parameters
theta_from (torch.Tensor) – A 2D tensor containing latent variable theta scores from the model whose theta scale we are transforming (model_from). Usually the training data, which represents the population. Each column represents one latent variable.
batch_size (int, optional) – The batch size for the data loader. Default is None and uses no batches.
learning_rate (float, optional) – The learning rate for the optimizer. Default is 0.01.
learning_rate_updates_before_stopping (int, optional) – The number of learning rate updates before stopping the training. Default is 1.
evaluation_interval_size (int, optional) – The number of iterations between each model evaluation during training. Default is 50.
max_epochs (int, optional) – The maximum number of epochs to train the transformation. Default is 1000.
device (str, optional) – The device to use for the computation. Default is 'cpu'.
- inverse(transformed_theta: Tensor) → Tensor
Puts the scores back to the original theta scale.
- Parameters
transformed_theta (torch.Tensor) – A 2D tensor containing transformed theta scores. Each column represents one latent variable.
- Returns
A 2D tensor containing theta scores on the original scale.
- Return type
torch.Tensor
- jacobian(theta: Tensor) → Tensor
Computes the Jacobian of scale scores for each \(j\) with respect to the input theta scores.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
- Returns
A tensor with the Jacobian for each input row. Dimensions are (theta rows, latent variables, latent variables) where the last two are the Jacobians.
- Return type
torch.Tensor
- transform(theta: Tensor) → Tensor
Transforms the input theta scores into the new scale.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
- class irtorch.rescale.RankCDF(theta: Tensor, distributions: list[torch.distributions.distribution.Distribution] | None = None)
Bases: Scale
Rank-based inverse CDF transformation of IRT theta scales.
For each latent variable, finds the rank of each theta score in the population. For new data, finds the closest matching population ranks and uses the inverse CDF to find the equivalent scores in the chosen distribution(s).
Note that while this method is fast, it is heavily reliant on the input theta scores covering the entire range of the distribution(s). It is also not invertible, does not support gradient computation and the transformation is not unique one-to-one.
- Parameters
theta (torch.Tensor) – A large tensor of theta scores representing the population.
distributions (list[torch.distributions.Distribution], optional) – The distributions to use for the transformation of each latent variable. If None, normal distributions are used.
Examples
>>> import irtorch
>>> from irtorch.models import GradedResponse
>>> from irtorch.estimation_algorithms import MML
>>> from irtorch.rescale import RankCDF
>>> data = irtorch.load_dataset.swedish_national_mathematics_1()
>>> model = GradedResponse(data)
>>> model.fit(train_data=data, algorithm=MML())
>>> thetas = model.latent_scores(data)
>>> # Create a RankCDF instance and supply it to the model.
>>> model.add_scale_transformation(RankCDF(thetas))
>>> # Estimate thetas on the transformed scale
>>> rescaled_thetas = model.latent_scores(data)
>>> # Or alternatively by directly converting the old ones
>>> rescaled_thetas = model.transform_theta(thetas)
>>> # Plot the differences
>>> model.plot.latent_score_distribution(thetas).show()
>>> model.plot.latent_score_distribution(rescaled_thetas).show()
>>> # Plot an item on the transformed scale
>>> model.plot.item_probabilities(1).show()
- inverse(transformed_theta)
Puts the scores back to the original theta scale.
- Parameters
transformed_theta (torch.Tensor) – A 2D tensor containing transformed theta scores. Each column represents one latent variable.
- Returns
A 2D tensor containing theta scores on the original scale.
- Return type
torch.Tensor
- jacobian(theta: Tensor) → Tensor
Computes the gradients of scale scores with respect to the input theta scores.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
- Returns
A torch tensor with the gradients for each theta score. Dimensions are (theta rows, latent variables, latent variables) where the last two are the Jacobians.
- Return type
torch.Tensor
- transform(theta: Tensor) → Tensor
Transforms the input theta scores into the new scale.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
- Returns
A 2D tensor containing the transformed theta scores.
- Return type
torch.Tensor
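The rank-based idea described for RankCDF can be sketched with the standard library alone. The example below is an assumption-laden illustration rather than the library's algorithm: it handles one latent variable, counts the population scores at or below a new score to obtain its rank, and uses a standard normal target. The function name `rank_cdf_transform` and the evenly spaced population are hypothetical.

```python
import bisect
from statistics import NormalDist


def rank_cdf_transform(population, new_scores, dist=NormalDist()):
    """Map scores to `dist` through their rank in a sorted reference population."""
    ordered = sorted(population)
    n = len(ordered)
    out = []
    for score in new_scores:
        # Number of population scores <= score, clamped so the percentile
        # stays strictly inside (0, 1) where the inverse CDF is defined.
        rank = min(max(bisect.bisect_right(ordered, score), 1), n)
        out.append(dist.inv_cdf((rank - 0.5) / n))
    return out


population = [0.1 * i for i in range(-30, 31)]  # 61 evenly spaced theta scores
rescaled = rank_cdf_transform(population, [-10.0, 0.0, 0.04, 10.0])
```

The scores 0.0 and 0.04 fall between the same two population values and therefore receive the same transformed score, and the scores far outside the population range are clamped to the extreme ranks. Both effects illustrate why the documentation describes the transformation as not one-to-one and as reliant on the population covering the full range.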
- class irtorch.rescale.Reverse(reversed_latent_variables: list[bool])
Bases: Scale
Reverses the chosen theta scales.
- Parameters
reversed_latent_variables (list[bool]) – A list of booleans indicating which latent variables to reverse.
Examples
>>> import irtorch
>>> from irtorch.models import NominalResponse, MonotoneNN
>>> from irtorch.estimation_algorithms import AE, MML
>>> from irtorch.rescale import Reverse
>>> irtorch.set_seed(15)
>>> data_sat, correct_responses = irtorch.load_dataset.swedish_sat_verbal()
>>> model = NominalResponse(data=data_sat, mc_correct=correct_responses)
>>> model.fit(train_data=data_sat, algorithm=MML())
>>> model.plot.item_probabilities(1).show()
>>> # Reverse the first (and only) latent variable
>>> reverse = Reverse([True])
>>> model.add_scale_transformation(reverse)
>>> model.plot.item_probabilities(1).show()
- inverse(transformed_theta: Tensor) → Tensor
Puts the scores back to the original theta scale.
- Parameters
transformed_theta (torch.Tensor) – A 2D tensor containing transformed theta scores. Each column represents one latent variable.
- Returns
A 2D tensor containing theta scores on the original scale.
- Return type
torch.Tensor
- jacobian(theta: Tensor) → Tensor
Computes the gradients of scale scores for each \(j\) with respect to the input theta scores.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
- Returns
A torch tensor with the gradients for each theta score. Dimensions are (theta rows, latent variables, latent variables) where the last two are the Jacobians.
- Return type
torch.Tensor
- transform(theta: Tensor) → Tensor
Transforms the input theta scores into the new scale.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.
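Assuming that reversing a scale amounts to negating the flagged latent variables (the documentation does not spell this out), the transformation, its inverse and its Jacobian can be sketched in plain Python as follows. The helper names are hypothetical and lists stand in for torch tensors.

```python
def reverse_transform(theta, reversed_latent_variables):
    """Negate the columns flagged True; leave the others unchanged."""
    signs = [-1.0 if flag else 1.0 for flag in reversed_latent_variables]
    return [[s * t for s, t in zip(signs, row)] for row in theta]


def reverse_jacobian(theta, reversed_latent_variables):
    """Constant diagonal Jacobian (+1 or -1 per latent variable) for every row."""
    signs = [-1.0 if flag else 1.0 for flag in reversed_latent_variables]
    n = len(signs)
    matrix = [[signs[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return [matrix for _ in theta]


theta = [[0.5, -1.0], [2.0, 0.25]]  # rows are respondents, columns latent variables
flags = [True, False]  # reverse only the first latent variable
flipped = reverse_transform(theta, flags)
restored = reverse_transform(flipped, flags)  # negation is its own inverse
jac = reverse_jacobian(theta, flags)
```

Under this assumption the inverse is the same operation as the forward transformation, so applying it twice returns the original scores.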
- class irtorch.rescale.Rotate(model: BaseIRTModel, data: torch.Tensor | None = None, theta: torch.Tensor | None = None, loadings: torch.Tensor | None = None, rotation_method: str = 'promax', rotation_matrix: torch.Tensor | None = None, **kwargs)
Bases: Scale
Rotates the latent variables to improve interpretability. Utilizes the factor_analyzer package for rotations.
If the model has already been rescaled using irtorch.rescale(), the rotation is applied to the rescaled latent variables. If you do not want this, use irtorch.models.BaseIRTModel.detach_rescale() before applying the rotation.
- Parameters
model (BaseIRTModel) – The IRT model whose scales are to be rotated.
data (torch.Tensor, optional) – The population data used to compute the latent variable 'loadings' for each item.
theta (torch.Tensor, optional) – The original scale theta scores used to compute the latent variable 'loadings' for each item.
loadings (torch.Tensor, optional) – A torch tensor with the loadings for each item. If specified, data and theta are ignored and the loadings are used for rotation. (default is None)
rotation_method (str, optional) – The rotation method to use. For available options, see factor_analyzer. (default is 'promax')
rotation_matrix (torch.Tensor, optional) – A torch tensor with the rotation matrix. If specified, data, theta and rotation_method are ignored and the rotation matrix is used directly. (default is None)
**kwargs – Additional keyword arguments used for theta estimation. Refer to irtorch.models.BaseIRTModel.latent_scores() for additional details.
- inverse(transformed_theta: Tensor) → Tensor
Puts the scores back to the original theta scale.
- Parameters
transformed_theta (torch.Tensor) – A 2D tensor containing transformed theta scores. Each column represents one latent variable.
- Returns
A 2D tensor containing theta scores on the original scale.
- Return type
torch.Tensor
- jacobian(theta: Tensor) → Tensor
Computes the gradients of rotated scores for each \(j\) with respect to the original theta scores.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores on the original theta scale. Each column represents one latent variable.
- Returns
A torch tensor with the gradients for each theta score. Dimensions are (theta rows, latent variables, latent variables) where the last two are the Jacobians.
- Return type
torch.Tensor
- transform(theta: Tensor) → Tensor
Transforms the input theta scores into the new scale.
- Parameters
theta (torch.Tensor) – A 2D tensor containing latent variable theta scores. Each column represents one latent variable.