dataeval.metrics.estimators.null_model_metrics

dataeval.metrics.estimators.null_model_metrics(test_labels, train_labels=None)

Calculate null model (dummy classifier) metrics for the given class distributions.

This function calculates benchmark performance metrics for random classifiers, based on the class distributions of the training and test labels.

Null models to be evaluated:

  • Uniform Random: Classifier assigns equal probability to each class

  • Dominant Class: Classifier always chooses the most frequent class in the training set (requires training labels)

  • Proportional Random: Classifier assigns each class a probability equal to its frequency in the training set (requires training labels)

The calculated metrics serve as a lower-bound performance baseline for model evaluation.
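
To make the three null models concrete, the sketch below computes the expected accuracy of each one from plain label lists. It is an illustration under stated assumptions, not the dataeval implementation: the helper name null_model_accuracy and the result keys are hypothetical, only accuracy is computed, and the set of classes is assumed to be the union of the labels seen in both inputs.

```python
from collections import Counter


def null_model_accuracy(test_labels, train_labels=None):
    """Expected accuracy of each null model (hypothetical helper, not dataeval's code)."""
    if test_labels is None or len(test_labels) == 0:
        raise ValueError("test_labels must be non-empty")

    n_test = len(test_labels)
    test_counts = Counter(test_labels)

    # Assumption: the class set is every label seen in either input.
    classes = set(test_counts)
    if train_labels is not None:
        classes |= set(train_labels)

    # Uniform Random: each class is predicted with probability 1/K,
    # so the expected accuracy is 1/K regardless of the test distribution.
    results = {"uniform_random": 1.0 / len(classes)}

    if train_labels is not None and len(train_labels) > 0:
        n_train = len(train_labels)
        train_counts = Counter(train_labels)

        # Dominant Class: always predict the most frequent training class;
        # accuracy is the share of test samples that carry that label.
        dominant = train_counts.most_common(1)[0][0]
        results["dominant_class"] = test_counts[dominant] / n_test

        # Proportional Random: predict class c with its training frequency;
        # expected accuracy is sum over c of p_train(c) * p_test(c).
        results["proportional_random"] = sum(
            (train_counts[c] / n_train) * (test_counts[c] / n_test)
            for c in classes
        )

    return results


baselines = null_model_accuracy([0, 1, 1, 2, 3], train_labels=[0, 1, 1, 1, 2, 3])
```

A trained model whose accuracy does not clearly exceed these values has learned little beyond the class frequencies. When train_labels is omitted, the sketch reports only the uniform random baseline, mirroring the behavior described for the train_labels parameter.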

Parameters:
test_labels : ArrayLike

Class labels from the test set, from which the class distribution is derived. Each element is the integer representation of the associated class label, e.g. [0, 1, 1, 2, 3].

train_labels : ArrayLike | None, default None

Class labels from the training set, from which the class distribution is derived. Each element is the integer representation of the associated class label, e.g. [0, 1, 1, 2, 3]. When None, class frequencies are not calculated and no metrics are reported for the dominant class and proportional random models.

Raises:

ValueError – If test_labels is None or empty

Returns:

Output class that maps each null model to its calculated metrics

Return type:

NullModelMetricsOutput