dataeval.metrics.bias.label_parity
dataeval.metrics.bias.label_parity(expected_labels, observed_labels, num_classes=None)
Calculate the chi-square statistic to assess the parity between expected and observed label distributions.
This function computes the frequency distribution of classes in both expected and observed labels, normalizes the expected distribution to match the total number of observed labels, and then calculates the chi-square statistic to determine if there is a significant difference between the two distributions.
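The steps described above can be sketched roughly as follows. This is an illustrative re-creation, not dataeval's actual implementation: the helper name `label_parity_sketch`, the use of `np.bincount`, and inferring the class count as `max label + 1` are all assumptions made for the example, and it assumes non-negative integer labels.

```python
import numpy as np
from scipy.stats import chisquare


def label_parity_sketch(expected_labels, observed_labels, num_classes=None):
    """Hypothetical sketch of the described steps; not dataeval's code."""
    expected = np.asarray(expected_labels)
    observed = np.asarray(observed_labels)

    if expected.size == 0:
        raise ValueError("Expected label distribution is empty.")

    # Assumption: labels are non-negative integers, so the class count can be
    # taken as the largest label plus one when num_classes is not supplied.
    if num_classes is None:
        num_classes = int(max(expected.max(), observed.max())) + 1

    # Per-class frequency counts for both label sets.
    expected_counts = np.bincount(expected, minlength=num_classes).astype(float)
    observed_counts = np.bincount(observed, minlength=num_classes).astype(float)

    # Rescale the expected counts so both distributions share the same total.
    expected_counts *= observed_counts.sum() / expected_counts.sum()

    # One-way chi-square test of observed against expected frequencies.
    return chisquare(f_obs=observed_counts, f_exp=expected_counts)


statistic, p_value = label_parity_sketch(
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 2],
)
```

With these toy inputs the expected counts `[2, 2, 2]` are rescaled to `[4, 4, 4]` to match the 12 observed labels, and the observed counts `[3, 2, 7]` give a statistic of `(1 + 4 + 9) / 4 = 3.5`.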
- Parameters:
- expected_labels : ArrayLike
List of class labels in the expected dataset.
- observed_labels : ArrayLike
List of class labels in the observed dataset.
- num_classes : int or None, default None
The number of unique classes in the datasets. If not provided, the function infers it from the set of unique labels in expected_labels and observed_labels.
- Returns:
Chi-squared score and p-value of the test.
- Return type:
LabelParityOutput
- Raises:
ValueError – If expected label distribution is empty, is all zeros, or if there is a mismatch in the number of unique classes between the observed and expected distributions.
Note
Providing num_classes can be helpful if there are classes with zero instances in one of the distributions.
The function first validates the observed distribution and normalizes the expected distribution so that it has the same total number of labels as the observed distribution.
It then performs a one-way chi-square (goodness-of-fit) test to determine whether there is a statistically significant difference between the observed and expected label distributions.
This function acts as an interface to the scipy.stats.chisquare method, which is documented at https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chisquare.html
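To illustrate why a fixed class count matters when one distribution has a class with zero instances, here is a small sketch using `np.bincount` as a stand-in for the frequency-counting step. Whether dataeval counts labels this way internally is an assumption; the label arrays are made up for the example.

```python
import numpy as np

expected_labels = np.array([0, 1, 2, 3, 3])
observed_labels = np.array([0, 1, 1, 2])  # class 3 never appears here

# Counting each label set on its own yields histograms of different lengths,
# which cannot be compared class by class.
expected_counts = np.bincount(expected_labels)  # 4 bins
observed_counts = np.bincount(observed_labels)  # 3 bins

# Fixing the number of classes pads the missing class with an explicit zero,
# so both histograms line up.
num_classes = 4
observed_padded = np.bincount(observed_labels, minlength=num_classes)
```

Here `observed_padded` has one bin per class, with a zero count for class 3, and can be compared directly against `expected_counts`.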
Examples
Randomly creating some label distributions using np.random.default_rng
>>> import numpy as np
>>> from dataeval.metrics.bias import label_parity
>>> rng = np.random.default_rng(175)
>>> expected_labels = rng.choice([0, 1, 2, 3, 4], (100))
>>> observed_labels = rng.choice([2, 3, 0, 4, 1], (100))
>>> label_parity(expected_labels, observed_labels)
LabelParityOutput(score=14.007374204742625, p_value=0.0072715574616218)