Parity

How-To Guides

See this how-to guide to begin using the Parity class:

Class Label Analysis Tutorial

DataEval API

class dataeval.metrics.Parity

Class for evaluating statistics of observed and expected class labels, including:

  • Chi-squared test for statistical independence between expected and observed labels

evaluate(expected_labels: ndarray, observed_labels: ndarray, num_classes: int | None = None) → Tuple[float64, float64]

Perform a one-way chi-squared test of the null hypothesis that the observed class label frequencies match the expected frequencies.

This function acts as an interface to the scipy.stats.chisquare method, which is documented at https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chisquare.html

Parameters:
  • expected_labels (np.ndarray) – List of class labels in the expected dataset

  • observed_labels (np.ndarray) – List of class labels in the observed dataset

  • num_classes (Optional[int]) – The number of unique classes in the datasets. If this is not specified, it will be inferred from the set of unique labels in expected_labels and observed_labels

Returns:

  • np.float64 – chi-squared value of the test

  • np.float64 – p-value of the test

Raises:

ValueError – If x is empty
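
The frequency comparison described for evaluate can be sketched directly with NumPy and scipy.stats.chisquare. This is a minimal illustration, not DataEval's implementation: the helper name label_parity_sketch is hypothetical, labels are assumed to be non-negative integers, num_classes is inferred here as the largest label plus one, and the expected counts are rescaled to the observed sample size because scipy.stats.chisquare expects both frequency vectors to have the same total.

```python
import numpy as np
from scipy.stats import chisquare


def label_parity_sketch(expected_labels, observed_labels, num_classes=None):
    """Hypothetical helper: chi-squared test on class label frequencies."""
    expected_labels = np.asarray(expected_labels)
    observed_labels = np.asarray(observed_labels)

    if num_classes is None:
        # Assumption: labels are integers 0..K-1, so K is the largest label + 1.
        num_classes = int(max(expected_labels.max(), observed_labels.max())) + 1

    # Count how often each class occurs in each dataset.
    expected_counts = np.bincount(expected_labels, minlength=num_classes)
    observed_counts = np.bincount(observed_labels, minlength=num_classes)

    # Assumption: rescale expected counts so both vectors sum to the
    # observed sample size, as scipy.stats.chisquare requires equal totals.
    scale = observed_counts.sum() / expected_counts.sum()
    scaled_expected = expected_counts * scale

    result = chisquare(f_obs=observed_counts, f_exp=scaled_expected)
    return result.statistic, result.pvalue


# Expected counts per class: 4, 4, 2. Observed counts per class: 6, 2, 2.
expected = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
observed = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
chi2, p_value = label_parity_sketch(expected, observed)
print(chi2, p_value)  # chi2 is 2.0; p_value is about 0.368
```

A large p-value, as in this example, means the test finds no evidence that the observed label distribution differs from the expected one. Note that a class with zero expected count would cause a division by zero in the statistic, so this sketch assumes every class appears at least once in expected_labels.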