dataeval.metrics.bias.balance
dataeval.metrics.bias.balance(metadata, num_neighbors=5)
Mutual information (MI) between factors (class label, metadata, label/image properties).
- Parameters:
- Returns:
(num_factors+1) x (num_factors+1) estimate of mutual information between num_factors metadata factors and class label. Symmetry is enforced.
- Return type:
Note
We use mutual_info_classif from sklearn because the class label is categorical. Its outputs are consistent up to O(1e-4) and depend on a random seed. MI is computed differently for categorical and continuous variables.
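The behaviour described in this note can be reproduced directly with sklearn. The following is a minimal sketch, not DataEval's implementation: the three factors, the sample size, and the helper name estimate are assumptions made up for illustration. It passes a per-column discrete_features mask so that categorical and continuous factors are handled by their respective estimators, and it fixes random_state to show that the seed controls reproducibility.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n = 500
labels = rng.integers(0, 2, size=n)  # categorical class label

# Hypothetical factors, for illustration only (not DataEval's metadata format).
discrete_factor = labels.copy()                         # categorical, fully informative
continuous_factor = labels + 0.5 * rng.normal(size=n)   # continuous, informative
noise_factor = rng.normal(size=n)                       # continuous, uninformative

X = np.column_stack([discrete_factor, continuous_factor, noise_factor]).astype(float)
is_discrete = [True, False, False]  # tells sklearn which columns are categorical


def estimate(seed):
    """MI between each factor and the class label for a given random seed."""
    return mutual_info_classif(
        X,
        labels,
        discrete_features=is_discrete,
        n_neighbors=5,  # mirrors the num_neighbors=5 default in the signature above
        random_state=seed,
    )


mi_seed0 = estimate(0)
mi_seed0_again = estimate(0)
mi_seed1 = estimate(1)

print("seed 0:", mi_seed0)
print("seed 1:", mi_seed1)
```

With the seed fixed, repeated calls return the same values. With different seeds, only the estimates for continuous factors can change, because sklearn adds a small amount of random noise to continuous columns before the nearest-neighbor estimate.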
Example
Return balance (mutual information) of factors with class_labels
>>> bal = balance(metadata)
>>> bal.balance
array([1.   , 0.134, 0.   , 0.   ])
Return intra/interfactor balance (mutual information)
>>> bal.factors
array([[1.   , 0.   , 0.015],
       [0.   , 0.08 , 0.011],
       [0.015, 0.011, 1.063]])
Return classwise balance (mutual information) of factors with individual class_labels
>>> bal.classwise
array([[1.   , 0.134, 0.   , 0.   ],
       [1.   , 0.134, 0.   , 0.   ]])
See also
sklearn.feature_selection.mutual_info_classif, sklearn.feature_selection.mutual_info_regression, sklearn.metrics.mutual_info_score
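To illustrate the symmetric (num_factors+1) x (num_factors+1) matrix described under Returns, the sketch below builds a pairwise MI matrix from the sklearn estimators listed above and then symmetrizes it by averaging with its transpose. This is an assumption about one plausible way to enforce symmetry, not a description of DataEval's internals. The factor names time_of_day and altitude and all data are hypothetical, and the output is not normalized, so it will not match the values in the example above.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

rng = np.random.default_rng(0)
n = 400
class_label = rng.integers(0, 3, size=n)

# Hypothetical metadata factors, for illustration only.
time_of_day = (class_label + rng.integers(0, 2, size=n)) % 3  # categorical
altitude = class_label + rng.normal(size=n)                   # continuous

data = np.column_stack([class_label, time_of_day, altitude]).astype(float)
is_discrete = np.array([True, True, False])
num_vars = data.shape[1]  # num_factors + 1, since the class label is included

mi = np.empty((num_vars, num_vars))
for i in range(num_vars):
    target = data[:, i]
    if is_discrete[i]:
        # Categorical target: classification-style estimator.
        mi[i] = mutual_info_classif(
            data, target.astype(int),
            discrete_features=is_discrete, n_neighbors=5, random_state=0,
        )
    else:
        # Continuous target: regression-style estimator.
        mi[i] = mutual_info_regression(
            data, target,
            discrete_features=is_discrete, n_neighbors=5, random_state=0,
        )

# Row i and column i can differ slightly for mixed categorical/continuous pairs,
# so average the matrix with its transpose to make it symmetric.
mi_symmetric = 0.5 * (mi + mi.T)
print(np.round(mi_symmetric, 3))
```

Row 0 of the resulting matrix holds the MI between the class label and each variable, and the remaining block holds the interfactor values.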