dataeval.metadata.most_deviated_factors
- dataeval.metadata.most_deviated_factors(metadata_1, metadata_2, ood)
Determines the metadata factor with the greatest deviation for each out-of-distribution (OOD) sample in metadata_2.
- Parameters:
- metadata_1 : Metadata
A reference set of Metadata containing factor names and samples with discrete and/or continuous values per factor.
- metadata_2 : Metadata
The set of Metadata that is tested against the reference metadata. This set must have the same number of features as metadata_1 but does not require the same number of samples.
- ood : OODOutput
The output of one of DataEval's OOD functions, indicating which examples are OOD.
- Returns:
A list of (factor name, deviation) tuples, one per OOD example in metadata_2, giving the factor with the highest deviation and the size of that deviation.
- Return type:
list[tuple[str, float]]
Notes
- Both Metadata inputs must have discrete and continuous data in the shape (samples, factors) and have equivalent factor names and lengths.
- The flag at index i in OODOutput.is_ood must correspond directly to sample i of metadata_2 being out-of-distribution from metadata_1.
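
The input and output contract described above can be illustrated with a small, self-contained NumPy sketch. This is not DataEval's implementation: the helper name `most_deviated_sketch`, the plain-array inputs standing in for Metadata and OODOutput.is_ood, and the choice to measure deviation as distance from the reference median scaled by the median absolute deviation are all assumptions made for illustration.

```python
import numpy as np


def most_deviated_sketch(factor_names, reference, test, is_ood):
    """Hypothetical stand-in for the documented behaviour (not DataEval code).

    reference, test: arrays of shape (samples, factors) with the same factors.
    is_ood: boolean flags, where is_ood[i] refers to row i of `test`.
    Returns one (factor name, deviation) tuple per OOD row of `test`.
    """
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    is_ood = np.asarray(is_ood, dtype=bool)

    # Both sets must describe the same factors; sample counts may differ.
    if reference.shape[1] != test.shape[1] or len(factor_names) != test.shape[1]:
        raise ValueError("reference and test must share the same factors")
    # One OOD flag is required per test sample.
    if is_ood.shape[0] != test.shape[0]:
        raise ValueError("is_ood must have one flag per test sample")

    # Assumed deviation measure: distance from the reference median,
    # scaled by the reference median absolute deviation (MAD) per factor.
    median = np.median(reference, axis=0)
    mad = np.median(np.abs(reference - median), axis=0)
    mad = np.where(mad == 0, 1.0, mad)  # avoid division by zero

    deviations = np.abs(test[is_ood] - median) / mad
    top = np.argmax(deviations, axis=1)  # most deviated factor per OOD row
    return [(factor_names[j], float(deviations[i, j])) for i, j in enumerate(top)]


# Illustrative data: two factors, five reference samples, three test samples.
names = ["time", "altitude"]
reference = [[10, 1.0], [12, 2.0], [14, 3.0], [16, 4.0], [18, 5.0]]
test = [[14, 3.5], [30, 3.0], [15, 9.0]]
is_ood = [False, True, True]  # flags aligned with the rows of `test`

result = most_deviated_sketch(names, reference, test, is_ood)
print(result)  # [('time', 8.0), ('altitude', 6.0)]
```

The result has one entry per flagged sample, in the order those samples appear in the test set, which mirrors the documented return type of list[tuple[str, float]]. The deviation values produced by the library itself may differ, since its scaling method is not stated here.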