dataeval.utils.metadata.preprocess
dataeval.utils.metadata.preprocess(metadata, class_labels, continuous_factor_bins=None, auto_bin_method='uniform_width', exclude=None, include=None, image_index_key='_image_index')
Restructures the metadata to be in the correct format for the bias functions.
This function identifies whether each incoming metadata factor is discrete or continuous, and whether continuous data is already binned or still needs binning. It accepts the per-image metadata as a flat dictionary of lists or arrays (see the metadata parameter) and automatically adjusts for multiple targets in an image.
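As a rough illustration of the expected input shape, the sketch below builds a flat metadata dictionary and a matching label list, then wraps the call in a helper. The factor names (`altitude`, `time_of_day`), the sample values, and the helper names `check_lengths` and `run_preprocess` are hypothetical; only the `preprocess` signature comes from this page, and the call has not been checked against a specific DataEval release. The import is deferred so the input-building part runs without DataEval installed.

```python
from typing import Any, Mapping, Sequence

# Hypothetical per-image metadata: one entry per image in every list.
metadata = {
    "altitude": [120.5, 98.0, 143.2, 110.7],  # continuous factor
    "time_of_day": ["day", "night", "day", "day"],  # discrete factor
}
class_labels = [0, 1, 1, 0]  # one label per image


def check_lengths(metadata: Mapping[str, Sequence[Any]], labels: Sequence[Any]) -> bool:
    """Return True when every metadata list is as long as the label list."""
    return all(len(values) == len(labels) for values in metadata.values())


def run_preprocess(metadata, labels):
    """Call preprocess with explicit bins (requires DataEval to be installed)."""
    from dataeval.utils.metadata import preprocess  # deferred import

    return preprocess(
        metadata,
        labels,
        continuous_factor_bins={"altitude": 3},  # three bins for "altitude"
    )
```

Checking the lengths up front mirrors the requirement that each metadata list or array match the length of the labels.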
- Parameters:
- metadata : Mapping[str, list[Any] or NDArray[Any]]
A flat dictionary containing all of the metadata on a per-image (classification) or per-object (object detection) basis. The length of each list or array should match the length of the label list or array.
- class_labels : ArrayLike or str
If array-like, the labels for each image (image classification) or each object (object detection). If the labels are included in the metadata dictionary, pass the key under which they are stored.
- continuous_factor_bins : Mapping[str, int or Iterable[float]] or None, default None
User-provided dictionary specifying how to bin the continuous metadata factors. Each value is either an int giving the number of bins or a list of floats giving the bin edges.
- auto_bin_method : "uniform_width" or "uniform_count" or "clusters", default "uniform_width"
Method used to bin continuous metadata factors automatically. It is recommended that the user provide the bins explicitly through continuous_factor_bins.
- exclude : Iterable[str] or None, default None
User-provided collection of metadata keys to exclude when processing metadata. Not to be used together with include.
- include : Iterable[str] or None, default None
User-provided collection of metadata keys to include when processing metadata. Not to be used together with exclude.
- image_index_key : str, default "_image_index"
User-provided metadata key that maps each metadata entry to its source image.
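To make the two forms of a `continuous_factor_bins` value concrete, the following sketch bins one continuous factor in plain Python, once from an integer bin count (equal-width edges, in the spirit of `"uniform_width"`) and once from explicit edges. It only illustrates the idea and is not DataEval's implementation; the helper names `uniform_width_edges` and `assign_bins`, the half-open bin convention, and the sample values are all assumptions.

```python
from bisect import bisect_right


def uniform_width_edges(values, n_bins):
    """Return n_bins + 1 equally spaced edges spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins + 1)]


def assign_bins(values, edges):
    """Map each value to a bin index in 0..len(edges) - 2.

    Bins are half-open [edge_i, edge_i+1); values outside the edges are
    clamped into the first or last bin (an assumption of this sketch).
    """
    last = len(edges) - 2
    return [min(max(bisect_right(edges, v) - 1, 0), last) for v in values]


altitude = [0.0, 2.5, 5.0, 7.5, 10.0]  # hypothetical continuous factor

# An int means "number of bins": edges are derived from the data range.
edges_from_count = uniform_width_edges(altitude, 2)
bins_from_count = assign_bins(altitude, edges_from_count)

# A list of floats means "use these edges as given".
bins_from_edges = assign_bins(altitude, [0.0, 3.0, 6.0, 10.0])
```

Explicit edges keep the bin boundaries stable across datasets, whereas edges derived from a bin count shift with the observed minimum and maximum.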
- Returns:
Output class containing the binned metadata.
- Return type:
See also