dataeval.utils.metadata.preprocess
==================================

.. py:function:: dataeval.utils.metadata.preprocess(raw_metadata, class_labels, continuous_factor_bins = None, auto_bin_method = 'uniform_width', exclude = None)

   Restructures the metadata into the format expected by the bias functions.

   This identifies whether the incoming metadata is discrete or continuous,
   and whether the data is already binned or still needs binning.
   It accepts a list of dictionaries containing the per-image metadata and
   automatically adjusts for multiple targets in an image.

   :param raw_metadata: Iterable collection of metadata dictionaries to flatten and merge.
   :type raw_metadata: Iterable[Mapping[str, Any]]
   :param class_labels: If array-like, the labels for each image (image classification)
                        or each object (object detection). If the labels are included in
                        the metadata dictionary, pass in the corresponding key instead.
   :type class_labels: ArrayLike or str
   :param continuous_factor_bins: User-provided dictionary specifying how to bin the
                                  continuous metadata factors.
   :type continuous_factor_bins: Mapping[str, int] or Mapping[str, list[tuple[TNum, TNum]]] or None, default None
   :param auto_bin_method: Method by which the function automatically bins continuous
                           metadata factors. It is recommended that the user provide
                           the bins through ``continuous_factor_bins`` instead.
   :type auto_bin_method: "uniform_width" or "uniform_count" or "clusters", default "uniform_width"
   :param exclude: User-provided collection of metadata keys to exclude when processing metadata.
   :type exclude: Iterable[str] or None, default None

   :returns: Output class containing the binned metadata.
   :rtype: Metadata
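
The difference between equal-width and equal-count binning, which the ``"uniform_width"`` and ``"uniform_count"`` option names suggest, can be illustrated with a minimal standard-library sketch. This is not DataEval's implementation: the helper functions, the ``altitude`` factor name, and the sample values are hypothetical. It also assumes that each tuple in a ``list[tuple[TNum, TNum]]`` value is a ``(lower, upper)`` bound pair, which the documentation above does not spell out.

```python
from typing import Sequence


def uniform_width_bins(values: Sequence[float], n_bins: int) -> list[tuple[float, float]]:
    """Split the range [min, max] into n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # Use hi itself as the final edge to avoid floating-point drift.
    edges = [lo + i * width for i in range(n_bins)] + [hi]
    return list(zip(edges[:-1], edges[1:]))


def uniform_count_bins(values: Sequence[float], n_bins: int) -> list[tuple[float, float]]:
    """Choose edges so that each interval holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    # Edge i is the value whose rank is i/n_bins of the way through the sorted data.
    edges = [ordered[min(n - 1, (i * n) // n_bins)] for i in range(n_bins)]
    edges.append(ordered[-1])
    return list(zip(edges[:-1], edges[1:]))


# Hypothetical, heavily skewed continuous factor.
altitude = [10.0, 12.0, 15.0, 20.0, 40.0, 80.0, 160.0, 320.0]

width_bins = uniform_width_bins(altitude, 4)
count_bins = uniform_count_bins(altitude, 4)

# Assumed shape of a user-provided bin mapping: factor name -> list of bounds.
continuous_factor_bins = {"altitude": count_bins}

print(width_bins)
print(count_bins)
```

With skewed data like this, the equal-width edges leave most values in the first interval, while the equal-count edges spread them evenly. A mapping such as ``continuous_factor_bins`` above could then be supplied explicitly rather than relying on automatic binning, subject to the assumption about tuple semantics noted earlier.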