dataeval.data.Relabel

class dataeval.data.Relabel(class_remap, target=None, *, on_unmatched='drop')

Conform a dataset’s class labels to a target vocabulary via a class mapping.

Rewrites each datum’s integer labels from the source vocabulary into the target vocabulary using class_remap (source class name to target concept), and replaces the dataset’s index2label with the target’s. The class_remap is typically the remap of a label_alignment() result, but may be any hand-written mapping. Equivalent classes are renamed and coarsenings collapse, so two source classes may map to one target class. Source classes with no entry in class_remap (or whose target is absent from target) are out-of-vocabulary (OOV); by default they are dropped.
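The index translation described above can be sketched in plain Python. This illustrates the documented behavior under stated assumptions and is not the library's implementation: build_index_mapping and the sample vocabularies are hypothetical, and the target is assumed to be an ordered sequence of class names.

```python
def build_index_mapping(index2label, class_remap, target_labels):
    """Hypothetical helper: derive source-index -> target-index and the dropped classes."""
    # Position in the target sequence is the target integer index.
    target_index = {label: i for i, label in enumerate(target_labels)}
    mapping, dropped = {}, {}
    for src_idx, src_name in index2label.items():
        tgt_name = class_remap.get(src_name)  # None when the class has no entry
        if tgt_name in target_index:
            mapping[src_idx] = target_index[tgt_name]
        else:
            # No entry in class_remap, or its target is absent from the target vocabulary.
            dropped[src_idx] = src_name
    return mapping, dropped


index2label = {0: "sedan", 1: "pickup", 2: "person", 3: "kite"}
class_remap = {"sedan": "car", "pickup": "car", "person": "pedestrian"}
target_labels = ["car", "pedestrian"]

mapping, dropped = build_index_mapping(index2label, class_remap, target_labels)
print(mapping)  # sedan and pickup collapse onto the same target index
print(dropped)  # kite has no entry in class_remap
```

In this sketch the two return values play the roles of the mapping and dropped properties documented below.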

Parameters:
class_remap : Mapping[str, str]

Maps a source class name to its target concept. Target values are concept ids when target is an Ontology (ids equal labels for hand-built ontologies), otherwise target labels.

target : Ontology or Mapping[int, str] or Sequence[str], optional

The target vocabulary and its integer indexing: an Ontology (concepts indexed in order), an index -> label mapping, or an ordered sequence of class names. A plain mapping or sequence needs no ontology, so relabeling can be done entirely by hand. If omitted, the vocabulary is derived from the distinct class_remap values in first-seen order, which is handy for one-off maps. To merge several datasets, pass the same explicit target to each so they share an indexing.
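A minimal sketch of why a shared explicit target matters when merging, assuming "first-seen order" means the iteration order of the class_remap values. The two remaps are hypothetical, and dict.fromkeys is used here only to mimic the documented derivation.

```python
# Two datasets with different source vocabularies and remap orderings.
remap_a = {"sedan": "car", "person": "pedestrian", "pickup": "car"}
remap_b = {"human": "pedestrian", "auto": "car"}

# Distinct values in first-seen order (dicts preserve insertion order).
derived_a = list(dict.fromkeys(remap_a.values()))
derived_b = list(dict.fromkeys(remap_b.values()))

print(dict(enumerate(derived_a)))  # index -> label derived for dataset A
print(dict(enumerate(derived_b)))  # index -> label derived for dataset B

# The derived indexings disagree, so integer labels would not be comparable
# across the two datasets. One explicit target fixes the indexing for both.
shared_target = ["car", "pedestrian"]
```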

on_unmatched : {"drop", "raise"}, default "drop"

What to do with out-of-vocabulary (OOV) source classes. "drop" removes them: an image-classification datum whose class is OOV is dropped; in object detection, each OOV detection is dropped, and an image left with no detections is dropped as well. "raise" raises if any source class is OOV.
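The "drop" behavior for object detection can be sketched as follows. This is an illustration only: relabel_detections is a hypothetical helper, and representing a datum as parallel lists of labels and boxes is an assumption, not the library's data model.

```python
def relabel_detections(labels, boxes, mapping):
    """Hypothetical helper: remap one image's detections, or return None if none survive."""
    # Keep only detections whose source label has a target index.
    kept = [(mapping[lab], box) for lab, box in zip(labels, boxes) if lab in mapping]
    if not kept:
        return None  # an image left with no detections is dropped
    new_labels, new_boxes = zip(*kept)
    return list(new_labels), list(new_boxes)


mapping = {0: 0, 1: 0, 2: 1}  # source index 3 is out-of-vocabulary

# One OOV detection among three: the image survives with two detections.
mixed = relabel_detections(
    [1, 3, 2],
    [(0, 0, 10, 10), (5, 5, 20, 20), (2, 2, 8, 8)],
    mapping,
)

# Only OOV detections: the whole image is dropped.
only_oov = relabel_detections([3], [(1, 1, 4, 4)], mapping)
```

For image classification the same rule reduces to a single membership check: the datum is kept only if its label is a key of the mapping.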

Raises:

OntologyError – If the dataset metadata provides no index2label, or if on_unmatched="raise" and any source class is out-of-vocabulary.

conform_datum(datum)

Return the transformed datum (default: unchanged).

conform_metadata(metadata)

Return possibly-updated dataset-level metadata (default: unchanged).

keeps(datum)

Return whether datum survives this conformer (default: always).

property dropped : collections.abc.Mapping[int, str]

Source classes dropped as out-of-vocabulary (source index to name).

property mapping : collections.abc.Mapping[int, int]

Source label index to target label index (computed during conform).