dataeval.core.label_alignment

dataeval.core.label_alignment(source, target, *, matchers=(), threshold=0.0)

Align a source label vocabulary against a target ontology.

Establishes typed Correspondence objects from the source’s concepts to the target’s in three passes. First, exact terminological anchoring: a unique label, synonym, or id match becomes an equivalent correspondence. Second, any additional matchers are consulted for the concepts left unanchored. Third, structural propagation coarsens each still-unanchored source concept up to its nearest ancestor that has an equivalent anchor, producing a narrower correspondence. Where an equivalent source concept spans several finer target concepts, broader correspondences are added as diagnostics. The result reports the safe label rewrite (class_remap), the open-world unaligned concepts on each side, and a mergeability summary.
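The exact-anchoring and structural-propagation passes can be illustrated with a minimal, self-contained sketch. It does not call dataeval and is not the library's implementation: the dictionaries, the align helper, and the tuple-based output are hypothetical stand-ins chosen for illustration, and the matcher pass is omitted.

```python
from typing import Optional

# Toy inputs (hypothetical, not dataeval objects).
# Source hierarchy as child -> parent; None marks a root.
source_parent: dict[str, Optional[str]] = {
    "vehicle": None,
    "car": "vehicle",
    "sedan": "car",
    "hatchback": "car",
    "bicycle": None,
}
# Target vocabulary as a flat set of labels.
target_labels = {"vehicle", "car", "truck"}


def align(source_parent, target_labels):
    """Return (source, relation, target) triples and unaligned concepts."""
    # Pass 1: exact anchoring. A source label that appears verbatim in the
    # target becomes an "equivalent" correspondence.
    anchors = {s: s for s in source_parent if s in target_labels}
    triples = [(s, "equivalent", t) for s, t in anchors.items()]

    # Pass 2 (additional matchers) is omitted in this sketch.

    # Pass 3: structural propagation. Walk up from each unanchored source
    # concept to its nearest anchored ancestor and record a "narrower" link.
    unaligned_source = []
    for s in source_parent:
        if s in anchors:
            continue
        ancestor = source_parent[s]
        while ancestor is not None and ancestor not in anchors:
            ancestor = source_parent[ancestor]
        if ancestor is None:
            # No anchored ancestor: the concept stays unaligned.
            unaligned_source.append(s)
        else:
            triples.append((s, "narrower", anchors[ancestor]))

    unaligned_target = sorted(target_labels - set(anchors.values()))
    return triples, unaligned_source, unaligned_target


triples, unaligned_source, unaligned_target = align(source_parent, target_labels)
```

In this toy run, sedan and hatchback are coarsened to car, bicycle has no anchored ancestor and stays unaligned on the source side, and truck stays unaligned on the target side.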

Parameters:
source : Ontology or Iterable[str]

The vocabulary to map from. A bare iterable of class names is treated as a structureless ontology (via Ontology.from_hierarchy()). In that case, label_alignment reduces to label reconciliation plus structural inference against the target.

target : Ontology

The reference vocabulary to map to.

matchers : Iterable[Matcher], optional

Additional element-level matchers (e.g. a fuzzy string-similarity matcher) consulted for source concepts the exact pass did not anchor. Each must implement the Matcher protocol. Exact anchoring is always performed first.

threshold : float, optional

Minimum confidence for accepting a matcher’s proposal, in [0, 1]. Defaults to 0.0 (accept any proposal a matcher emits).
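To make the interaction between matchers and threshold concrete, here is a small sketch of a fuzzy string-similarity proposal being filtered by a minimum confidence. It uses only the standard library's difflib. The propose and accept functions are hypothetical and do not reproduce the Matcher protocol, whose exact signature is not shown on this page. Treating the threshold as inclusive (score >= threshold) is also an assumption.

```python
from difflib import SequenceMatcher


def propose(source_label, target_labels):
    """Hypothetical matcher: best target label and a similarity in [0, 1]."""
    scored = [
        (SequenceMatcher(None, source_label.lower(), t.lower()).ratio(), t)
        for t in target_labels
    ]
    score, best = max(scored)
    return best, score


def accept(proposal, threshold=0.0):
    """Keep a proposal only if its confidence reaches the threshold.

    Assumption: the comparison is inclusive (>=).
    """
    target, score = proposal
    return target if score >= threshold else None


targets = ["automobile", "truck", "bicycle"]
proposal = propose("trucks", targets)

# With the default threshold of 0.0, whatever the matcher emits is accepted.
accepted_default = accept(proposal)
# A stricter threshold rejects proposals whose score falls below it.
accepted_strict = accept(proposal, threshold=0.95)
```

Raising the threshold trades recall for precision: fewer source concepts are anchored by matchers, and more fall through to structural propagation or remain unaligned.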

Returns:

Correspondences, unaligned concepts on each side, the carry-over class_remap, and the mergeability of the source into the target.

Return type:

LabelAlignmentResult

See also

dataeval.core.label_reconciliation

The restriction of alignment to a single structureless source and exact equivalence.

Notes

Acceptance favors precision over recall. A source concept with neither a unique exact match nor an unambiguous, above-threshold matcher proposal is left unanchored (and is coarsened only if it has an anchored ancestor) rather than being assigned a likely-wrong correspondence.
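The uniqueness requirement for exact matches can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the target_names mapping, the case- and whitespace-insensitive normalization, and the exact_anchor helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical target concepts: id -> label followed by synonyms.
target_names = {
    "T1": ["crane (bird)", "crane"],
    "T2": ["crane (machine)", "crane"],
    "T3": ["heron"],
}

# Index every normalized name to the set of target ids that use it.
index = defaultdict(set)
for concept_id, names in target_names.items():
    for name in names:
        index[name.strip().lower()].add(concept_id)


def exact_anchor(source_label):
    """Return the target id only when exactly one target concept matches."""
    hits = index.get(source_label.strip().lower(), set())
    return next(iter(hits)) if len(hits) == 1 else None


unique = exact_anchor("Heron")     # one candidate: anchored
ambiguous = exact_anchor("crane")  # two candidates: left unanchored
missing = exact_anchor("stork")    # no candidate: left unanchored
```

Here the label crane matches two target concepts, so the sketch declines to pick one and leaves it unanchored, mirroring the precision-first behavior described in the note above.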