dataeval.core.label_alignment

dataeval.core.label_alignment(source, target, *, matchers=(), threshold=0.0)

Align a source label vocabulary against a target ontology.

Establishes typed Correspondence objects from the source’s concepts to the target’s in three passes. First, exact terminological anchoring marks each unique label, synonym, or id match as equivalent. Second, any additional matchers are consulted for the concepts the exact pass left unanchored. Third, structural propagation coarsens each still-unanchored source concept up to its nearest equivalent-anchored ancestor and records that correspondence as narrower. Where an equivalent source concept spans several finer target concepts, broader correspondences are added as diagnostics. The result reports the safe label rewrite (class_remap), the open-world unaligned concepts on each side, and a mergeability summary.
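The first and third passes can be illustrated with a small, self-contained sketch. This is not the library's implementation: toy_align, the parent-map and label-set inputs, and the tuple-based correspondences are all hypothetical stand-ins, and the sketch assumes that "ancestor" refers to the source hierarchy. The matcher pass and the broader diagnostics are omitted.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical stand-in for a typed correspondence: (relation, target id).
Relation = Tuple[str, str]


def toy_align(
    source_parent: Dict[str, Optional[str]],
    target_labels: Dict[str, Set[str]],
) -> Tuple[Dict[str, Relation], Set[str], Set[str]]:
    """Toy two-pass alignment (exact anchoring, then structural propagation).

    source_parent maps each source concept to its parent (None for a root).
    target_labels maps each target id to its labels and synonyms.
    The source hierarchy is assumed to be acyclic.
    """
    correspondences: Dict[str, Relation] = {}

    # Exact anchoring: only a unique case-insensitive match counts.
    for concept in source_parent:
        hits = [
            target_id
            for target_id, labels in target_labels.items()
            if concept.lower() in {label.lower() for label in labels}
        ]
        if len(hits) == 1:
            correspondences[concept] = ("equivalent", hits[0])

    # Structural propagation: climb to the nearest ancestor that holds an
    # equivalent anchor and record the concept as narrower than its target.
    for concept in source_parent:
        if concept in correspondences:
            continue
        ancestor = source_parent[concept]
        while ancestor is not None:
            relation = correspondences.get(ancestor)
            if relation is not None and relation[0] == "equivalent":
                correspondences[concept] = ("narrower", relation[1])
                break
            ancestor = source_parent.get(ancestor)

    unaligned_source = set(source_parent) - set(correspondences)
    matched_targets = {target_id for _, target_id in correspondences.values()}
    unaligned_target = set(target_labels) - matched_targets
    return correspondences, unaligned_source, unaligned_target


source_parent = {"animal": None, "dog": "animal", "husky": "dog", "rock": None}
target_labels = {"t:dog": {"dog", "canine"}, "t:cat": {"cat"}}
correspondences, unaligned_source, unaligned_target = toy_align(
    source_parent, target_labels
)
```

In this toy run, "dog" is anchored exactly, "husky" is coarsened to the same target as a narrower concept, and "animal" and "rock" remain unaligned on the source side because no ancestor of theirs is anchored.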

Parameters:

source : Ontology or Iterable[str]
    The vocabulary to map from. A bare iterable of class names is treated as a structureless ontology (via Ontology.from_hierarchy()); in that case label_alignment reduces to label reconciliation plus structural inference against the target.
target : Ontology
    The reference vocabulary to map to.
matchers : Iterable[Matcher], optional
    Additional element-level matchers (e.g. a fuzzy string-similarity matcher) consulted for source concepts the exact pass did not anchor. Each must implement the Matcher protocol. Exact anchoring is always performed first.
threshold : float, optional
    Minimum confidence for accepting a matcher’s proposal, in the range [0, 1]. Defaults to 0.0 (accept any proposal a matcher emits).
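As a rough illustration of how a fuzzy matcher and the threshold might interact, the sketch below scores candidates with the standard-library difflib.SequenceMatcher. The names fuzzy_propose and accept are hypothetical and do not implement the Matcher protocol, whose interface is not described here; the inclusive comparison against the threshold is likewise an assumption.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional, Tuple

# Hypothetical proposal shape: (target name, confidence in [0, 1]).
Proposal = Tuple[str, float]


def fuzzy_propose(name: str, candidates: Iterable[str]) -> Optional[Proposal]:
    """Return the best-scoring candidate, or None if there are no candidates."""
    best: Optional[Proposal] = None
    for candidate in candidates:
        score = SequenceMatcher(None, name.lower(), candidate.lower()).ratio()
        if best is None or score > best[1]:
            best = (candidate, score)
    return best


def accept(proposal: Optional[Proposal], threshold: float = 0.0) -> bool:
    """Keep a proposal only if its confidence reaches the threshold.

    Assumption: the comparison is inclusive, so threshold=0.0 accepts
    every proposal that a matcher emits.
    """
    return proposal is not None and proposal[1] >= threshold


targets = ["automobiles", "cat"]
close = fuzzy_propose("automobile", targets)  # near-identical spelling
far = fuzzy_propose("zebra", targets)         # no good candidate
```

With the default threshold of 0.0 both proposals would be accepted, whereas a stricter value such as 0.9 keeps only the near-identical match.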

Returns:

Correspondences, unaligned concepts on each side, the carry-over class_remap, and the mergeability of the source into the target.

Return type:

LabelAlignmentResult