dataeval.protocols.Matcher
- class dataeval.protocols.Matcher
Protocol for an element-level matcher used in ontology alignment.
A matcher proposes candidate Correspondence objects between a source and a target vocabulary, each supplied as an iterable of OntologyConcept (an Ontology satisfies this directly). It is the extension seam of dataeval.core.label_alignment(): exact terminological anchoring is built into label_alignment itself, while additional matchers (string-similarity, and later embedding / instance-based matchers) are supplied via its matchers argument and consulted for source concepts the exact pass left unanchored.

A matcher needs only each concept's id, label, and synonyms (parents is available for light structural hints), so it can be written and tested against a plain list of concepts.

Implementations should be permissive: propose any plausible correspondence with a calibrated confidence, and let label_alignment apply the acceptance threshold and pick the best proposal per source concept. A matcher need not deduplicate or resolve conflicts itself.

Example
A trivial matcher proposing an equivalence for an exact id match:
>>> from dataeval.types import Correspondence
>>> from dataeval.protocols import Matcher
>>>
>>> class IdMatcher:
...     def __call__(self, source, target):
...         target_ids = {c.id for c in target}
...         return [
...             Correspondence(source=c.id, target=c.id, relation="equivalent", matcher="id")
...             for c in source
...             if c.id in target_ids
...         ]
>>>
>>> isinstance(IdMatcher(), Matcher)
True
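
The permissive-proposal pattern described above can also be sketched for a string-similarity matcher. The block below is a minimal illustration, not the library's implementation. Concept and Proposal are hypothetical stand-ins for OntologyConcept and Correspondence, so the sketch runs without dataeval installed, and their field names are assumptions. best_per_source is likewise a hypothetical helper that only mimics the threshold-and-pick-best step the text attributes to label_alignment. Similarity is computed with difflib.SequenceMatcher from the standard library.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Dict, Iterable, List, Sequence


@dataclass(frozen=True)
class Concept:
    """Hypothetical stand-in for OntologyConcept."""
    id: str
    label: str
    synonyms: Sequence[str] = ()
    parents: Sequence[str] = ()


@dataclass(frozen=True)
class Proposal:
    """Hypothetical stand-in for Correspondence; field names are assumptions."""
    source: str
    target: str
    relation: str
    confidence: float
    matcher: str


def _names(concept: Concept) -> List[str]:
    # Compare on the label plus every synonym, case-insensitively.
    return [concept.label.lower(), *(s.lower() for s in concept.synonyms)]


class StringSimilarityMatcher:
    """Proposes every pair whose best name similarity reaches a low floor."""

    def __init__(self, floor: float = 0.5) -> None:
        self.floor = floor

    def __call__(self, source: Iterable[Concept], target: Iterable[Concept]) -> List[Proposal]:
        targets = list(target)  # target may be a one-shot iterable
        proposals = []
        for s in source:
            for t in targets:
                # Best similarity ratio over all label/synonym pairs.
                score = max(
                    SequenceMatcher(None, a, b).ratio()
                    for a in _names(s)
                    for b in _names(t)
                )
                if score >= self.floor:
                    # No deduplication or conflict resolution here.
                    proposals.append(Proposal(s.id, t.id, "equivalent", score, "string"))
        return proposals


def best_per_source(proposals: Iterable[Proposal], threshold: float) -> Dict[str, Proposal]:
    """Hypothetical caller-side step: apply a threshold, keep the best per source."""
    best: Dict[str, Proposal] = {}
    for p in proposals:
        if p.confidence < threshold:
            continue
        if p.source not in best or p.confidence > best[p.source].confidence:
            best[p.source] = p
    return best


source = [Concept("s1", "automobile", synonyms=("car",)), Concept("s2", "bicycle")]
target = [Concept("t1", "car"), Concept("t2", "bike"), Concept("t3", "truck")]

proposals = StringSimilarityMatcher()(source, target)
accepted = best_per_source(proposals, threshold=0.8)
```

The matcher keeps its own floor deliberately low and returns every proposal above it. The stricter cut-off is applied afterwards by the caller, which matches the division of labour described above: the matcher proposes, and the alignment step decides.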