dataeval.protocols.Matcher

class dataeval.protocols.Matcher

Protocol for an element-level matcher used in ontology alignment.

A matcher proposes candidate Correspondence objects between a source and a target vocabulary, each supplied as an iterable of OntologyConcept (an Ontology satisfies this directly). It is the extension point of dataeval.core.label_alignment(). Exact terminological anchoring is built into label_alignment itself. Additional matchers (string-similarity, and later embedding- or instance-based matchers) are supplied via its matchers argument and are consulted for source concepts that the exact pass left unanchored.

A matcher needs only each concept’s id, label, and synonyms (parents is available for light structural hints) so it can be written and tested against a plain list of concepts.
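As a rough illustration of that point, the sketch below exercises a matcher against a plain list. FakeConcept is a hypothetical stand-in, not a dataeval type; it assumes only the four attributes named above (id, label, synonyms, parents). The matcher returns plain (source id, target id) tuples rather than Correspondence objects, purely so that the snippet runs without dataeval installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeConcept:
    """Hypothetical stand-in exposing only the attributes a matcher reads."""

    id: str
    label: str
    synonyms: tuple[str, ...] = ()
    parents: tuple[str, ...] = ()


def names(concept):
    """Return the lower-cased label and synonyms of a concept as a set."""
    return {concept.label.lower(), *(s.lower() for s in concept.synonyms)}


def synonym_matcher(source, target):
    """Propose a pair whenever two concepts share a label or synonym."""
    target = list(target)  # materialise: the target is scanned once per source concept
    return [(s.id, t.id) for s in source for t in target if names(s) & names(t)]


source = [FakeConcept("s1", "Car", synonyms=("automobile",)), FakeConcept("s2", "Boat")]
target = [FakeConcept("t1", "Automobile"), FakeConcept("t2", "Truck")]

pairs = synonym_matcher(source, target)
print(pairs)  # [('s1', 't1')]
```

A real implementation would return Correspondence objects instead of tuples; the test data can stay a plain list as here.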

Implementations should be permissive: propose any plausible correspondence with a calibrated confidence, and let label_alignment apply the acceptance threshold and pick the best proposal per source concept. A matcher need not deduplicate or resolve conflicts itself.
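The division of labour described above might look roughly like the following. Everything here is an assumption made for illustration: Proposal is a hypothetical stand-in for Correspondence, select_best only imitates the thresholding and best-per-source selection that label_alignment is said to perform (its actual tie-breaking and threshold rules are not documented here), and the raw difflib ratio is used as a score without any claim that it is calibrated.

```python
from difflib import SequenceMatcher
from typing import NamedTuple


class Proposal(NamedTuple):
    """Hypothetical stand-in for a Correspondence that carries a confidence."""

    source: str
    target: str
    confidence: float


def fuzzy_proposals(source_labels, target_labels):
    """Permissive matcher: score every pair and leave filtering to the caller."""
    target_labels = list(target_labels)
    return [
        Proposal(s, t, SequenceMatcher(None, s.lower(), t.lower()).ratio())
        for s in source_labels
        for t in target_labels
    ]


def select_best(proposals, threshold):
    """Keep the highest-confidence proposal per source at or above the threshold."""
    best = {}
    for p in proposals:
        if p.confidence < threshold:
            continue  # below the acceptance threshold
        current = best.get(p.source)
        if current is None or p.confidence > current.confidence:
            best[p.source] = p
    return best


proposals = fuzzy_proposals(["automobile", "boat"], ["automobiles", "truck"])
accepted = select_best(proposals, threshold=0.8)
```

The matcher emits all four candidate pairs, including weak ones; only the selection step discards them, which leaves "automobile" anchored and "boat" unanchored in this toy input.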

Example

A trivial matcher proposing an equivalence for an exact id match:

>>> from dataeval.types import Correspondence
>>> from dataeval.protocols import Matcher
>>>
>>> class IdMatcher:
...     def __call__(self, source, target):
...         target_ids = {c.id for c in target}
...         return [
...             Correspondence(source=c.id, target=c.id, relation="equivalent", matcher="id")
...             for c in source
...             if c.id in target_ids
...         ]
>>>
>>> isinstance(IdMatcher(), Matcher)
True