dataeval.Ontology
- class dataeval.Ontology(concepts)
An immutable, in-memory directed acyclic graph of OntologyConcept.
The graph is built from a collection of concepts linked by their parents (is-a edges). A concept may have more than one parent, so the graph is a DAG rather than a tree; cycles are rejected. Parent ids referencing concepts not present in the collection are kept as external references: they participate in ancestor/LCA queries but are not themselves concepts.
Once built, the graph is queryable for ancestors, descendants, siblings, lowest common ancestors, depth, and rooted subtrees, and resolves class names to concepts via find().
- Parameters:
- concepts : Iterable[OntologyConcept]
Concepts comprising the ontology. Ids must be unique.
- Raises:
OntologyError – If two concepts share an id.
OntologyCycleError – If the is-a graph contains a cycle.
See also
Ontology.from_rdf – Build from in-memory RDF/OWL/JSON-LD content.
Ontology.from_hierarchy – Build from a plain nested dict / list (no rdflib).
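The distinction between defined concepts, external references, and rejected cycles can be illustrated without the library. The following is a minimal sketch on a plain dict, not dataeval's implementation; the `parents` mapping and the `has_cycle` helper are hypothetical names chosen for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical toy data: concept id -> tuple of parent ids (is-a edges).
parents = {
    "sedan": ("car",),
    "car": ("vehicle",),
    "boat": ("vehicle",),
    "amphibious_car": ("car", "boat"),  # two parents: a DAG, not a tree
}

# "vehicle" is used as a parent but never defined, so it is an external reference.
referenced = {p for ps in parents.values() for p in ps}
external = sorted(referenced - parents.keys())


def has_cycle(parent_map):
    """Return True if the is-a edges contain a cycle."""
    try:
        # TopologicalSorter expects node -> predecessors, which fits child -> parents.
        tuple(TopologicalSorter(parent_map).static_order())
    except CycleError:
        return True
    return False


print(external)
print(has_cycle(parents))
print(has_cycle({"a": ("b",), "b": ("a",)}))
```

Here `external` holds only `"vehicle"`, the first map is acyclic, and the two-node map is reported as cyclic.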
- ancestors(concept_id)
Return all ancestor ids of a concept, nearest-first (breadth-first).
Ancestors are the concept’s transitive superclasses (broader concepts). May include external reference ids. Raises KeyError if concept_id is not a defined concept.
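A nearest-first traversal of this kind can be sketched as a breadth-first walk over the parent links. This is an illustrative stand-in on a plain dict, assuming a hypothetical `parents` mapping; it is not the library's code.

```python
from collections import deque

# Hypothetical toy data: concept id -> tuple of parent ids.
parents = {
    "sedan": ("car",),
    "car": ("vehicle",),
    "vehicle": ("thing",),  # "thing" is never defined: an external reference
}


def ancestors(concept_id, parent_map):
    """Return ancestor ids nearest-first, visiting each id once."""
    if concept_id not in parent_map:
        raise KeyError(concept_id)
    seen, order = set(), []
    queue = deque(parent_map[concept_id])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        order.append(current)
        # External references have no entry, so the walk stops there.
        queue.extend(parent_map.get(current, ()))
    return tuple(order)


print(ancestors("sedan", parents))
```

For `"sedan"` the walk yields `car`, then `vehicle`, then the external id `thing`.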
- children(concept_id)
Return the ids of the direct subclasses (children) of concept_id.
Children are the defined concepts that declare concept_id among their parents; order follows concept insertion order. Unlike descendants(), this is the immediate, non-transitive layer. Raises KeyError if concept_id is not a defined concept.
- concept(concept_id)
Return the concept for concept_id (raises KeyError if absent).
- depth_of(concept_id)
Return the length of the longest is-a path from a root to concept_id.
A concept with no parents has depth 0; a concept whose only parent is an external reference has depth 1. Raises KeyError if concept_id is not a defined concept.
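The longest-path rule is easiest to see on a small DAG where one concept is reachable by paths of different lengths. The sketch below assumes an external reference behaves like a depth-0 root, which matches the depth-1 case stated above; the `parents` data is hypothetical and this is not the library's implementation.

```python
from functools import lru_cache

# Hypothetical toy data: concept id -> tuple of parent ids.
parents = {
    "entity": (),
    "vehicle": ("entity",),
    "toy": ("entity",),
    "car": ("vehicle",),
    "toy_car": ("car", "toy"),   # reachable via a 3-edge and a 2-edge path
    "orphan": ("ext:missing",),  # only parent is an external reference
}


@lru_cache(maxsize=None)
def depth_of(concept_id):
    """Length of the longest is-a path from a root to concept_id."""
    if concept_id not in parents:
        raise KeyError(concept_id)
    own_parents = parents[concept_id]
    if not own_parents:
        return 0
    # Assumption: an undefined (external) parent is treated as a depth-0 root.
    return 1 + max(depth_of(p) if p in parents else 0 for p in own_parents)


print({cid: depth_of(cid) for cid in parents})
```

`toy_car` gets depth 3 because the path through `car` is longer than the path through `toy`.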
- descendants(concept_id)
Return all descendant concept ids of concept_id, nearest-first.
Descendants are the concept’s transitive subclasses (narrower concepts). Raises KeyError if concept_id is not a defined concept.
- find(name)
Resolve a human-readable name (or exact id) to matching concept ids.
Matching is case-insensitive over each concept’s preferred label and synonyms. An exact id match is also returned.
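The matching rule can be sketched on plain dicts. The concept records, the sorted-tuple return value, and the use of `str.casefold` are assumptions made for this illustration; the source does not state the return container or the exact normalization.

```python
# Hypothetical toy data: concept id -> preferred label and synonyms.
concepts = {
    "ex:Car": {"label": "Car", "synonyms": ("automobile", "auto")},
    "ex:Auto": {"label": "Auto rickshaw", "synonyms": ("auto",)},
}


def find(name):
    """Resolve a name (or exact id) to matching concept ids."""
    key = name.casefold()
    hits = set()
    for concept_id, record in concepts.items():
        names = (record["label"], *record["synonyms"])
        if any(key == candidate.casefold() for candidate in names):
            hits.add(concept_id)
    # An exact id match is returned as well (ids are compared as-is).
    if name in concepts:
        hits.add(name)
    return tuple(sorted(hits))


print(find("AUTO"))
print(find("ex:Car"))
print(find("truck"))
```

`"AUTO"` matches a synonym of both concepts, `"ex:Car"` matches by id only, and `"truck"` matches nothing.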
- classmethod from_hierarchy(data)
Build an Ontology from a plain, hand-authored hierarchy.
A dependency-free constructor for the common case where you don’t have an RDF/OWL file. Labels double as concept ids (no IRIs, synonyms, or definitions). Accepts:
- a flat list of labels: ["car", "dog"]
- a one-level mapping: {"car": ["sedan", "SUV"], "dog": None}
- an arbitrarily nested mapping: {"vehicle": {"car": {"sedan": None}}}
Mapping values may be None (leaf), a list of labels (children), or a nested mapping. A label appearing under more than one parent yields a DAG.
- Parameters:
- data : Mapping or Sequence
The hierarchy specification.
- Return type:
- Raises:
OntologyError – If a label is not a string or a node has an unexpected type.
OntologyCycleError – If the hierarchy contains a cycle.
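How the three accepted shapes reduce to the same set of is-a edges can be sketched with a small recursive walk. The `flatten` helper below is hypothetical and only mirrors the input rules described above; it raises `TypeError` where the real constructor is documented to raise OntologyError, and it does not check for cycles.

```python
from collections.abc import Mapping


def flatten(data, parent=None, out=None):
    """Flatten a hierarchy spec into {label: [parent labels]}."""
    out = {} if out is None else out
    if isinstance(data, Mapping):
        items = list(data.items())
    elif isinstance(data, (list, tuple)):
        items = [(label, None) for label in data]  # a list holds leaf labels
    else:
        raise TypeError(f"unexpected node type: {type(data).__name__}")
    for label, children in items:
        if not isinstance(label, str):
            raise TypeError(f"label must be a string, got {label!r}")
        label_parents = out.setdefault(label, [])
        if parent is not None and parent not in label_parents:
            label_parents.append(parent)
        if children is not None:
            flatten(children, label, out)
    return out


print(flatten(["car", "dog"]))
print(flatten({"car": ["sedan", "SUV"], "dog": None}))
print(flatten({"vehicle": {"car": {"sedan": None}}}))
print(flatten({"pet": ["dog"], "mammal": ["dog"]}))
```

In the last call `dog` ends up with two parents, which is the DAG case mentioned above.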
- classmethod from_rdf(source, *, format=None)
Build an Ontology from in-memory RDF content.
Parses already-in-memory serialized RDF (OWL/RDF-XML, Turtle, N-Triples, JSON-LD, …) via rdflib. This does not read files; callers should load file contents themselves and pass the text/bytes.
- classmethod from_rdflib(graph)
Build an Ontology from an in-memory rdflib.Graph.
Concepts are collected from subjects typed owl:Class/rdfs:Class/skos:Concept and from any subject of rdfs:subClassOf/skos:broader. For each: label is skos:prefLabel (falling back to rdfs:label), synonyms are skos:altLabel (plus a differing rdfs:label), parents are the IRI objects of rdfs:subClassOf/skos:broader, and definition is skos:definition. Blank-node superclasses (e.g. owl:Restriction) are ignored.
- is_a(a, b)
Return whether concept a is a (transitive) subclass of b.
Equivalently, whether b is an ancestor (superclass) of a. Raises KeyError if a is not a defined concept; b may be any id, including an external reference.
- lowest_common_ancestor(a, b)
Return a single lowest common ancestor of a and b, or None.
A deterministic projection of lowest_common_ancestors(): on a tree the LCA is unique; on a DAG with several incomparable lowest common ancestors this returns the deepest (the id with the most ancestors), ties broken by id. Use lowest_common_ancestors() to get the full set. Returns None when the two share no ancestor; may return an external reference id.
Raises KeyError if a or b is not a defined concept.
- lowest_common_ancestors(a, b)
Return all lowest common ancestors of a and b, id-sorted.
A common ancestor is an id in both concepts’ ancestor sets; a concept counts as an ancestor of itself, so the LCA of a concept and its descendant is the concept itself. A common ancestor is lowest when none of its own descendants is also a common ancestor. On a tree this is always a single id, but on a DAG two concepts may meet at several mutually incomparable points, so the result may hold more than one. May include an external reference id (the meeting point can lie outside the defined concepts). Returns an empty tuple when the two share no ancestor.
Raises KeyError if a or b is not a defined concept.
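The definition of "lowest" and the single-result projection can be sketched with set operations. This is an illustrative reading of the rules above on a hypothetical `parents` dict, not the library's implementation; the helper `_self_and_ancestors` is an invented name.

```python
# Hypothetical toy data: concept id -> tuple of parent ids.
parents = {
    "entity": (),
    "vehicle": ("entity",),
    "toy": ("entity",),
    "car": ("vehicle",),
    "toy_car": ("car", "toy"),
    "toy_boat": ("toy", "vehicle"),
    "island": (),  # shares no ancestor with the rest
}


def _self_and_ancestors(concept_id):
    """The id itself plus every id reachable through parent links."""
    seen, stack = {concept_id}, [concept_id]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def lowest_common_ancestors(a, b):
    for concept_id in (a, b):
        if concept_id not in parents:
            raise KeyError(concept_id)
    common = _self_and_ancestors(a) & _self_and_ancestors(b)
    # Anything strictly above another common ancestor is not "lowest".
    shadowed = set()
    for candidate in common:
        shadowed |= _self_and_ancestors(candidate) - {candidate}
    return tuple(sorted(common - shadowed))


def lowest_common_ancestor(a, b):
    lcas = lowest_common_ancestors(a, b)
    if not lcas:
        return None
    # Deepest first (most ancestors), ties broken by id.
    return min(lcas, key=lambda c: (-len(_self_and_ancestors(c)), c))


print(lowest_common_ancestors("toy_car", "toy_boat"))
print(lowest_common_ancestor("toy_car", "toy_boat"))
print(lowest_common_ancestors("car", "toy_car"))
print(lowest_common_ancestor("island", "car"))
```

`toy_car` and `toy_boat` meet at both `toy` and `vehicle`, which are incomparable and equally deep, so the single-result form falls back to id order and picks `toy`.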
- siblings(concept_id)
Return defined concepts sharing at least one parent with concept_id.
Excludes the concept itself. Siblings under an external (undefined) parent are included, so this works on subset ontologies. Raises KeyError if concept_id is not a defined concept.
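The shared-parent rule, including the external-parent case, can be sketched as follows. The `parents` data is hypothetical, and returning ids in insertion order is an assumption of this sketch; the source does not state the ordering or the return type.

```python
# Hypothetical toy data: "ext:car" is referenced but never defined.
parents = {
    "sedan": ("ext:car",),
    "suv": ("ext:car",),
    "boat": (),
    "kayak": ("boat",),
    "amphicar": ("ext:car", "boat"),
}


def siblings(concept_id):
    """Defined concepts sharing at least one parent with concept_id."""
    own_parents = set(parents[concept_id])  # KeyError if not defined
    return tuple(
        other
        for other, other_parents in parents.items()
        if other != concept_id and own_parents.intersection(other_parents)
    )


print(siblings("sedan"))
print(siblings("amphicar"))
print(siblings("boat"))
```

`sedan` and `suv` are siblings even though their shared parent `ext:car` is only an external reference.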
- subtree(concept_id)
Return a new Ontology rooted at concept_id.
Contains the concept and all its descendants; parent links pointing outside the subtree are pruned so concept_id becomes a root. Raises KeyError if concept_id is not a defined concept.
- subtree_ids(concept_id)
Return concept_id together with all its descendant ids (its subtree).
A lightweight id-set form of subtree(), for membership and disjointness tests that do not need a full sub-ontology. Raises KeyError if concept_id is not a defined concept.
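A membership or disjointness test over subtree id sets might look like the sketch below. It works on a hypothetical `parents` dict and returns a `frozenset`, which is an assumption; the source does not state the return type.

```python
# Hypothetical toy data: concept id -> tuple of parent ids.
parents = {
    "vehicle": (),
    "car": ("vehicle",),
    "boat": ("vehicle",),
    "sedan": ("car",),
    "kayak": ("boat",),
    "amphicar": ("car", "boat"),
}


def subtree_ids(root):
    """Return root together with all ids that descend from it."""
    if root not in parents:
        raise KeyError(root)
    # Invert the parent links once to get parent -> children.
    children = {concept_id: [] for concept_id in parents}
    for concept_id, concept_parents in parents.items():
        for parent in concept_parents:
            if parent in children:  # skip external references
                children[parent].append(concept_id)
    ids, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in ids:
                ids.add(child)
                stack.append(child)
    return frozenset(ids)


cars, boats = subtree_ids("car"), subtree_ids("boat")
print("sedan" in cars)
print(cars.isdisjoint(boats))
print(subtree_ids("sedan").isdisjoint(boats))
```

The `car` and `boat` subtrees overlap on `amphicar`, so they are not disjoint, while the `sedan` subtree shares nothing with `boat`.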
- property external_ids : tuple[str, ...]
Ids referenced as parents but not present as defined concepts.
These are external references: the ontology references them (e.g. it was distributed as a subset) but does not define them, so they have no label, definition, or further ancestors. Their presence means the is-a hierarchy is truncated at those points.
- property label_collisions : dict[str, tuple[str, ...]]
Case-folded names that resolve to more than one concept.
Each entry maps a normalized name (a preferred label or synonym shared across concepts) to the distinct concept ids find() would return for it; this is the artifact-side source of reconciliation ambiguity. Empty when every name resolves uniquely. Unlike find(), exact-id matches are not considered, since an id is unique by construction.
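The collision map can be sketched by indexing every case-folded label and synonym and keeping only the names claimed by more than one concept. The concept records below are hypothetical, and sorting the ids within each entry is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical toy data: concept id -> preferred label and synonyms.
concepts = {
    "ex:Car": {"label": "Car", "synonyms": ("auto",)},
    "ex:Auto": {"label": "Auto", "synonyms": ()},
    "ex:Dog": {"label": "Dog", "synonyms": ("hound",)},
}


def label_collisions(records):
    """Map each case-folded name shared by several concepts to their ids."""
    index = defaultdict(set)
    for concept_id, record in records.items():
        for name in (record["label"], *record["synonyms"]):
            index[name.casefold()].add(concept_id)
    # Ids are not indexed: an id is unique by construction.
    return {
        name: tuple(sorted(ids))
        for name, ids in index.items()
        if len(ids) > 1
    }


print(label_collisions(concepts))
```

Only `"auto"` is reported, because it is a synonym of one concept and the label of another.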