dataeval.core.LabelCoverageResult¶
- class dataeval.core.LabelCoverageResult¶
Observed distribution of a dataset’s label mass over an Ontology. Every per-concept mapping is keyed over all defined concepts, so an unlabeled concept appears with a zero/empty entry rather than being absent. That visibility of the unpopulated parts of the ontology is the whole point. All fields are observations; none assumes an expected distribution.
- matched¶
Dataset class name to the single concept id it resolved to. Distinct names may resolve to the same id (synonyms); their counts are summed downstream.
- unmatched¶
Class name to its count, for names that resolved to no concept — label mass the ontology does not cover (a missing concept, or a junk label).
- ambiguous¶
Class name to the more-than-one candidate concept ids it resolved to. Their counts are not attributed to any concept; resolve them upstream (e.g. by passing concept ids) to fold them into the coverage tallies.
- direct_count¶
Concept id to the label mass landing exactly on it (0 when unlabeled). Labels may land on internal concepts, not only leaves.
- subtree_count¶
Concept id to the mass on it plus all its descendants (its subtree). On a DAG, a multi-parent concept contributes to every ancestor’s subtree but is counted only once per ancestor.
- covered_leaves¶
Concept id to (covered, total) leaf species in its subtree, where a leaf is covered if it has any direct mass. The breadth-of-coverage signal at a glance: (0, n) is a wholly dark branch.
- covered_children¶
Concept id to (covered, total) direct children whose subtree holds any mass, i.e. sibling fill under each parent. Leaves report (0, 0).
- coverage_by_depth¶
Is-a depth to (covered, total) concepts at that depth, where covered means the concept’s subtree holds any mass. The depth profile of coverage.
- leaf_coverage¶
Fraction of the ontology’s leaf species with any direct mass — a single observed coverage scalar (no prior).
0.0 when the ontology has no leaves.
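
To make the field definitions above concrete, here is a minimal pure-Python sketch that computes analogues of direct_count, subtree_count, covered_leaves, covered_children and leaf_coverage on a toy ontology. It is not dataeval's implementation and does not call its API: the `CHILDREN` mapping, the `descendants` and `summarize` helpers, and the concept names are all hypothetical, chosen only for illustration. It also assumes that a leaf counts as part of its own subtree.

```python
# Hypothetical toy ontology: concept id -> list of direct child ids.
# "amphibious_vehicle" has two parents, so this is a DAG rather than a tree.
CHILDREN = {
    "vehicle": ["land_vehicle", "watercraft"],
    "land_vehicle": ["car", "amphibious_vehicle"],
    "watercraft": ["boat", "amphibious_vehicle"],
    "car": [],
    "boat": [],
    "amphibious_vehicle": [],
}


def descendants(concept):
    """Return the concept plus everything reachable below it, each id once."""
    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(CHILDREN[node])
    return seen


def summarize(label_counts):
    """Compute illustrative coverage tallies from concept id -> label count."""
    # Keyed over every defined concept, so unlabeled concepts show up as 0.
    direct = {c: label_counts.get(c, 0) for c in CHILDREN}

    # Using a set of descendants counts a multi-parent concept once per
    # ancestor, even when several paths lead to it.
    subtree = {c: sum(direct[d] for d in descendants(c)) for c in CHILDREN}

    leaves = {c for c, kids in CHILDREN.items() if not kids}

    covered_leaves = {}
    for c in CHILDREN:
        in_subtree = descendants(c) & leaves
        covered = sum(1 for leaf in in_subtree if direct[leaf] > 0)
        covered_leaves[c] = (covered, len(in_subtree))

    # A child is "covered" when its subtree holds any mass; leaves get (0, 0).
    covered_children = {
        c: (sum(1 for k in kids if subtree[k] > 0), len(kids))
        for c, kids in CHILDREN.items()
    }

    n_covered = sum(1 for leaf in leaves if direct[leaf] > 0)
    leaf_coverage = n_covered / len(leaves) if leaves else 0.0

    return direct, subtree, covered_leaves, covered_children, leaf_coverage


# Labels may land on internal concepts ("land_vehicle"), not only on leaves.
counts = {"car": 5, "amphibious_vehicle": 2, "land_vehicle": 1}
direct, subtree, covered_leaves, covered_children, leaf_coverage = summarize(counts)

print(subtree["vehicle"])         # 8: amphibious_vehicle is counted once, not twice
print(covered_leaves["boat"])     # (0, 1): a dark leaf
print(covered_children["watercraft"])  # (1, 2)
```

In this toy case two of the three leaves carry direct mass, so the leaf coverage scalar is 2/3. The `boat` entry stays visible with a zero count instead of disappearing, which is the behaviour the class description emphasizes.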