dataeval.scope.RepresentationOutput
- class dataeval.scope.RepresentationOutput(data, *, leaf_coverage, total_deficit, violations, dark_branches)
A dataset’s collection worklist against an Ontology.
The wrapped DataFrame is the worklist itself: one row per leaf species short of its expected share, sorted by deficit (largest first), with columns concept, label, parent, action ("acquire" for unrepresented species, "augment" for under-represented ones), count, target, and deficit. The summary scalars and supporting frames hang off it as attributes.
- leaf_coverage
Fraction of the ontology’s leaf species with any examples (carried through from
label_coverage(); observation, not policy).
- total_deficit
Sum of all positive deficits — an estimate of how many labels the dataset is short of its expected distribution. The single budgeting number.
- violations
One row per asserted class (from expected) whose observed share falls below its floor, with columns concept, label, floor, actual, and shortfall. Empty when no assertions were made or all held.
- Type:
polars.DataFrame
- dark_branches
Maximal wholly-unpopulated internal branches, largest first, with columns concept, label, and leaves (leaf species under that branch). The branch-level headline above the per-species worklist.
- Type:
polars.DataFrame
- data()
Return the output data as a polars DataFrame.
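
To make the relationship between the worklist columns and the summary scalars concrete, here is a minimal pure-Python sketch. It is not dataeval's implementation and does not call the library: the helper build_worklist, the sample species, and the rule deficit = target - count are assumptions made for illustration, inferred from the column descriptions above.

```python
# Illustrative only: a hypothetical re-creation of the worklist logic
# described above, using plain dicts instead of a polars DataFrame.

def build_worklist(counts, targets):
    """Return one row per leaf species that is short of its target.

    counts and targets map a leaf label to an integer number of examples.
    Assumes deficit = target - count, which the documentation does not
    state explicitly.
    """
    rows = []
    for label, target in targets.items():
        count = counts.get(label, 0)
        deficit = target - count
        if deficit <= 0:
            continue  # species meets or exceeds its target; not on the worklist
        rows.append(
            {
                "label": label,
                # "acquire" when there are no examples at all, else "augment"
                "action": "acquire" if count == 0 else "augment",
                "count": count,
                "target": target,
                "deficit": deficit,
            }
        )
    # Largest deficit first, matching the documented sort order
    rows.sort(key=lambda row: row["deficit"], reverse=True)
    return rows


# Hypothetical leaf species with observed counts and expected targets
counts = {"sparrow": 40, "finch": 5, "owl": 0, "heron": 12}
targets = {"sparrow": 30, "finch": 20, "owl": 10, "heron": 12}

worklist = build_worklist(counts, targets)

# Sum of all positive deficits: the single budgeting number
total_deficit = sum(row["deficit"] for row in worklist)

# Fraction of leaf species with any examples
leaf_coverage = sum(1 for label in targets if counts.get(label, 0) > 0) / len(targets)

for row in worklist:
    print(row)
print("total_deficit:", total_deficit)
print("leaf_coverage:", leaf_coverage)
```

With these sample numbers, finch (short by 15, "augment") sorts ahead of owl (short by 10, "acquire"), the total deficit is 25, and three of the four leaves have examples, giving a coverage of 0.75. In the real class the same information would be read from the wrapped DataFrame and the leaf_coverage and total_deficit attributes.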