dataeval.scope.CoverageOutput
- class dataeval.scope.CoverageOutput(data, *, uncovered_indices, coverage_radius, critical_value_radii)
A dataset’s per-class embedding-space coverage.
The wrapped DataFrame is the per-class breakdown, with one row per class. Rows are sorted with the lowest-dispersion assessable classes first (the ones most worth collecting varied data for) and unassessable classes last. Its columns are class, count, uncovered, uncovered_fraction, dispersion, isotropy, near_duplicate_fraction, and assessable. isotropy and near_duplicate_fraction are null for classes below the sample floors that make them meaningful. The sample-level coverage detail is available as attributes on the output.
- uncovered_indices
Indices of individual samples that sit in under-sampled regions of the full embedding space (from coverage_adaptive()/coverage_naive()).
Type: NDArray[np.intp]
- critical_value_radii
Per-sample distance to the num_observations-th nearest neighbor. This is the raw density signal from which the uncovered set is thresholded.
Type: NDArray[np.float32]
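To make the relationship between the radii and the uncovered set concrete, here is a minimal NumPy sketch. It is not DataEval's implementation: the helper name kth_neighbor_radii, the brute-force distance computation, the choice to exclude each sample from its own neighbor count, and the fixed cutoff value are all assumptions made for illustration. The actual thresholding rule is decided by coverage_adaptive() or coverage_naive().

```python
import numpy as np


def kth_neighbor_radii(embeddings: np.ndarray, k: int) -> np.ndarray:
    """Distance from each sample to its k-th nearest neighbor (brute force).

    Hypothetical helper for illustration only; `k` plays the role of
    num_observations, and a sample is assumed not to count as its own neighbor.
    """
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.sqrt((diffs**2).sum(axis=-1))
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest other sample.
    return np.sort(dists, axis=1)[:, k].astype(np.float32)


# Four tightly packed points and one isolated point.
embeddings = np.array(
    [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
)

radii = kth_neighbor_radii(embeddings, k=2)

# Assumed cutoff: a larger radius means a sparser neighborhood.
cutoff = 2.0
uncovered = np.flatnonzero(radii > cutoff)

print(radii)
print(uncovered)
```

In this toy data only the isolated point has a large radius, so it is the only index flagged. The pairwise distance matrix costs O(n²) memory, so this form is suitable only for small arrays.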
- data()
Return the output data as a polars DataFrame.