dataeval

DataEval provides a simple interface to characterize visual data and its impact on model performance.

It works across classification and object-detection tasks. It also provides tools to select and curate datasets for training and testing performant, robust, unbiased, and reliable AI models, and to monitor for data shifts that degrade the performance of deployed models.

Submodules

bias

Check for skewed or imbalanced datasets and incomplete feature representation.

config

Global configuration settings for DataEval.

core

Core stateless functions for performing dataset, metadata and model evaluation.

data

Dataset organization tools: conform, filter, split, and reshape dataset views.

exceptions

Exception and warning classes for DataEval.

extractors

Feature extractors that transform input data into arrays.

flags

Module for flag enums that control function behavior.

models

MAITE Model implementations for opinionated ONNX/LiteRT prediction.

performance

Determine whether a problem is feasible and how much data is needed.

protocols

Common type protocols used for interoperability with DataEval.

quality

Identify potential issues in training and test data.

scope

Evaluate data completeness and coverage of a dataset’s label and embedding space.

shift

Detect changes in data between different datasets.

types

Data types used in DataEval.

utils

DataEval utilities organized by domain.

Classes

Embeddings

Collection of image embeddings from a dataset.

Metadata

Collection of binned metadata using Polars DataFrames.

Ontology

An immutable, in-memory directed acyclic graph of OntologyConcept.

Functions

log

Quickly add a handler to the logger for debugging.
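As a rough illustration of what a helper like this typically does, the sketch below attaches a handler to a named logger using only Python's standard `logging` module. It does not reproduce DataEval's implementation: the function name `attach_debug_handler`, its parameters, and the assumption that the library logs under the name `"dataeval"` are all hypothetical. Consult the `log` reference page for the actual signature.

```python
import io
import logging


def attach_debug_handler(name="dataeval", level=logging.DEBUG, handler=None):
    """Hypothetical helper: attach a handler to the named logger and set its level."""
    logger = logging.getLogger(name)  # assumed logger name, not confirmed by the docs
    if handler is None:
        handler = logging.StreamHandler()  # writes to stderr by default
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler


# Capture log output in memory instead of printing to stderr.
buffer = io.StringIO()
handler = attach_debug_handler(handler=logging.StreamHandler(buffer))

# Child loggers propagate their records up to the "dataeval" logger's handlers.
logging.getLogger("dataeval.example").debug("embedding cache miss")

# Detach the handler once debugging is finished.
logging.getLogger("dataeval").removeHandler(handler)
```

Returning the handler lets the caller remove it later, which keeps a temporary debugging session from leaving duplicate handlers on the logger.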