API Reference

DataEval’s API is split into several submodules, each supporting a specific goal; they are detailed below. By design, the base module is empty except for __version__.
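Because __version__ is the only attribute the base module exposes, a quick way to confirm which release is installed is to read it. The snippet below is a minimal sketch; the helper name get_dataeval_version is hypothetical, and it returns None instead of failing when DataEval is not installed.

```python
import importlib


def get_dataeval_version():
    """Return dataeval.__version__, or None if DataEval cannot be imported.

    Hypothetical helper for illustration; it is not part of DataEval.
    """
    try:
        module = importlib.import_module("dataeval")
    except ImportError:
        return None
    return getattr(module, "__version__", None)


version = get_dataeval_version()
print(version if version is not None else "DataEval is not installed")
```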

Submodules

dataeval.detectors

Detectors determine whether a dataset, or individual images within it, show signs of a specific issue.

dataeval.detectors.drift

Drift detectors identify whether the statistical properties of the data have changed.
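To make the idea of a changed statistical property concrete, the sketch below computes the two-sample Kolmogorov-Smirnov statistic for a single feature in plain Python. It is a generic illustration, not DataEval's API or implementation; ks_statistic is a hypothetical helper and the sample values are made up.

```python
from bisect import bisect_right


def ks_statistic(reference, test):
    """Largest gap between the empirical CDFs of two 1-D samples.

    Hypothetical helper for illustration; it is not part of DataEval.
    """
    if not reference or not test:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    new = sorted(test)
    # The gap between the two step functions can only change at observed
    # values, so evaluating it at every observed value is sufficient.
    return max(
        abs(bisect_right(ref, x) / len(ref) - bisect_right(new, x) / len(new))
        for x in ref + new
    )


reference = [0.1, 0.2, 0.3, 0.4]
same = ks_statistic(reference, reference)
shifted = ks_statistic(reference, [0.6, 0.7, 0.8, 0.9])
```

A statistic near 0 suggests the two samples follow similar distributions, while a value near 1 suggests a strong shift. Turning the statistic into a drift decision requires a significance test or a threshold, which this sketch leaves out.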

dataeval.detectors.linters

Linters help identify potential issues in training and test data and are an important aspect of data cleaning.
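Exact duplicates are one example of a data issue that cleaning typically targets. The sketch below groups byte-identical items by hash using only the standard library. It is an assumption-laden illustration rather than DataEval's linter API; find_exact_duplicates is a hypothetical helper, and the byte strings stand in for encoded images.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(images):
    """Return groups of indices whose byte content is identical.

    Hypothetical helper for illustration; it is not part of DataEval.
    """
    groups = defaultdict(list)
    for index, data in enumerate(images):
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append(index)
    # Keep only hashes shared by more than one item.
    return [indices for indices in groups.values() if len(indices) > 1]


# Placeholder byte strings standing in for image files.
images = [b"\x00\x01\x02", b"\xff\xfe", b"\x00\x01\x02"]
duplicates = find_exact_duplicates(images)
```

Hashing only catches byte-for-byte copies; near duplicates such as resized or re-encoded images need a perceptual comparison, which this sketch does not attempt.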

dataeval.detectors.ood

Out-of-distribution (OOD) detectors identify data that is different from the data used to train a particular model.

dataeval.metrics

Metrics measure the performance of your models or datasets; the results can then be analyzed in the context of a given problem.

dataeval.metrics.bias

Bias metrics check for skewed or imbalanced datasets and incomplete feature representation, which may impact model performance.
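As a rough illustration of what an imbalance check measures, the sketch below scores a label list by its normalized Shannon entropy: 1.0 means every class appears equally often, and lower values mean a more skewed class distribution. This is a generic measure written in plain Python, not DataEval's bias API; label_balance is a hypothetical helper.

```python
from collections import Counter
from math import log


def label_balance(labels):
    """Normalized Shannon entropy of the class distribution, in [0, 1].

    Hypothetical helper for illustration; it is not part of DataEval.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        # Balance is not meaningful with fewer than two classes;
        # returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    # Divide by the maximum possible entropy for this number of classes.
    return entropy / log(len(counts))


balanced = label_balance(["cat", "dog", "cat", "dog"])
skewed = label_balance(["cat"] * 9 + ["dog"])
```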

dataeval.metrics.estimators

Estimators calculate performance bounds and the statistical distance between datasets.

dataeval.metrics.stats

Statistics metrics calculate a variety of image properties, pixel statistics, and label statistics from the images and labels of a dataset.
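The sketch below shows the kind of per-image pixel statistics such a metric might report, using only the standard library. The choice of statistics, the dictionary keys, and the pixel_stats helper are assumptions made for illustration; they do not reflect the outputs of dataeval.metrics.stats.

```python
from statistics import mean, pstdev


def pixel_stats(image):
    """Basic statistics over a grayscale image given as a list of rows.

    Hypothetical helper for illustration; it is not part of DataEval.
    """
    pixels = [value for row in image for value in row]
    return {
        "mean": mean(pixels),
        "std": pstdev(pixels),  # population standard deviation
        "min": min(pixels),
        "max": max(pixels),
    }


# A tiny 2x2 grayscale image with 8-bit intensities.
image = [[0, 64], [128, 255]]
stats = pixel_stats(image)
```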

dataeval.utils

DataEval provides utility classes and functions to assist users in setting up architectures that are guaranteed to work with applicable DataEval metrics.

dataeval.utils.tensorflow

TensorFlow models are used in out-of-distribution detectors in the dataeval.detectors.ood module.

dataeval.utils.torch

PyTorch is the primary backend for metrics that require neural networks.

dataeval.workflows

Workflows perform a sequence of actions to analyze the dataset and make predictions.