dataeval.detectors.linters

Linters help identify potential issues in training and test data and are an important aspect of data cleaning.

Classes

Clusterer

Uses hierarchical clustering to flag dataset properties of interest, such as outliers and duplicates.

ClustererOutput

Output class for the Clusterer lint detector.

Duplicates

Finds duplicate images in a dataset, using xxhash to detect exact duplicates.

DuplicatesOutput

Output class for the Duplicates lint detector.

Outliers

Identifies statistical outliers in a dataset by applying various statistical tests to each image.

OutliersOutput

Output class for the Outliers lint detector.
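
The ideas behind two of the linters listed above, exact-duplicate detection by hashing and per-image statistical outlier tests, can be illustrated with a minimal standard-library sketch. This is not the dataeval API: the function names `find_exact_duplicates` and `find_brightness_outliers` are hypothetical, `hashlib.blake2b` stands in for xxhash so that no third-party package is needed, and mean brightness with a modified z-score is only an assumed example of a per-image statistic and test. Images are represented as `bytes` objects of 8-bit grayscale pixels.

```python
import hashlib
import statistics
from collections import defaultdict


def find_exact_duplicates(images):
    """Group the indices of images whose raw bytes are identical."""
    groups = defaultdict(list)
    for index, pixels in enumerate(images):
        # Assumption: blake2b replaces xxhash here; any stable hash of the
        # raw bytes serves the same purpose for exact matching.
        digest = hashlib.blake2b(pixels, digest_size=16).hexdigest()
        groups[digest].append(index)
    # A hash seen more than once marks a group of exact duplicates.
    return [indices for indices in groups.values() if len(indices) > 1]


def find_brightness_outliers(images, threshold=3.5):
    """Flag images whose mean brightness has a large modified z-score."""
    means = [sum(pixels) / len(pixels) for pixels in images]
    median = statistics.median(means)
    # Median absolute deviation: a spread estimate that extreme values
    # barely influence.
    mad = statistics.median(abs(m - median) for m in means)
    if mad == 0:
        return []  # No spread to measure against.
    # 0.6745 and 3.5 are the conventional constants for this test.
    return [
        index
        for index, m in enumerate(means)
        if 0.6745 * abs(m - median) / mad > threshold
    ]


# Hypothetical toy dataset: six 4-pixel grayscale "images".
images = [
    bytes([100, 102, 98, 101]),
    bytes([100, 102, 98, 101]),   # exact copy of image 0
    bytes([99, 103, 97, 100]),
    bytes([101, 100, 102, 99]),
    bytes([104, 101, 99, 100]),
    bytes([250, 255, 251, 253]),  # much brighter than the rest
]

print(find_exact_duplicates(images))     # [[0, 1]]
print(find_brightness_outliers(images))  # [5]
```

Hashing the raw bytes only catches byte-for-byte copies; a resized or re-encoded copy of an image would need a different technique. The median-based test is used in the sketch because a single extreme image shifts the median and MAD far less than it shifts the mean and standard deviation.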