dataeval.detectors.linters

Linters help identify potential issues in training and test data and are an important aspect of data cleaning.

Classes

Duplicates

Finds duplicate images in a dataset, using xxhash to detect exact duplicates and pchash to detect near duplicates.
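
To illustrate the idea behind exact-duplicate detection, here is a minimal sketch that groups images by a hash of their raw bytes. It is not the DataEval implementation or API: the function name `find_exact_duplicates` is hypothetical, the images are stand-in byte strings, and the standard library's `hashlib.blake2b` is swapped in for xxhash so the example runs without extra dependencies. Near-duplicate detection with a perceptual hash is not covered here.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(images):
    """Group indices of byte-identical images (hypothetical helper).

    `images` is a sequence of bytes objects standing in for raw image data.
    hashlib.blake2b is an assumption used in place of xxhash.
    """
    groups = defaultdict(list)
    for index, data in enumerate(images):
        digest = hashlib.blake2b(data, digest_size=16).hexdigest()
        groups[digest].append(index)
    # Keep only the digests shared by more than one image.
    return [indices for indices in groups.values() if len(indices) > 1]


# Toy "images": indices 0 and 2 match, as do indices 1 and 3.
images = [b"\x00\x01\x02", b"\xff\xfe", b"\x00\x01\x02", b"\xff\xfe", b"\x10"]
duplicate_groups = find_exact_duplicates(images)
print(duplicate_groups)
```

Hashing keeps the comparison linear in the number of images, because each image is reduced to a short digest once instead of being compared against every other image.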

Outliers

Identifies statistical outliers in a dataset by applying statistical tests to each image.
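
As a rough illustration of one such statistical test, the sketch below computes a single per-image statistic (mean brightness) and flags images whose z-score exceeds a threshold. Everything here is an assumption made for the example: the helper names `mean_brightness` and `zscore_outlier_indices`, the choice of statistic, the threshold of 2.0, and the toy data. It does not reflect which tests or defaults the Outliers class actually uses.

```python
import statistics


def mean_brightness(image):
    """Average pixel intensity of a flat list of pixel values (hypothetical helper)."""
    return sum(image) / len(image)


def zscore_outlier_indices(values, threshold=2.0):
    """Return indices whose absolute z-score exceeds `threshold`.

    The z-score test and the default threshold are assumptions for this sketch.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # All values identical, so nothing can be an outlier.
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


# Toy dataset: nine similar "images" and one much brighter image at index 9.
levels = [100, 102, 98, 101, 99, 100, 103, 97, 100, 250]
images = [[level] * 4 for level in levels]

brightness = [mean_brightness(image) for image in images]
outlier_indices = zscore_outlier_indices(brightness)
print(outlier_indices)
```

A z-score is easy to reason about but is itself sensitive to extreme values, so a more robust variant (for example, one based on the median) may be preferable on small or heavily skewed datasets.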

Output Classes

DuplicatesOutput

Output class for the Duplicates lint detector.

OutliersOutput

Output class for the Outliers lint detector.