dataeval.core
Core stateless functions for performing dataset, metadata and model evaluation.
Functions
|
An estimator for the multi-class Bayes error rate using a KNN test statistic basis. |
|
An estimator for the multi-class Bayes error rate using a Friedman-Rafsky (FR) minimum spanning tree (MST) test statistic basis. |
|
Compute specified statistics on a set of images, optionally within bounding boxes. |
|
Calculate box-to-image ratios from calculate() output. |
|
Uses hierarchical clustering on the flattened data and returns clustering |
|
Compute cluster centers and distance statistics for adaptive outlier detection. |
|
For each sample in data_query, compute the k nearest neighbors in data_fit. |
|
Evaluate coverage using an adaptive radius calculation method. |
|
Evaluate coverage using a naive radius calculation method. |
|
Compute difference hash (dHash) for an image. |
|
Compute orientation-invariant difference hash using gradients. |
|
Calculates the divergence by counting the label disagreements between nearest neighbors |
|
Calculates the divergence by counting the number of "between dataset" edges in the |
|
Determine greatest deviation in metadata features per sample. |
|
Computes mutual information between metadata factors and flagged sample indices. |
|
Measures the feature-wise distance between two continuous distributions and computes a |
|
Identifies potential label errors in a dataset using embedding geometry. |
|
Calculate the chi-square statistic to assess the parity between expected and observed label distributions. |
|
Calculates statistics for data labels. |
|
Compute the minimum spanning tree of a dataset. |
|
Mutual information between factors (class label, metadata, label/image properties), |
|
Mutual information (MI) between factors (class label, metadata, label/image properties), |
|
Calculates accuracy from binary classification results. |
|
Calculates FPR (False Positive Rate) from binary classification results. |
|
Calculate null model metrics (dummy classifier metrics) for given class distributions. |
|
Calculates precision from binary classification results. |
|
Calculates recall (True Positive Rate) from binary classification results. |
|
Calculate statistical parity using Bias-Corrected Cramér's V. |
|
Compute perceptual hash using Discrete Cosine Transform (DCT). |
|
Compute orientation-invariant perceptual hash using DCT. |
|
Rank samples using HDBSCAN cluster complexity weighting. |
|
Rank samples using distance to HDBSCAN cluster centers. |
|
Rank samples using cluster complexity weighting. |
|
Rank samples using distance to cluster centers. |
|
Rank samples using k-nearest neighbors distance. |
|
Transform RankResult indices using class-balanced selection. |
|
Transform RankResult indices using stratified sampling. |
|
FR test statistic based estimate of the empirical mean precision for the upper-bound average precision. |
|
Compute fast non-cryptographic hash using xxHash algorithm. |
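
The accuracy, FPR, precision, and recall entries above all derive from binary classification results. As a minimal sketch of the standard confusion-matrix definitions, the snippet below computes all four from raw counts. The function name `binary_metrics`, its parameters, and the zero-denominator convention are hypothetical choices made for illustration; they are not the signatures or behavior of the `dataeval.core` functions.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration only; not part of dataeval.
    """

    def ratio(numerator: int, denominator: int) -> float:
        # Assumption: report 0.0 when a metric is undefined (zero denominator).
        return numerator / denominator if denominator else 0.0

    return {
        # Fraction of all predictions that are correct.
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        # Fraction of actual negatives predicted positive.
        "fpr": ratio(fp, fp + tn),
        # Fraction of predicted positives that are actually positive.
        "precision": ratio(tp, tp + fp),
        # Fraction of actual positives predicted positive (true positive rate).
        "recall": ratio(tp, tp + fn),
    }


metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

With these counts, accuracy is 85/100, FPR is 10/55, precision is 40/50, and recall is 40/45.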