Functional Overview
The following tables summarize the recommended use cases and technical requirements for the algorithms provided by the DataEval library. Each algorithm targets a different type of data or problem domain. For more detail, follow the link on an algorithm's name to its method-specific page.
Computer Vision Task Compatibility
The following tables show which computer vision tasks each DataEval algorithm supports. The tables are grouped by usage category and follow the structure of DataEval’s public API.

| Algorithm | Description | Image Classification | Object Detection | Unsupervised |
|---|---|---|---|---|
|  | Assesses the mutual information between factors | ✔ | ✔ |  |
|  | Determines feasibility of image classification by estimating the Bayes error rate | ✔ |  |  |
|  | Computes statistical summaries of target boxes |  | ✔ |  |
|  | Measures the degree to which images span the learned embedding space | ✔ | ✔ | ✔ |
|  | Measures how well the distribution of images in a dataset covers the input space | ✔ | ✔ | ✔ |
|  | Computes statistical summaries of image and target box dimensions | ✔ | ✔ | ✔ |
|  | Measures the difference between dataset distributions | ✔ | ✔ | ✔ |
|  | Measures the distribution of metadata factors in the dataset | ✔ | ✔ | ✔ |
|  | Computes statistical summaries of images in a dataset | ✔ | ✔ | ✔ |
|  | Assesses equivalence in label frequency between datasets | ✔ | ✔ |  |
|  | Computes statistical summaries of labels in a dataset | ✔ | ✔ |  |
|  | Calculates performance metrics for random classifiers on training and testing labels based on the class distributions | ✔ | ✔ |  |
|  | Detects if there is a significant relationship between the factor values and class labels | ✔ | ✔ |  |
|  | Determines feasibility of an object detection task by estimating upper bound on average precision |  | ✔ |  |

| Algorithm | Description | Image Classification | Object Detection | Unsupervised |
|---|---|---|---|---|
|  | Detects data distribution shifts from training data | ✔ | ✔ | ✔ |
|  | Identifies duplicate data entries | ✔ | ✔ | ✔ |
|  | Detects data points that fall outside the training distribution | ✔ | ✔ | ✔ |
|  | Identifies anomalous data points based on deviations from mean | ✔ | ✔ | ✔ |

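The last detector above flags data points by how far they deviate from the mean. As a rough illustration of that idea only, and not DataEval's implementation, the sketch below applies a plain z-score test to a list of per-image values. The function name `zscore_outliers`, the threshold of 3, and the sample brightness values are all hypothetical.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return indices of values whose absolute z-score exceeds `threshold`.

    Hypothetical helper for illustration; not part of DataEval.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing deviates from the mean.
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up per-image mean brightness values; the last image is much brighter.
brightness = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50,
              0.53, 0.47, 0.50, 0.51, 0.49, 0.98]
flagged = zscore_outliers(brightness, threshold=3.0)
print(flagged)
```

A single fixed threshold is the simplest choice; robust alternatives (for example, median-based scores) are less sensitive to the outliers themselves inflating the standard deviation.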
| Algorithm | Description | Image Classification | Object Detection | Unsupervised |
|---|---|---|---|---|
|  | Identifies the metadata factors that deviate most for detected out-of-distribution samples | ✔ | ✔ | ✔ |
|  | Measures the most impactful factors for detected out-of-distribution samples | ✔ | ✔ | ✔ |

| Algorithm | Description | Image Classification | Object Detection | Unsupervised |
|---|---|---|---|---|
|  | Determines data needs for performance standards | ✔ | ✔ |  |

| Algorithm | Description | Image Classification | Object Detection | Unsupervised |
|---|---|---|---|---|
|  | Generates train, val, and test splits based on information such as labels and metadata | ✔ | ✔ | ✔ |
|  | A set of dataset filters that enable rapid development of various datasets | ✔ | ✔ | ✔ |

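To make the idea of label-based train, val, and test splits concrete, here is a minimal sketch of a stratified split in plain Python. It is an assumption-laden illustration, not DataEval's splitting routine: the function name `stratified_split`, the 70/15/15 fractions, and the seed are hypothetical choices.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split sample indices into train/val/test, preserving class balance.

    Hypothetical helper for illustration; not part of DataEval.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        # Whatever remains goes to the test split.
        test += indices[n_train + n_val:]
    return train, val, test


labels = ["cat"] * 20 + ["dog"] * 20
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Splitting each class separately keeps the class proportions roughly equal across the three splits, which is the main reason to split on labels rather than on raw indices.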
Input Requirements
The following tables show the inputs used by each of DataEval’s core functionalities.
For more information on a specific algorithm, click the name in the table.
For an overview, see the metrics page.
Algorithm |
Images |
Labels |
Bounding Boxes |
Metadata |
Scores |
|---|---|---|---|---|---|
|
Required |
Required |
|||
|
Required¹ |
Required |
|||
|
Required |
Required |
|||
Required¹ |
|||||
|
Required¹ |
||||
|
Required² |
||||
|
Required¹ |
||||
|
Required |
Required |
|||
|
Required² |
||||
Required |
|||||
|
Required |
||||
|
Required |
||||
Required |
Required |
||||
Required |
Required⁴ |
For more information on a specific algorithm, click the name in the table.
For an overview, see the detectors page.
Algorithm |
Images |
Labels |
Bounding Boxes |
Metadata |
Scores |
|---|---|---|---|---|---|
|
Required |
||||
Required² |
|||||
|
Required |
||||
Required |
For more information on a specific algorithm, click the name in the table.
For an overview, see the metadata page.
Algorithm |
Images |
Labels |
Bounding Boxes |
Metadata³ |
Scores⁵ |
|---|---|---|---|---|---|
|
Required |
Required |
|||
|
Required |
Required |
For more information on a specific algorithm, click the name in the table.
For an overview, see the workflows page.
Algorithm |
Images |
Labels |
Bounding Boxes |
Metadata |
Scores |
Model |
|---|---|---|---|---|---|---|
Required |
Required |
OD Only |
For more information on a specific algorithm, click the name in the table.
For an overview, see the data page.
Algorithm |
Images |
Labels |
Bounding Boxes |
Metadata |
Scores |
Model |
|---|---|---|---|---|---|---|
Optional |
Optional³ |
|||||
Optional |
Optional |
Optional |
Optional |
Note
¹ Passing embeddings, generated with Embeddings, is highly recommended over passing raw images.
² Input data must be wrapped together in a Dataset.
³ When only metadata is used, it must be wrapped in DataEval’s Metadata class.
⁴ These scores are the raw outputs of a model.
⁵ These scores are produced by DataEval’s out-of-distribution functions.
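As a rough picture of what wrapping input data "together in a Dataset" can look like, the sketch below bundles images, labels, and per-image metadata behind a single indexable object. It assumes a map-style dataset whose items are `(image, label, metadata)` tuples; the class name `InMemoryDataset` and the metadata keys are hypothetical, and the exact protocol DataEval expects should be checked against its Dataset documentation.

```python
class InMemoryDataset:
    """Hypothetical map-style dataset bundling images, labels, and metadata.

    Illustration only; not DataEval's Dataset class.
    """

    def __init__(self, images, labels, metadata):
        if not (len(images) == len(labels) == len(metadata)):
            raise ValueError("images, labels, and metadata must have equal length")
        self._images = images
        self._labels = labels
        self._metadata = metadata

    def __len__(self):
        return len(self._images)

    def __getitem__(self, index):
        # One datum = (image, label, per-image metadata dict).
        return self._images[index], self._labels[index], self._metadata[index]


# Two tiny 1x2x2 "images" stored as nested lists (channels, height, width).
images = [[[[0.0, 0.1], [0.2, 0.3]]], [[[0.9, 0.8], [0.7, 0.6]]]]
labels = [0, 1]
metadata = [{"time_of_day": "day"}, {"time_of_day": "night"}]

dataset = InMemoryDataset(images, labels, metadata)
image, label, meta = dataset[1]
print(len(dataset), label, meta)
```

Keeping the three inputs in one object guarantees that an image, its label, and its metadata always stay aligned by index.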