DataEval

DataEval analyzes datasets and models so that users can train and test performant, unbiased, and reliable AI models, and monitor data for shifts that affect deployed models.

Our Mission

DataEval is an effective, powerful, and reliable set of tools for any test and evaluation (T&E) engineer. Throughout all stages of the machine learning lifecycle, DataEval supports model development, data analysis, and monitoring with state-of-the-art algorithms to help you solve difficult problems. With a focus on computer vision tasks, DataEval provides simple but effective metrics for performance estimation, bias detection, and dataset cleaning.

DataEval is easy to install, supports a wide range of Python versions, and is compatible with many of the most popular packages in the scientific and T&E communities.

DataEval also interoperates natively with JATIC’s suite of tools when using MAITE-compliant datasets and models.

Key Features

DataEval empowers professionals across domains with tools designed to enhance their workflows. Explore capabilities specific to your role:

T&E Engineer

Strengthen your evaluation process with tools designed for accuracy and reliability:

Robust Testing: Reduce errors with metrics that work reliably with state-of-the-art image classification and object detection datasets.
Post-Deployment Monitoring: Keep models on track with easy-to-implement logging of Operational Drift metrics.
Responsive Metrics: Optimize evaluation with tailored guidance for error assessment and retraining.

📖 Learn more about tools for Test & Evaluation Engineers.

ML Engineer

Accelerate model development with powerful insights for training and deployment:

Model-Specific Metrics: Evaluate dataset Sufficiency and detect data/model complexity mismatches.
Performance Optimization: Establish bounds on real-world model performance for improved training strategies.
Drift Detection: Rapidly diagnose model degradation under Operational Drift to maintain model accuracy and stability.

📖 Learn more about tools for Machine Learning Engineers.
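
As a rough illustration of the idea behind drift detection, the sketch below compares one scalar feature from a reference set with the same feature from operational data, using a two-sample Kolmogorov-Smirnov statistic. It is not DataEval's API: the function name `ks_statistic`, the synthetic data, and the choice of this particular test are all assumptions made for the example, and it uses only the Python standard library.

```python
import random
from bisect import bisect_right


def ks_statistic(reference, operational):
    """Two-sample Kolmogorov-Smirnov statistic (hypothetical helper).

    Returns the largest gap between the empirical CDFs of the two samples:
    0.0 means the samples are indistinguishable, 1.0 means fully separated.
    """
    ref = sorted(reference)
    ops = sorted(operational)
    gap = 0.0
    # The maximum gap between two step functions occurs at a sample point.
    for x in ref + ops:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_ops = bisect_right(ops, x) / len(ops)
        gap = max(gap, abs(cdf_ref - cdf_ops))
    return gap


# Synthetic stand-ins for one feature (assumed data, for illustration only).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]
same_dist = [rng.gauss(0.0, 1.0) for _ in range(500)]
shifted = [rng.gauss(1.5, 1.0) for _ in range(500)]

stat_same = ks_statistic(reference, same_dist)
stat_shifted = ks_statistic(reference, shifted)
print(f"no drift: {stat_same:.3f}  drifted: {stat_shifted:.3f}")
```

A larger statistic on the shifted sample suggests that the operational data no longer follows the reference distribution. Real image data would first need to be reduced to features or embeddings, which this sketch does not cover.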

Data Scientist

Drive innovation with data-focused tools that uncover hidden patterns and complexities:

Metafeatures: Leverage metrics to analyze data complexity and improve data-driven decisions based on metadata features.
Real-World Insights: Improve dataset sampling and strengthen Balance, Completeness, and Coverage.
Error Analysis: Gain actionable feedback to refine datasets and improve performance.
Complete Shift Analysis: Quantify impactful changes in data due to Covariate Shift, Label Shift, and Concept Drift before they impact your model.

📖 Learn more about tools for Data Scientists.
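
To make Label Shift concrete, the following sketch measures how far the class proportions in operational data have moved from those in training data, using total variation distance. It is a minimal example under stated assumptions, not DataEval's implementation: the helper names `label_distribution` and `total_variation`, the toy labels, and the choice of distance are all hypothetical.

```python
from collections import Counter


def label_distribution(labels, classes):
    """Fraction of samples in each class (hypothetical helper)."""
    counts = Counter(labels)
    total = len(labels)
    return {c: counts[c] / total for c in classes}


def total_variation(p, q):
    """Total variation distance between two distributions over the same classes.

    0.0 means identical class proportions; 1.0 means no overlap at all.
    """
    return 0.5 * sum(abs(p[c] - q[c]) for c in p)


# Toy labels (assumed data): balanced at training time, skewed in operation.
train_labels = ["cat"] * 50 + ["dog"] * 50
ops_labels = ["cat"] * 80 + ["dog"] * 20

classes = sorted(set(train_labels) | set(ops_labels))
p_train = label_distribution(train_labels, classes)
p_ops = label_distribution(ops_labels, classes)

tv = total_variation(p_train, p_ops)
print(f"label shift (total variation): {tv:.2f}")
```

The same per-class proportions also give a quick view of class balance within a single dataset. Covariate Shift and Concept Drift involve the inputs and the input-label relationship, and need other techniques.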

Acknowledgement

CDAO Funding Acknowledgement

This material is based upon work supported by the Chief Digital and Artificial Intelligence Office under Contract No. W519TC-23-9-2033. The views and conclusions contained herein are those of the author(s) and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government.