Home#

Welcome to DataEval’s Documentation#

DataEval curates datasets to train and test performant, robust, unbiased, and reliable AI models, and it monitors for data shifts that impact the performance of deployed models.


Our Mission#

DataEval is an effective, powerful, and reliable set of tools for any test and evaluation (T&E) engineer. Throughout all stages of the machine learning lifecycle, DataEval supports model development, data analysis, and monitoring with state-of-the-art algorithms to help you solve difficult problems. With a focus on computer vision tasks, DataEval provides simple but effective metrics for performance estimation, bias detection, and dataset linting.

DataEval is easy to install, supports a wide range of Python versions, and is compatible with many of the most popular packages in the scientific and T&E communities. DataEval also has native interoperability with JATIC’s suite of tools when using MAITE-compliant datasets and models.

Key Features#

DataEval provides many powerful tools to assist in the following T&E tasks:

  • Model-agnostic metrics that bound real-world performance

    • relevance/completeness/coverage

    • metafeatures (data complexity)

  • Model-specific metrics that guide model selection and training

    • dataset sufficiency

    • data/model complexity mismatch

  • Metrics for post-deployment monitoring of data with bounds on model performance to guide retraining

    • dataset-shift metrics

    • model performance bounds under covariate shift

    • guidance on sampling to assess model error and model retraining


Before You Begin#

Jump in and try out DataEval for yourself.

Installation Guide Link

Learn what drives us here at ARiA, our role in AI T&E, and why we created DataEval.

About DataEval Page Link

Quick overview detailing each algorithm’s functionality and requirements.

DataEval Algorithm Overview Chart

Get Started#

We are proud of our tools, so we highlighted some simple but powerful functionality that you can try yourself!

Estimate the maximum performance achievable on your dataset.

Bayes Error Rate Estimation Tutorial
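To give a feel for what a Bayes error rate estimate involves, here is a minimal NumPy sketch. It is not DataEval's API: the function name `one_nn_ber_bounds` is hypothetical, and the approach shown is the classic Cover-Hart result, which bounds the Bayes error using the leave-one-out 1-nearest-neighbor error. The bounds hold asymptotically, so treat the numbers as rough on small datasets.

```python
import numpy as np


def one_nn_ber_bounds(features, labels):
    """Return (lower, upper) bounds on the Bayes error rate.

    Hypothetical helper for illustration only, not part of DataEval.
    Uses the leave-one-out 1-NN error and the Cover-Hart inequality.
    """
    x = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    n_classes = len(np.unique(y))
    if n_classes < 2:
        raise ValueError("need at least two classes")

    # Pairwise squared Euclidean distances (O(n^2 * d) memory, so this
    # sketch only suits small datasets). Mask the diagonal so a point
    # cannot be its own nearest neighbor.
    diffs = x[:, None, :] - x[None, :, :]
    dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    np.fill_diagonal(dists, np.inf)

    # Leave-one-out 1-NN error: fraction of points whose nearest
    # neighbor carries a different label.
    nearest = dists.argmin(axis=1)
    nn_error = float(np.mean(y[nearest] != y))

    # Cover-Hart: R* <= R_1NN <= R* (2 - M/(M-1) R*), solved for R*.
    scale = n_classes / (n_classes - 1)
    lower = (1.0 - np.sqrt(max(0.0, 1.0 - scale * nn_error))) / scale
    return float(lower), nn_error
```

Calling `one_nn_ber_bounds(embeddings, labels)` on image embeddings rather than raw pixels is usually the more meaningful choice, since nearest-neighbor distances in pixel space are a poor proxy for class similarity.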

Estimate your model’s performance based on the size of your dataset.

Dataset Sufficiency Analysis for Classification Tutorial
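As a rough illustration of the idea behind sufficiency analysis, the sketch below fits an inverse power law to a measured learning curve and extrapolates it to a larger dataset size. This is an assumption-laden example, not DataEval's implementation: `fit_power_law` and `project_error` are hypothetical names, the model `error ≈ a * n**(-b)` ignores any irreducible error floor, and the fit is done by ordinary least squares in log-log space.

```python
import numpy as np


def fit_power_law(sizes, errors):
    """Fit error ~= a * n**(-b) by least squares in log-log space.

    Hypothetical helper for illustration only, not part of DataEval.
    Requires strictly positive sizes and errors.
    """
    log_n = np.log(np.asarray(sizes, dtype=float))
    log_e = np.log(np.asarray(errors, dtype=float))
    # log(error) = log(a) - b * log(n), so the slope is -b.
    slope, intercept = np.polyfit(log_n, log_e, 1)
    return float(np.exp(intercept)), float(-slope)


def project_error(a, b, n):
    """Extrapolate the fitted curve to a dataset of size n."""
    return a * n ** (-b)


# Example learning curve: error measured at four training-set sizes.
sizes = [100, 200, 400, 800]
errors = [0.20, 0.15, 0.11, 0.08]
a, b = fit_power_law(sizes, errors)
projected = project_error(a, b, 5000)
```

Extrapolating far beyond the largest measured size is unreliable, so projections like `projected` are best read as a planning aid rather than a guarantee.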

Stay Practical#

These handcrafted guides, created by experts here at ARiA, will get you up and running while improving your day-to-day work life.

Not sure where to begin?

Try out these guides to learn the ins and outs of AI T&E using DataEval.

Tutorial Page Link

Already know what you’re looking for?

Check out these curated guides to see how DataEval can improve your workflows.

How To Page Link

Be Theoretical#

Dive deep into the concepts that DataEval is built upon to enhance your skill set.

Need to understand the theory behind the math that makes DataEval so powerful?

Click through these focused guides on the research, implementation, and tradeoffs behind DataEval.

Concept Page Link

Want in-depth understanding with no-code explanations?

Read these role-specific guides for the data analysis tasks you will see in your daily work.

Workflows Page Link

Get Technical#

Everything you need to become an expert with DataEval.

Looking for a specific function or class?

Find all the technical details needed to understand the DataEval Ecosystem.

Reference Page Link

Looking for a definition?

Find the word in the glossary.

Glossary Page Link

Contributing#

DataEval is open-source software, and anyone is welcome to request features, fix bugs, or reach out for help.

Follow our contributing guide to get started!

Changelog#

DataEval’s development changelog.

Attribution#

Alibi-Detect#

This project uses code from the Alibi-Detect Python library developed by SeldonIO.
Additional documentation from their developers is available on the Alibi-Detect documentation page.

CDAO Funding Acknowledgement#

This material is based upon work supported by the Chief Digital and Artificial Intelligence Office under Contract No. W519TC-23-9-2033. The views and conclusions contained herein are those of the author(s) and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government.