DataEval Change Log
v0.69.4
Miscellaneous
7bca6ed4- Unified all MNIST and MNIST corrupt datasets to a single internal MNIST class
66ad1c92- new drift detector: multivariate domain classifier
v0.69.3
Miscellaneous
6745e39d- Document: Class Label Statistical Independence and Coverage Documentation
1f7689ac- Adding bias tutorial (parity-balance-diversity)
v0.69.2
Miscellaneous
f7d5bac3- Adds stats for bounding boxes
18be58a3- Adding label stats
809d1d7a- Always produce p-val and distance metrics for drift
5cd7c205- Improving imagestats and channelstats functions
b379d44c- Add dataset splitting features
80b68a73- Use regex to replace markdown links
1d99455a- Tag LKG at the correct commit SHA
ad0e368b- Always run tasks
v0.69.1
Miscellaneous
d9068a2c- Fix release and changelog script
v0.69.0
Miscellaneous
63ab70d7- Remove automatic update of documentation notebooks
v0.68.0
Feature Release
47b48e14- Allow Duplicates and Outliers detectors to take in multiple StatsOutput objects
Miscellaneous
65d8f3de- Combine classwise bias metric outputs with non-classwise
ccfd72ef- Adding clustering/coverage tutorial
6d09d710- Add CONTRIBUTING.md
72387d9c- Updated version replacement script to include cache files
5285f01b- Prototype Performance Estimation
3ae16116- concept pages for balance and diversity, rescale Simpson diversity
3e16a905- Switching documentation themes to the pydata theme
v0.67.0
Feature Release
a0b04800- Refactor DataEval functions and classes and update documentation
Changes DataEval functions and classes to be more hierarchical in modules:
  detectors
    drift (DriftCVM, DriftKS, DriftMMD, DriftUncertainty)
    linters (Clusterer, Duplicates, Outliers)
    ood (OOD_AE, OOD_AEGMM, OOD_LLR, OOD_VAE, OOD_VAEGMM)
  flags (ImageStat)
  metrics
    bias (balance, coverage, diversity, parity)
    estimators (ber, divergence, uap)
    stats (imagestats, channelstats)
  workflows (Sufficiency)
Backends have been moved from models to tensorflow and torch
Renamed the following classes:
  Linter -> Outliers
  parity -> label_parity
  parity_metadata -> parity
  DriftOutput -> DriftBaseOutput
  DriftUnivariateOutput -> DriftOutput
Miscellaneous fixes:
  Documentation updated
  Streamlined optional import checks in the __init__.py tree
  Fixed misspelling in glossary
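The renames above chain into each other (parity_metadata becomes parity, while the old parity becomes label_parity), so applying them one after another with plain search-and-replace can rename the same identifier twice. A minimal sketch of a single-pass rewrite is shown here, assuming whole-word matching is acceptable for the code being migrated. `apply_renames` and `RENAMES` are hypothetical names for illustration and are not part of DataEval.

```python
import re

# Old name -> new name, copied from the rename list in this release.
RENAMES = {
    "Linter": "Outliers",
    "parity": "label_parity",
    "parity_metadata": "parity",
    "DriftOutput": "DriftBaseOutput",
    "DriftUnivariateOutput": "DriftOutput",
}

# One combined pattern, longest names first, matched on word boundaries.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(name) for name in sorted(RENAMES, key=len, reverse=True))
    + r")\b"
)


def apply_renames(source: str) -> str:
    """Rewrite old names to new ones in a single pass over the text."""
    # Each match is replaced exactly once, so a freshly written "parity"
    # is never picked up again and turned into "label_parity".
    return _PATTERN.sub(lambda match: RENAMES[match.group(0)], source)
```

Whole-word matching will also rewrite unrelated identifiers that happen to be called parity or Linter, so the result is best reviewed as a diff rather than applied blindly.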
Fixes
84aae760- balance test cleanup
Miscellaneous
6d09d710- Add CONTRIBUTING.md
72387d9c- Updated version replacement script to include cache files
5285f01b- Prototype Performance Estimation
3ae16116- concept pages for balance and diversity, rescale Simpson diversity
3e16a905- Switching documentation themes to the pydata theme
d50d9cd1- Update Landing Page
2fd7fa59- Author drift detection tutorial
49b5af42- Use uv instead of pyenv for python deployment
0f6eb6b0- Pin notebooks on release to specific version
4f101a4e- Adjust imagestats and channelstats reference guides to new format
0ee82ede- Only build data image in main pipeline
7b84ceb5- Improve test coverage
d3c5258a- Add StatsOutput as input type for linter and duplicates
cf73393a- Updates drift reference guides and concept page
4ce5cdf7- Adjust model reference guides to new format
17195a2b- Adjust parity reference guides to new format
e9761b4d- Adjust out of distribution reference guides to new format
eaf707a7- Adjust uap reference guide to new format
335ac3be- Adjust sufficiency reference guide to new format
3a866f01- Change Optional[Type] to Type | None per 3.10+ standards
v0.66.0
Feature Release
a0b04800- Refactor DataEval functions and classes and update documentation
Changes DataEval functions and classes to be more hierarchical in modules:
  detectors
    drift (DriftCVM, DriftKS, DriftMMD, DriftUncertainty)
    linters (Clusterer, Duplicates, Outliers)
    ood (OOD_AE, OOD_AEGMM, OOD_LLR, OOD_VAE, OOD_VAEGMM)
  flags (ImageStat)
  metrics
    bias (balance, coverage, diversity, parity)
    estimators (ber, divergence, uap)
    stats (imagestats, channelstats)
  workflows (Sufficiency)
Backends have been moved from models to tensorflow and torch
Renamed the following classes:
  Linter -> Outliers
  parity -> label_parity
  parity_metadata -> parity
  DriftOutput -> DriftBaseOutput
  DriftUnivariateOutput -> DriftOutput
Miscellaneous fixes:
  Documentation updated
  Streamlined optional import checks in the __init__.py tree
  Fixed misspelling in glossary
Improvements and Enhancements
5f730baa- Refactor ImageStats and ChannelStats as metric functions
Fixes
84aae760- balance test cleanup
3ebd278c- handle float-type categorical variables in balance metric
066b7153- Fixes modzscore to account for division by 0
Miscellaneous
d50d9cd1- Update Landing Page
2fd7fa59- Author drift detection tutorial
49b5af42- Use uv instead of pyenv for python deployment
0f6eb6b0- Pin notebooks on release to specific version
4f101a4e- Adjust imagestats and channelstats reference guides to new format
0ee82ede- Only build data image in main pipeline
7b84ceb5- Improve test coverage
d3c5258a- Add StatsOutput as input type for linter and duplicates
cf73393a- Updates drift reference guides and concept page
4ce5cdf7- Adjust model reference guides to new format
17195a2b- Adjust parity reference guides to new format
e9761b4d- Adjust out of distribution reference guides to new format
eaf707a7- Adjust uap reference guide to new format
335ac3be- Adjust sufficiency reference guide to new format
3a866f01- Change Optional[Type] to Type | None per 3.10+ standards
fe1e292d- Use output dataclass with metadata
b3f6a027- Unify handling of image reshaping
v0.65.0
Improvements and Enhancements
5f730baa- Refactor ImageStats and ChannelStats as metric functions
Fixes
3ebd278c- handle float-type categorical variables in balance metric
066b7153- Fixes modzscore to account for division by 0
Miscellaneous
fe1e292d- Use output dataclass with metadata
b3f6a027- Unify handling of image reshaping
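The modzscore fix in this release (066b7153) concerns a division by zero. As background, the commonly used modified z-score divides by the median absolute deviation (MAD), which is zero whenever more than half of the values are identical. The sketch below illustrates that failure mode and one possible guard. It is not DataEval's implementation: the function name and the choice to return zeros when the MAD is zero are assumptions made for illustration.

```python
from statistics import median
from typing import List, Sequence


def modified_zscores(values: Sequence[float]) -> List[float]:
    """Modified z-score: 0.6745 * (x - median) / MAD, guarded against MAD == 0."""
    if not values:
        return []
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # Assumed fallback for this sketch: report no deviation at all
        # instead of dividing by zero.
        return [0.0 for _ in values]
    return [0.6745 * (v - center) / mad for v in values]
```

Returning zeros hides a lone outlier in otherwise constant data, so a real implementation may prefer a different fallback, such as a scale based on the mean absolute deviation.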
v0.64.0
Feature Release
bea0446c- Torch Dataset Reader
Improvements and Enhancements
eda88822- Refactor metrics
Miscellaneous
a4b8e919- Created new documentation issue templates
1028d082- Remove is_arraylike function
dbcecec6- Refactored read_dataset to handle common dataset returns
61b1f854- Updated Workflow Landing Page
cf96c7f2- Run doctest in CI pipeline
ecfcf89b- Adjusted notebooks to work on google colab and added environment requirements
5f863782- Update remaining metric output to NamedTuple
e58f4dba- Add metadata parity documentation
6319a1d4- Adding Duplicates concept
787545f5- Adding ImageStats and ChannelStats concept document
7826405c- Update Data Cleaning concept
50047116- Change to Semantic Versioning
9e43399c- Bayes Error Rate - explanation documentation
266ad738- Updated BER docstrings with NDArray, shapes, and examples
v0.63.0
Improvements and Enhancements
3225cf18- Convert remaining metrics and detectors to ArrayLike
5d88b82a- Add Torch and Tensorflow interop through ArrayLike protocol and to_numpy converter
d3342275- Refactor linter and duplicates to call evaluate with data
65d5aaa8- Refactor metrics to call evaluate with data
v0.61.0
Improvements and Enhancements
cd59debb- Release DataEval v0.61.0!
DAML is now officially rebranded as DataEval! New name, same great camel flavor.
v0.56.0
Feature Release
64416675- Update clusterer class and documentation
Clusterer detector released
This class assists in exploratory data analysis of unlabeled data by identifying duplicates and outliers. Additional information on usage is available in our documentation.
v0.55.0
Feature Release
278b4dc1- Release Linter, Duplicates, ImageStats, ChannelStats and Parity
Linter, Duplicates detectors and ImageStats, ChannelStats, and Parity metrics are now released. The existing metrics available have also been moved into different modules (detectors and workflows) that better reflect their functionality.
  detectors
    Drift detectors: DriftCVM, DriftKS, DriftMMD, DriftUncertainty and supporting classes
    Out-of-distribution detectors: OOD_AE, OOD_AEGMM, OOD_LLR, OOD_VAE, OOD_VAEGMM and supporting classes
    Linter
    Duplicates
  metrics
    BER
    Divergence
    Parity
    ImageStats
    ChannelStats
    UAP
  workflows
    Sufficiency
v0.54.0
Improvements and Enhancements
58263ac7- Move niter param to evaluate and calculate and retain curve coefficients in output dictionary
This change enhances the output of the Sufficiency metric to provide the coefficients for the learning curve by measure/class when running the metric. These parameters were previously recalculated on each call to project and plot. The parameters are provided as a Dict[str, np.ndarray] under the _CURVE_PARAMS_ key in the output dictionary.
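To illustrate why keeping the coefficients is useful, the sketch below reads stored parameters from an output dictionary and evaluates a curve without refitting. Several things here are assumptions and are not confirmed by this changelog: that the key is the literal string "_CURVE_PARAMS_", that each entry holds three coefficients, and that the curve is an inverse power law. Plain tuples stand in for np.ndarray, and the values are made up.

```python
from typing import Dict, Sequence

# Assumption: the key is this literal string.
CURVE_PARAMS_KEY = "_CURVE_PARAMS_"


def project(params: Sequence[float], n_samples: int) -> float:
    """Evaluate an assumed inverse power law a * n**(-b) + c at n_samples."""
    a, b, c = params
    return a * n_samples ** (-b) + c


# Made-up output dictionary shaped like the description above:
# one coefficient vector per measure.
output: Dict[str, Dict[str, Sequence[float]]] = {
    CURVE_PARAMS_KEY: {"error": (0.5, 0.5, 0.1)},
}

# Reuse the stored coefficients for several projections without refitting.
for measure, params in output[CURVE_PARAMS_KEY].items():
    for n in (100, 400):
        print(measure, n, project(params, n))
```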
v0.53.0
Feature Release
322fc830- Add parameter k to BER estimator for KNN to enable k>1 for better consistency with ground truth in certain cases
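For readers unfamiliar with the role of k, the sketch below computes a leave-one-out k-nearest-neighbor error rate, the kind of quantity that kNN-based Bayes error estimates are built on. It is an illustration only: `knn_loo_error` is a hypothetical name, the majority-vote rule is an assumption, and the bound formulas DataEval applies on top of this error are not reproduced here.

```python
from collections import Counter
from math import dist
from typing import Hashable, Sequence


def knn_loo_error(
    points: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
    k: int = 1,
) -> float:
    """Fraction of points misclassified by a majority vote of their k nearest neighbors."""
    n = len(points)
    if len(labels) != n:
        raise ValueError("points and labels must have the same length")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    errors = 0
    for i in range(n):
        # Leave point i out and rank every other point by Euclidean distance.
        neighbors = sorted((dist(points[i], points[j]), j) for j in range(n) if j != i)
        votes = Counter(labels[j] for _, j in neighbors[:k])
        predicted = votes.most_common(1)[0][0]
        errors += predicted != labels[i]
    return errors / n
```

Larger k averages over more neighbors, which tends to make the vote less sensitive to a single mislabeled or unusually placed point.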
v0.52.0
Improvements and Enhancements
07b12ac2- Fully integrate outlier detection
Outlier Detection API has been changed. Additional details are available in our documentation.
v0.51.0
Feature Release
2ed88a07- Implement Drift Detection Metrics
This change adds 4 types of Drift Detection metrics which allow for the detection of potential drift in the dataset.
  Kolmogorov-Smirnov
  Cramér-von Mises
  Maximum Mean Discrepancy
  Classifier Uncertainty
The conceptual source is derived from Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift and the implementation is derived from Alibi-Detect v0.11.4.
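As a rough illustration of the first method in the list, the two-sample Kolmogorov-Smirnov statistic is the largest gap between the empirical distribution functions of a reference sample and a test sample for a single feature. The sketch below computes only that statistic. It is not the Alibi-Detect or DataEval implementation, `ks_statistic` is a hypothetical name, and the p-value and multi-feature correction a drift detector would normally add are left out.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], test: Sequence[float]) -> float:
    """Two-sample KS statistic: the maximum gap between the two empirical CDFs."""
    if not reference or not test:
        raise ValueError("both samples must be non-empty")
    ref_sorted = sorted(reference)
    test_sorted = sorted(test)
    n_ref, n_test = len(ref_sorted), len(test_sorted)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every value seen in either sample.
    for value in ref_sorted + test_sorted:
        cdf_ref = bisect_right(ref_sorted, value) / n_ref
        cdf_test = bisect_right(test_sorted, value) / n_test
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_test))
    return largest_gap
```

A statistic near 0 means the two samples look alike on that feature, while a value near 1 means they barely overlap.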
v0.45.0
Deprecations and Removals
5cc48bec- Divergence metric naming corrected to HP Divergence
The Divergence metric now returns a dictionary of { "divergence": float, "error": int } instead of { "dpdivergence": float, "error": int }. Code, documentation and tutorials have been updated to the correct nomenclature of HP (Henze-Penrose) divergence.
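Code that has to read results produced both before and after this rename could use a small shim along the following lines. This is a sketch only: `read_divergence` is a hypothetical helper that is not part of DataEval, and the sample dictionaries hold made-up values that mirror the two shapes described above.

```python
from typing import Any, Mapping


def read_divergence(output: Mapping[str, Any]) -> float:
    """Return the divergence value from either the new or the legacy output shape."""
    if "divergence" in output:  # key used from v0.45.0 onward
        return float(output["divergence"])
    if "dpdivergence" in output:  # legacy key used before v0.45.0
        return float(output["dpdivergence"])
    raise KeyError("output has neither 'divergence' nor 'dpdivergence'")


# Made-up examples of the two shapes.
new_style = {"divergence": 0.25, "error": 3}
old_style = {"dpdivergence": 0.25, "error": 3}
```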
v0.44.6
Feature Release
41b20d3a- Add rules for release label pipeline workflow and merge request release template
Improvements and Enhancements
7ee53c9c- Update Divergence default to MST
v0.44.2
Improvements and Enhancements
1468aa5c- Switch to markdown and updated docs
v0.43.0
Improvements and Enhancements
670a0db5- Add support for classwise Sufficiency metrics
b96ee099- Have sufficiency train and eval functions take indices and batch size instead of a DataLoader
v0.42.2
Improvements and Enhancements
5225c491- Change output classes to dictionaries
45040682- Make Sufficiency a stateful class and revise SufficiencyOutput
7c5fdcff- Pass method as a parameter to determine metric algorithm to use
2e883f6d- Add better optimizer to find global minimum
c3c78680- Expose AETrainer to public API to use model multiple times after training
Fixes
93564b95- Updating pyproject.toml and lock file to set dependency less than numpy 2.0
v0.42.0
Improvements and Enhancements
601cfae8- Sufficiency Plotting of Multiple Metrics during one run
3d68a6f1- Add parameter to plot function for optional file output
Deprecations and Removals
a6ce3e72- Remove UAP_MST metric
v0.40.2
Improvements and Enhancements
f3eddaed- Flavor 2 - Remove models from metrics entirely
v0.40.1
Deprecations and Removals
db888bb7- Remove usage of DamlDataset for ARiA metrics
v0.38.1
Improvements and Enhancements
42617f43- Enable GPU functionality in pytorch features
v0.38.0
Feature Release
c9b5116e- ARiA Autoencoder as PyTorch Model
Improvements and Enhancements
8fe97232- Add export_model functionality and improve test coverage
42cc77ea- Add empirical upper bound to UAP metric output
Fixes
636dfdaf- update project with version metadata
v0.36.1
Feature Release
7d1a599f- Implement the uap class
v0.36.0
Improvements and Enhancements
0799523b- Object detection model training
v0.29.0
Feature Release
166df3b0- Implement Dataset Sufficiency Metric
Improvements and Enhancements
5c4e6e06- Use convolutional autoencoder for BER and Divergence metrics
Fixes
c78e5502- Sufficiency typecheck bugfix
v0.28.5
Improvements and Enhancements
9d1c354c- Add fit_dataset, format_dataset to DpDivergence & BER
v0.28.4
Fixes
c39e009e- Fix typecheck issues found with pyright-1.1.333
v0.26.13
Feature Release
949e09bd- Add kNN BER implementation
v0.26.10
Improvements and Enhancements
dab0a8ff- Handle MST edge cases
v0.26.4
Improvements and Enhancements
bf31996f- BER lower bound capability
v0.25.11
Improvements and Enhancements
dfe0bddb- Add support for python 3.11
v0.25.4
Improvements and Enhancements
2ca285cc- update BER metric to return a dataclass instead of dict
v0.25.3
Fixes
67f08b27- Fix: Alibi-detect-models-have-fixed-architecture-shapes
v0.25.2
Improvements and Enhancements
db4adaff- 69 convert metric output dictionary to dataclass
v0.24.8
Feature Release
79614577- Implement Multiclass MST version of BER
v0.24.6
Feature Release
2ad9fed5- Implement BER estimate
v0.23.1
Feature Release
99d2fd22- Implement outlier detection metrics using the alibi-detect VAE method
v0.23.0
Feature Release
85eb2c1f- Implement outlier detection metrics using the alibi-detect auto-encoder method