Drift CVM

Drift refers to the phenomenon where the statistical properties of the data change over time. It occurs when the underlying distribution of the input features or the target variable (what the model is trying to predict) shifts, leading to a discrepancy between the training data and the real-world data the model encounters during deployment.

Drawing on the methods examined in the NeurIPS 2019 paper Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift, we can apply several techniques to determine whether drift has occurred. For high-dimensional data, we typically want to reduce the dimensionality before running tests against the dataset. To do so, we provide Untrained AutoEncoders (UAE) and Black-Box Shift Estimation (BBSE) predictors, the latter based on the classifier's softmax outputs, as out-of-the-box preprocessing methods. Principal Component Analysis can also be easily implemented using scikit-learn. Preprocessing methods which do not rely on the classifier will usually pick up drift in the input data, while BBSE focuses on label shift.
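As a rough illustration of the PCA option, the sketch below fits scikit-learn's PCA on reference data and wraps its transform in a function with the shape of a preprocessing callable. The synthetic arrays, the choice of 8 components, and the name preprocess_fn as used here are assumptions made for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-ins for real data: 64 input features per instance.
rng = np.random.default_rng(0)
x_ref = rng.normal(size=(200, 64))
x_test = rng.normal(size=(50, 64))

# Fit the projection on the reference data only, so that reference and
# test batches are mapped into the same low-dimensional space.
pca = PCA(n_components=8).fit(x_ref)


def preprocess_fn(x):
    """Project a batch onto the principal components of the reference data."""
    return pca.transform(np.asarray(x))


reduced = preprocess_fn(x_test)
print(reduced.shape)
```

Fitting on the reference data alone is a deliberate choice in this sketch: refitting on each test batch would change the projection and make the feature-wise comparisons inconsistent.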

How-To Guides

Check out this how-to to begin using the Drift Detection class.

Drift Detection Tutorial

DataEval API

Cramér-von Mises (CVM)

The CVM drift detector is a non-parametric drift detector, which applies feature-wise two-sample Cramér-von Mises (CVM) tests. For two empirical distributions \(F(z)\) and \(F_{ref}(z)\), the CVM test statistic is defined as

\[ W = \sum_{z\in k} \left| F(z) - F_{ref}(z) \right|^2 \]

where \(k\) is the joint sample. The CVM test is an alternative to the Kolmogorov-Smirnov (K-S) two-sample test, which uses the maximum distance between two empirical distributions \(F(z)\) and \(F_{ref}(z)\). By using the full joint sample, the CVM can exhibit greater power against shifts in higher moments, such as variance changes.
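To make the difference between the two statistics concrete, here is a minimal pure-Python sketch that evaluates both empirical distribution functions at every point of the joint sample, then takes the sum of squared gaps (the statistic W as written above) and the maximum gap (the K-S statistic). The helper names ecdf and cvm_and_ks are hypothetical, and library implementations may normalize W differently.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function z -> fraction of `sample` that is <= z."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda z: bisect_right(ordered, z) / n


def cvm_and_ks(x, x_ref):
    """Return (W, D): the sum of squared ECDF gaps and the largest ECDF gap,
    both evaluated over the joint sample of x and x_ref."""
    f, f_ref = ecdf(x), ecdf(x_ref)
    joint = list(x) + list(x_ref)
    gaps = [abs(f(z) - f_ref(z)) for z in joint]
    return sum(g ** 2 for g in gaps), max(gaps)


x_ref = [1, 2, 3, 4]
w_same, d_same = cvm_and_ks([1, 2, 3, 4], x_ref)   # identical samples
w_shift, d_shift = cvm_and_ks([5, 6, 7, 8], x_ref)  # fully separated samples
print(w_same, d_same, w_shift, d_shift)
```

The K-S statistic depends only on the single largest gap, while W accumulates every gap across the joint sample.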

For multivariate data, the detector applies a separate CVM test to each feature, and the p-values obtained for each feature are aggregated via either the Bonferroni or the False Discovery Rate (FDR) correction. The Bonferroni correction is more conservative and controls the probability of at least one false positive. The FDR correction, on the other hand, allows an expected fraction of false positives to occur. As with other univariate detectors such as the Kolmogorov-Smirnov detector, for high-dimensional data we typically want to reduce the dimensionality before computing the feature-wise univariate CVM tests and aggregating them via the chosen correction method.
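The two aggregation rules can be sketched as follows. This is an illustrative implementation, not the library's code: the function names are hypothetical, and the FDR variant assumes the Benjamini-Hochberg step-up procedure, which the text above does not name explicitly.

```python
def bonferroni(p_vals, p_val=0.05):
    """Flag drift if any feature's p-value falls below p_val / n_features."""
    threshold = p_val / len(p_vals)
    return min(p_vals) < threshold, threshold


def fdr(p_vals, q_val=0.05):
    """Benjamini-Hochberg (assumed variant): compare the i-th smallest
    p-value with q_val * i / n and use the largest rank that passes."""
    n = len(p_vals)
    ordered = sorted(p_vals)
    thresholds = [q_val * i / n for i in range(1, n + 1)]
    passing = [i for i, (p, t) in enumerate(zip(ordered, thresholds)) if p < t]
    if not passing:
        return False, thresholds[0]
    return True, thresholds[passing[-1]]


# Four feature-level p-values where the two corrections disagree.
p_vals = [0.02, 0.02, 0.03, 0.5]
bonf_drift, bonf_threshold = bonferroni(p_vals)
fdr_drift, fdr_threshold = fdr(p_vals)
print(bonf_drift, bonf_threshold, fdr_drift, fdr_threshold)
```

In this example no single p-value is below 0.05 / 4 = 0.0125, so Bonferroni reports no drift, while the FDR rule flags drift because several moderately small p-values occur together.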

Cramér-von Mises (CVM) data drift detector, which tests for any change in the distribution of continuous univariate data. For multivariate data, a separate CVM test is applied to each feature, and the obtained p-values are aggregated via the Bonferroni or False Discovery Rate (FDR) corrections.

Parameters:

x_ref (ArrayLike) – Data used as reference distribution.
p_val (float, default 0.05) – p-value used for significance of the statistical test for each feature. If the FDR correction method is used, this corresponds to the acceptable q-value.
x_ref_preprocessed (bool, default False) – Whether the given reference data x_ref has already been preprocessed. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
update_x_ref (Optional[UpdateStrategy], default None) – Reference data can optionally be updated using an UpdateStrategy class. Update using the last n instances seen by the detector with dataeval.detectors.LastSeenUpdateStrategy or via reservoir sampling with dataeval.detectors.ReservoirSamplingUpdateStrategy.
preprocess_fn (Optional[Callable[[ArrayLike], ArrayLike]], default None) – Function to preprocess the data before computing the data drift metrics. Typically a dimensionality reduction technique.
correction (Literal["bonferroni", "fdr"], default "bonferroni") – Correction type for multivariate data. Either ‘bonferroni’ or ‘fdr’ (False Discovery Rate).
n_features – Number of features used in the statistical test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.
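The two reference update options described for update_x_ref can be illustrated with a short pure-Python sketch. These functions are hypothetical stand-ins written for this example, not the library's UpdateStrategy classes; the reservoir variant assumes the classic Algorithm R, and the seen counter is an assumed piece of bookkeeping.

```python
import random


def last_seen_update(x_ref, x, n):
    """Keep only the n most recent instances from reference plus new batch."""
    return (list(x_ref) + list(x))[-n:]


def reservoir_update(x_ref, x, n, seen, rng=random):
    """Reservoir sampling (Algorithm R): every instance observed so far has
    an equal chance of being in the size-n reference set.

    `seen` is the number of instances observed before this batch."""
    reservoir = list(x_ref)
    for item in x:
        seen += 1
        if len(reservoir) < n:
            reservoir.append(item)
        else:
            j = rng.randrange(seen)  # uniform index in [0, seen)
            if j < n:
                reservoir[j] = item
    return reservoir, seen


recent = last_seen_update([1, 2, 3], [4, 5], n=3)
sampled, seen = reservoir_update([1, 2, 3], [4, 5, 6, 7], n=3, seen=3,
                                 rng=random.Random(0))
print(recent, len(sampled), seen)
```

Keeping the last n instances lets the reference track gradual change, whereas reservoir sampling keeps a uniform sample of everything seen, so older data remains represented.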

Predict whether a batch of data has drifted from the reference data and update reference data using specified update strategy.

Parameters: x (ArrayLike) – Batch of instances.
Returns: Dictionary containing the drift prediction and optionally the feature level p-values, threshold after multivariate correction if needed and test statistics.

Performs the two-sample Cramér-von Mises test(s), computing the p-value and test statistic per feature.

Parameters: x (ArrayLike) – Batch of instances.
Returns: Feature level p-values and CVM statistics.
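For readers who want to see what a feature-wise scoring step could look like, the following sketch uses SciPy's cramervonmises_2samp on each column of two synthetic arrays. It is an approximation of the idea under stated assumptions, not the detector's implementation: the function name score and the synthetic data are chosen for illustration, and SciPy normalizes its statistic differently from the unnormalized sum of squared ECDF gaps.

```python
import numpy as np
from scipy.stats import cramervonmises_2samp


def score(x_ref, x):
    """Run a two-sample CVM test per feature; return (p_values, statistics)."""
    x_ref = np.asarray(x_ref, dtype=float)
    x = np.asarray(x, dtype=float)
    # Flatten everything after the batch axis into a feature axis.
    x_ref = x_ref.reshape(x_ref.shape[0], -1)
    x = x.reshape(x.shape[0], -1)
    n_features = x_ref.shape[1]
    p_vals = np.empty(n_features)
    stats = np.empty(n_features)
    for f in range(n_features):
        result = cramervonmises_2samp(x_ref[:, f], x[:, f])
        p_vals[f] = result.pvalue
        stats[f] = result.statistic
    return p_vals, stats


rng = np.random.default_rng(0)
x_ref = rng.normal(size=(200, 3))
x = rng.normal(size=(100, 3))
x[:, 0] += 3.0  # shift the first feature only

p_vals, stats = score(x_ref, x)
print(p_vals, stats)
```

The resulting p-values would then be passed through the chosen multivariate correction to produce a single drift decision.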