dataeval.detectors.drift.DriftCVM
class dataeval.detectors.drift.DriftCVM(x_ref, p_val=0.05, x_ref_preprocessed=False, update_x_ref=None, preprocess_fn=None, correction='bonferroni', n_features=None)
Drift detector employing the Cramér-von Mises (CVM) drift detection test.
The CVM test detects changes in the distribution of continuous univariate data. For multivariate data, a separate CVM test is applied to each feature, and the obtained p-values are aggregated via the Bonferroni or False Discovery Rate (FDR) corrections.
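The aggregation step described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper names `bonferroni_drift` and `fdr_drift` and the per-feature p-values are hypothetical, and the FDR variant assumes the Benjamini-Hochberg procedure.

```python
from typing import Sequence


def bonferroni_drift(p_values: Sequence[float], p_val: float = 0.05) -> bool:
    """Flag drift if any per-feature p-value is below p_val / n_features."""
    threshold = p_val / len(p_values)
    return any(p < threshold for p in p_values)


def fdr_drift(p_values: Sequence[float], q_val: float = 0.05) -> bool:
    """Benjamini-Hochberg (assumed): flag drift if any sorted p-value p_(i)
    is below q_val * i / n, where i is its 1-based rank."""
    n = len(p_values)
    ranked = sorted(p_values)
    return any(p < q_val * rank / n for rank, p in enumerate(ranked, start=1))


# Hypothetical p-values, one per feature, e.g. from four univariate CVM tests.
feature_p_values = [0.02, 0.02, 0.03, 0.40]

print(bonferroni_drift(feature_p_values))  # False: no p-value is below 0.05 / 4
print(fdr_drift(feature_p_values))         # True: second-ranked 0.02 is below 0.05 * 2 / 4
```

The example is chosen so that the two corrections disagree: Bonferroni is the more conservative of the two, so several moderately small p-values can trigger FDR without triggering Bonferroni.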
- Parameters:
- x_ref : ArrayLike
Data used as reference distribution.
- p_val : float | None, default 0.05
Significance threshold (p-value) for the statistical test applied to each feature. If the FDR correction method is used, this corresponds to the acceptable q-value.
- x_ref_preprocessed : bool, default False
Whether the given reference data x_ref has already been preprocessed. If True, only the test data x will be preprocessed at prediction time. If False, the reference data will also be preprocessed.
- update_x_ref : UpdateStrategy | None, default None
Reference data can optionally be updated using an UpdateStrategy class. Update using the last n instances seen by the detector with LastSeenUpdateStrategy or via reservoir sampling with ReservoirSamplingUpdateStrategy.
- preprocess_fn : Callable | None, default None
Function to preprocess the data before computing the data drift metrics. Typically a dimensionality reduction technique.
- correction : "bonferroni" | "fdr", default "bonferroni"
Correction type for multivariate data. Either "bonferroni" or "fdr" (False Discovery Rate).
- n_features : int | None, default None
Number of features used in the statistical test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.
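The two reference-update approaches named under update_x_ref can be illustrated with a small standalone sketch. It does not call LastSeenUpdateStrategy or ReservoirSamplingUpdateStrategy; the functions `last_seen_update` and `reservoir_update` are hypothetical stand-ins, and the reservoir variant assumes the classic "Algorithm R" scheme, which may differ from what the library does internally.

```python
import random
from typing import List, Sequence, Tuple


def last_seen_update(x_ref: Sequence, x_new: Sequence, n: int) -> List:
    """Keep only the n most recent instances across reference and new data."""
    return (list(x_ref) + list(x_new))[-n:]


def reservoir_update(
    x_ref: Sequence, x_new: Sequence, n: int, count: int, rng: random.Random
) -> Tuple[List, int]:
    """Reservoir sampling (Algorithm R, assumed).

    `count` is the number of instances seen before this batch. Each new
    instance replaces a random reservoir slot with probability n / count,
    so every instance seen so far is equally likely to be retained.
    """
    reservoir = list(x_ref)
    for item in x_new:
        count += 1
        if len(reservoir) < n:
            reservoir.append(item)
        else:
            j = rng.randrange(count)  # uniform over all instances seen so far
            if j < n:
                reservoir[j] = item
    return reservoir, count


rng = random.Random(0)  # seeded only to make the sketch reproducible
ref = list(range(5))    # hypothetical reference instances 0..4
batch = [5, 6, 7]       # hypothetical new instances

print(last_seen_update(ref, batch, n=5))  # [3, 4, 5, 6, 7]
ref_res, seen = reservoir_update(ref, batch, n=5, count=len(ref), rng=rng)
print(len(ref_res), seen)  # 5 8
```

The last-seen approach tracks the most recent data and so adapts quickly to gradual change, while reservoir sampling keeps a uniform sample of everything seen, which preserves older instances in the reference set.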