Drift Uncertainty

Drift refers to the phenomenon where the statistical properties of the data change over time. It occurs when the underlying distribution of the input features or the target variable (what the model is trying to predict) shifts, leading to a discrepancy between the training data and the real-world data the model encounters during deployment.

Following the NeurIPS 2019 paper Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift, we can use several methods to test whether drift has occurred. For high-dimensional data, we typically want to reduce the dimensionality before running statistical tests on the dataset. To do so, we provide two out-of-the-box preprocessing methods: Untrained AutoEncoders (UAE) and Black-Box Shift Estimation (BBSE) predictors, the latter using the classifier’s softmax outputs. Principal Component Analysis can also be easily implemented using scikit-learn. Preprocessing methods that do not rely on the classifier will usually pick up drift in the input data, while BBSE focuses on label shift.
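The reduce-then-test workflow described above can be sketched with scikit-learn and SciPy. This is a minimal illustration and not the DataEval implementation: the function name pca_ks_drift, its parameters, and the returned keys are hypothetical, and the choice of one two-sample K-S test per principal component with a Bonferroni-corrected threshold is an assumption modeled on the univariate testing approach examined in the paper.

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.decomposition import PCA


def pca_ks_drift(x_ref, x_test, n_components=8, p_val=0.05):
    """Hypothetical helper: PCA reduction followed by per-component K-S tests."""
    # Flatten each instance so images or other arrays become feature vectors.
    x_ref = np.asarray(x_ref, dtype=float).reshape(len(x_ref), -1)
    x_test = np.asarray(x_test, dtype=float).reshape(len(x_test), -1)

    # Fit the projection on the reference data only, then apply it to both sets.
    pca = PCA(n_components=n_components).fit(x_ref)
    z_ref, z_test = pca.transform(x_ref), pca.transform(x_test)

    # One two-sample K-S test per reduced dimension.
    p_values = np.array(
        [ks_2samp(z_ref[:, i], z_test[:, i]).pvalue for i in range(n_components)]
    )
    threshold = p_val / n_components  # Bonferroni correction for multiple tests
    return {
        "is_drift": bool((p_values < threshold).any()),
        "p_values": p_values,
        "threshold": threshold,
    }


# Synthetic data: eight high-variance features dominate the principal components.
rng = np.random.default_rng(0)
scale = np.ones(64)
scale[:8] = 5.0
x_ref = rng.normal(size=(500, 64)) * scale
x_shifted = rng.normal(size=(500, 64)) * scale
x_shifted[:, :8] += 5.0  # shift those features by one standard deviation

result = pca_ks_drift(x_ref, x_shifted)
print(result["is_drift"], result["threshold"])
```

Fitting the projection on the reference set alone keeps the test data from influencing the representation it is later compared in.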

How-To Guides

Check out this how-to guide to begin using the Drift Detection class

Drift Detection Tutorial

DataEval API

Classifier Uncertainty

The classifier uncertainty drift detector aims to directly detect drift that is likely to affect the performance of a model of interest. The approach is to test for a change in the number of instances falling into regions of the input space where the model is uncertain in its predictions. For each instance in the reference set, the detector obtains the model’s prediction and an associated measure of uncertainty. The same is done for the test set, and if significant differences in uncertainty are detected (via a Kolmogorov-Smirnov test), drift is flagged. The detector’s reference set should be disjoint from the model’s training set, on which the model’s confidence may be higher.
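The procedure above can be sketched in a few lines of NumPy and SciPy, using prediction entropy as the uncertainty measure. This is an illustration under stated assumptions rather than the DataEval implementation: prediction_entropy, uncertainty_drift, toy_model, and the returned dictionary keys are all hypothetical names.

```python
import numpy as np
from scipy.special import softmax
from scipy.stats import entropy, ks_2samp


def prediction_entropy(preds, preds_type="logits"):
    """Shannon entropy of each row of class predictions."""
    preds = np.asarray(preds, dtype=float)
    # Logits are mapped to probabilities first; probabilities are used as given.
    probs = softmax(preds, axis=-1) if preds_type == "logits" else preds
    return entropy(probs, axis=-1)


def uncertainty_drift(model, x_ref, x_test, p_val=0.05, preds_type="logits"):
    """Two-sample K-S test on reference versus test prediction entropies."""
    h_ref = prediction_entropy(model(x_ref), preds_type)
    h_test = prediction_entropy(model(x_test), preds_type)
    stat, p = ks_2samp(h_ref, h_test)
    return {"is_drift": bool(p < p_val), "p_value": float(p), "distance": float(stat)}


# Hypothetical two-class linear classifier on 1-D inputs; returns logits.
w = np.array([[3.0, -3.0]])


def toy_model(x):
    return x @ w  # shape (n, 2)


rng = np.random.default_rng(0)
signs = rng.choice([-1.0, 1.0], size=(300, 1))
x_ref = signs * 3.0 + rng.normal(scale=0.5, size=(300, 1))  # far from the boundary
x_test = rng.normal(scale=0.5, size=(300, 1))  # clustered around the boundary

result = uncertainty_drift(toy_model, x_ref, x_test)
print(result)
```

In this toy setup the test instances sit near the decision boundary, so their entropies are much higher than those of the reference set and the K-S test rejects the hypothesis that both come from the same distribution.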

Test for a change in the number of instances falling into regions on which the model is uncertain. Performs a K-S test on prediction entropies.

Parameters:

x_ref (ArrayLike) – Data used as reference distribution. Should be disjoint from the data the model was trained on for accurate p-values.
model (Callable) – Classification model outputting class probabilities (or logits)
p_val (float, default 0.05) – p-value used for the significance of the test.
x_ref_preprocessed (bool, default False) – Whether the given reference data x_ref has been preprocessed yet. If x_ref_preprocessed=True, only the test data x will be preprocessed at prediction time. If x_ref_preprocessed=False, the reference data will also be preprocessed.
update_x_ref (Optional[UpdateStrategy], default None) – Reference data can optionally be updated using an UpdateStrategy class. Update using the last n instances seen by the detector with dataeval.detectors.LastSeenUpdateStrategy or via reservoir sampling with dataeval.detectors.ReservoirSamplingUpdateStrategy.
preds_type (Literal["probs", "logits"], default "logits") – Type of prediction output by the model. Options are ‘probs’ (in [0,1]) or ‘logits’ (in [-inf,inf]).
batch_size (int, default 32) – Batch size used to evaluate model. Only relevant when backend has been specified for batch prediction.
preprocess_batch_fn (Optional[Callable], default None) – Optional batch preprocessing function, for example to convert a list of objects into a batch that can be processed by the model.
device (Optional[str], default None) – Device type used. The default None tries to use the GPU and falls back on CPU if needed. Can be specified by passing either ‘cuda’, ‘gpu’ or ‘cpu’.
input_shape (Optional[tuple], default None) – Shape of input data.
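The two reference-update ideas named under update_x_ref, keeping the last n instances seen and reservoir sampling, can be sketched generically as follows. This uses only the Python standard library and is not the code behind LastSeenUpdateStrategy or ReservoirSamplingUpdateStrategy; the names last_seen_update and ReservoirSampler are hypothetical.

```python
import random


def last_seen_update(x_ref, x_new, n):
    """Keep only the n most recent instances across old and new data."""
    return (list(x_ref) + list(x_new))[-n:]


class ReservoirSampler:
    """Algorithm R: a uniform random sample of size n over everything seen."""

    def __init__(self, x_ref, n, seed=None):
        self.n = n
        self.count = 0  # total number of instances seen so far
        self.reservoir = []
        self.rng = random.Random(seed)
        self.update(x_ref)

    def update(self, x_new):
        for item in x_new:
            self.count += 1
            if len(self.reservoir) < self.n:
                # Fill the reservoir until it holds n instances.
                self.reservoir.append(item)
            else:
                # Keep the new item with probability n / count.
                j = self.rng.randrange(self.count)
                if j < self.n:
                    self.reservoir[j] = item
        return self.reservoir


reference = list(range(10))
recent = last_seen_update(reference, range(10, 15), n=10)
print(recent)

sampler = ReservoirSampler(reference, n=10, seed=0)
sample = sampler.update(range(10, 1000))
print(sorted(sample))
```

The last-seen approach tracks the most recent data only, while reservoir sampling keeps every instance seen so far equally likely to remain in the reference set.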

Predict whether a batch of data has drifted from the reference data.

Parameters:

x (ArrayLike) – Batch of instances.

Return type:

Dictionary containing the drift prediction, p-value, and threshold statistics.