dataeval.detectors.drift.DriftUncertainty

class dataeval.detectors.drift.DriftUncertainty(x_ref, model, p_val=0.05, x_ref_preprocessed=False, update_x_ref=None, preds_type='probs', batch_size=32, preprocess_batch_fn=None, device=None)

Test for a change in the number of instances falling into regions on which the model is uncertain.

Performs a Kolmogorov-Smirnov (K-S) test on the entropies of the model's predictions.
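As a rough illustration of the idea, the sketch below computes the Shannon entropy of each prediction and then the two-sample K-S statistic between reference and test entropies. It uses only the standard library. The helper names (`softmax`, `prediction_entropy`, `ks_statistic`) are hypothetical and are not part of dataeval; the detector's internal implementation may differ.

```python
import math
from bisect import bisect_right


def softmax(logits):
    """Convert one row of logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_entropy(preds, preds_type="probs"):
    """Shannon entropy (in nats) of each row of class predictions."""
    entropies = []
    for row in preds:
        probs = softmax(row) if preds_type == "logits" else row
        # Skip zero probabilities: p * log(p) tends to 0 as p tends to 0.
        entropies.append(-sum(p * math.log(p) for p in probs if p > 0))
    return entropies


def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in a + b
    )


# Illustrative data (assumed): confident reference predictions,
# near-uniform (uncertain) test predictions.
ref_probs = [[0.98, 0.02], [0.95, 0.05], [0.99, 0.01], [0.97, 0.03]]
test_probs = [[0.55, 0.45], [0.60, 0.40], [0.50, 0.50], [0.52, 0.48]]

stat = ks_statistic(prediction_entropy(ref_probs), prediction_entropy(test_probs))
print(stat)  # 1.0, the two entropy samples do not overlap at all
```

A statistic near 0 means the two entropy distributions look alike; a statistic near 1 means the model is markedly more (or less) uncertain on the test data than on the reference data.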

Parameters:
x_ref : ArrayLike

Data used as reference distribution.

model : Callable

Classification model outputting class probabilities (or logits).

p_val : float, default 0.05

P-value threshold used for the significance of the test.

x_ref_preprocessed : bool, default False

Whether the given reference data x_ref has already been preprocessed. If True, only the test data x will be preprocessed at prediction time. If False, the reference data will also be preprocessed.

update_x_ref : UpdateStrategy | None, default None

Reference data can optionally be updated using an UpdateStrategy class. Update using the last n instances seen by the detector with LastSeenUpdateStrategy or via reservoir sampling with ReservoirSamplingUpdateStrategy.

preds_type : "probs" | "logits", default "probs"

Type of prediction output by the model. Options are ‘probs’ (in [0,1]) or ‘logits’ (in [-inf,inf]).

batch_size : int, default 32

Batch size used to evaluate the model. Only relevant when backend has been specified for batch prediction.

preprocess_batch_fn : Callable | None, default None

Optional batch preprocessing function, for example to convert a list of objects into a batch that the model can process.

device : str | None, default None

Device type used. The default None tries to use the GPU and falls back on CPU if needed. Can be specified by passing either ‘cuda’ or ‘cpu’.
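The two reference-update approaches mentioned under update_x_ref can be sketched as follows. This is a minimal, standard-library illustration of the general techniques ("keep the last n instances" and classic reservoir sampling); the function names and signatures are hypothetical and do not reflect how LastSeenUpdateStrategy or ReservoirSamplingUpdateStrategy are implemented or called.

```python
import random


def last_seen_update(x_ref, x_new, n):
    """Keep only the n most recent instances."""
    return (list(x_ref) + list(x_new))[-n:]


def reservoir_sampling_update(x_ref, x_new, n, n_seen, rng=random):
    """Keep a uniform random sample of size n over every instance seen so far.

    n_seen is the number of instances observed before this call.
    Returns the updated reservoir and the updated count.
    """
    reservoir = list(x_ref)
    for item in x_new:
        n_seen += 1
        if len(reservoir) < n:
            # Reservoir not yet full: always keep the new instance.
            reservoir.append(item)
        else:
            # Replace a random slot with probability n / n_seen.
            j = rng.randrange(n_seen)
            if j < n:
                reservoir[j] = item
    return reservoir, n_seen


# Illustrative usage with assumed toy data.
print(last_seen_update([1, 2, 3], [4, 5], n=3))  # [3, 4, 5]
sample, seen = reservoir_sampling_update(
    [1, 2, 3], [4, 5, 6, 7], n=3, n_seen=3, rng=random.Random(0)
)
```

The last-seen approach makes the reference track recent data closely, while reservoir sampling keeps the reference representative of everything observed so far.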

Example

>>> model = ClassificationModel()
>>> drift = DriftUncertainty(x_ref, model=model, batch_size=20)

Verify reference images have not drifted

>>> drift.predict(x_ref.copy()).drifted
False

Test incoming images for drift

>>> drift.predict(x_test).drifted
True

predict(x)

Predict whether a batch of data has drifted from the reference data.

Parameters:
x : ArrayLike

Batch of instances.

Returns:

Dictionary containing the drift prediction, p-value, and threshold statistics.

Return type:

DriftUnivariateOutput
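To show how a p-value and a threshold can lead to a drift decision, the sketch below converts a two-sample K-S statistic into an approximate p-value and compares it with p_val. It assumes drift is flagged when the p-value falls below p_val, and it uses the standard asymptotic Kolmogorov series with a small-sample correction, which is an approximation. The helpers `ks_pvalue` and `is_drift` are hypothetical and are not part of dataeval.

```python
import math


def ks_pvalue(d, n, m, terms=100):
    """Approximate two-sided p-value for a two-sample K-S statistic d.

    n and m are the sizes of the reference and test samples.
    """
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.05:
        # The series converges too slowly here; the p-value is effectively 1.
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


def is_drift(d, n, m, p_val=0.05):
    """Flag drift when the approximate p-value falls below p_val (assumed rule)."""
    return ks_pvalue(d, n, m) < p_val


print(is_drift(1.0, 50, 50))  # True, the samples are fully separated
print(is_drift(0.1, 50, 50))  # False, the gap is within sampling noise
```

Lowering p_val makes the decision more conservative: a larger gap between the entropy distributions is then needed before drift is reported.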