dataeval.detectors.drift.DriftUncertainty

class dataeval.detectors.drift.DriftUncertainty(data, model, p_val=0.05, update_strategy=None, correction='bonferroni', preds_type='probs', batch_size=32, transforms=None, device=None)

Test for a change in the number of instances falling into regions on which the model is uncertain.

Performs a Kolmogorov-Smirnov (K-S) test on the entropies of the model's predictions.
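The idea behind the test can be illustrated outside the library. The following is a minimal sketch, not the detector's actual implementation: it assumes Shannon entropy over class probabilities and uses `scipy.stats.ks_2samp` for the two-sample K-S test. The helper `prediction_entropy` and the synthetic Dirichlet-sampled predictions are hypothetical and exist only for illustration.

```python
import numpy as np
from scipy.stats import ks_2samp


def prediction_entropy(probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Shannon entropy of each row of class probabilities (n_samples x n_classes)."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -(probs * np.log(probs)).sum(axis=1)


rng = np.random.default_rng(0)

# Hypothetical reference predictions: peaked probabilities (a confident model).
ref_probs = rng.dirichlet([10.0, 0.5, 0.5], size=500)
# Hypothetical test predictions: near-uniform probabilities (an uncertain model).
test_probs = rng.dirichlet([5.0, 5.0, 5.0], size=500)

ref_entropy = prediction_entropy(ref_probs)
test_entropy = prediction_entropy(test_probs)

# Two-sample K-S test comparing the two entropy distributions.
result = ks_2samp(ref_entropy, test_entropy)
p_val = 0.05
drifted = bool(result.pvalue < p_val)
```

Because entropy reduces each prediction to a single number, the comparison is univariate: drift is flagged when the entropy distribution of the incoming data differs significantly from that of the reference data.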

Parameters:

data : Array: Data used as the reference distribution.
model : Callable: Classification model outputting class probabilities (or logits).
p_val : float, default 0.05: P-value threshold used to determine the significance of the test.
update_strategy : UpdateStrategy or None, default None: Optional strategy for updating the reference data. Use LastSeenUpdateStrategy to keep the last n instances seen by the detector, or ReservoirSamplingUpdateStrategy to update via reservoir sampling.
correction : "bonferroni" or "fdr", default "bonferroni": Correction type for multivariate data. Either ‘bonferroni’ or ‘fdr’ (False Discovery Rate).
preds_type : "probs" or "logits", default "probs": Type of prediction output by the model. Options are ‘probs’ (in [0,1]) or ‘logits’ (in [-inf,inf]).
batch_size : int, default 32: Batch size used to evaluate the model. Only relevant when a backend has been specified for batch prediction.
transforms : Transform, Sequence[Transform] or None, default None: Transform(s) to apply to the data.
device : DeviceLike or None, default None: Device type used. The default None tries to use the GPU and falls back on the CPU if needed. Can be specified by passing either ‘cuda’ or ‘cpu’.
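The two reference-update approaches named under update_strategy can be sketched in plain NumPy. This is an illustration of the general techniques (a sliding window of the last n instances, and reservoir sampling via Algorithm R), not the code behind LastSeenUpdateStrategy or ReservoirSamplingUpdateStrategy. The function names and signatures below are hypothetical.

```python
import numpy as np


def last_seen_update(x_ref: np.ndarray, x_new: np.ndarray, n: int) -> np.ndarray:
    """Hypothetical helper: keep only the n most recently seen instances."""
    return np.concatenate([x_ref, x_new], axis=0)[-n:]


def reservoir_update(x_ref, x_new, n, n_seen, rng):
    """Hypothetical helper: reservoir sampling (Algorithm R).

    n_seen is the number of instances seen before this batch. Each instance
    seen so far ends up in the size-n reservoir with equal probability.
    Returns the updated reservoir and the updated count.
    """
    reservoir = x_ref.copy()
    for item in x_new:
        n_seen += 1
        if len(reservoir) < n:
            # Reservoir not yet full: always keep the new instance.
            reservoir = np.concatenate([reservoir, item[None]], axis=0)
        else:
            # Replace a random slot with probability n / n_seen.
            j = rng.integers(0, n_seen)  # uniform over [0, n_seen)
            if j < n:
                reservoir[j] = item
    return reservoir, n_seen


rng = np.random.default_rng(0)
x_ref = np.arange(10, dtype=np.float32).reshape(5, 2)
x_new = np.arange(10, 16, dtype=np.float32).reshape(3, 2)

recent = last_seen_update(x_ref, x_new, n=5)
sampled, n_seen = reservoir_update(x_ref, x_new, n=5, n_seen=len(x_ref), rng=rng)
```

The sliding window tracks the most recent data and so adapts quickly, while reservoir sampling keeps a uniform sample of everything seen so far and so changes more slowly.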

Example

>>> model = ClassificationModel()
>>> drift = DriftUncertainty(x_ref, model=model, batch_size=20)

Verify reference images have not drifted

>>> drift.predict(x_ref.copy()).drifted
False

Test incoming images for drift

>>> drift.predict(x_test).drifted
True

predict(x)

Predict whether a batch of data has drifted from the reference data.

Parameters:

x : Array: Batch of data to test for drift against the reference data.

Returns:

Dictionary containing the drift prediction, p-value, and threshold statistics.

Return type:

DriftUnvariateOutput
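To make the relationship between the drift prediction, p-value, and threshold concrete, here is a rough sketch of how a decision could be derived from per-feature p-values under the two correction options. It assumes a standard Bonferroni correction and the Benjamini-Hochberg procedure for ‘fdr’. The function `drift_decision` is hypothetical and may differ from what the library does internally.

```python
import numpy as np


def drift_decision(p_vals, p_val=0.05, correction="bonferroni"):
    """Hypothetical helper: return (drifted, threshold) from per-feature p-values."""
    p_vals = np.asarray(p_vals, dtype=float)
    n = p_vals.size
    if correction == "bonferroni":
        # Divide the significance level by the number of tests.
        threshold = p_val / n
        return bool((p_vals < threshold).any()), float(threshold)
    if correction == "fdr":
        # Benjamini-Hochberg: compare the i-th smallest p-value with p_val * i / n.
        p_sorted = np.sort(p_vals)
        thresholds = p_val * np.arange(1, n + 1) / n
        below = np.nonzero(p_sorted < thresholds)[0]
        if below.size == 0:
            return False, float(thresholds.min())
        return True, float(thresholds[below.max()])
    raise ValueError(f"Unknown correction: {correction!r}")


# Single feature (such as prediction entropy): the threshold equals p_val.
single = drift_decision([0.01])

# Several features: Bonferroni is stricter than FDR on the same p-values.
bonferroni_drift, _ = drift_decision([0.02, 0.03, 0.6], correction="bonferroni")
fdr_drift, _ = drift_decision([0.02, 0.03, 0.6], correction="fdr")
```

With a single test statistic, both corrections reduce to comparing the p-value directly against p_val.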

property x_ref : numpy.typing.NDArray[numpy.float32]

Retrieve the reference data of the drift detector.