DriftUncertainty

class dataeval.detectors.drift.DriftUncertainty(x_ref: ArrayLike, model: Callable, p_val: float = 0.05, x_ref_preprocessed: bool = False, update_x_ref: UpdateStrategy | None = None, preds_type: Literal['probs', 'logits'] = 'probs', batch_size: int = 32, preprocess_batch_fn: Callable | None = None, device: str | None = None)

Test for a change in the number of instances falling into regions on which the model is uncertain.

Performs a two-sample Kolmogorov-Smirnov (K-S) test on the entropies of the model's predictions for the reference data and the test data.
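The idea of testing prediction entropies can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `softmax`, `prediction_entropy`, `ks_two_sample`, and `fake_predictions` are hypothetical helpers, the model outputs are simulated random logits, and the p-value uses the standard asymptotic Kolmogorov series with a small-sample correction rather than an exact computation.

```python
import math
import random
from bisect import bisect_right


def softmax(logits):
    """Convert one row of logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_entropy(row, preds_type="probs"):
    """Shannon entropy (in nats) of one row of class predictions."""
    probs = softmax(row) if preds_type == "logits" else row
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ks_two_sample(a, b):
    """Two-sample K-S statistic D and an approximate (asymptotic) p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # D is the largest gap between the two empirical CDFs.
    d = max(abs(bisect_right(a, v) / n - bisect_right(b, v) / m) for v in a + b)
    n_eff = math.sqrt(n * m / (n + m))
    lam = (n_eff + 0.12 + 0.11 / n_eff) * d
    if lam < 0.3:
        # The series converges slowly here and the p-value is effectively 1.
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                 for k in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)


def fake_predictions(n, scale, rng):
    """Hypothetical stand-in for model outputs: n rows of 3-class logits."""
    return [[rng.gauss(0.0, scale) for _ in range(3)] for _ in range(n)]


rng = random.Random(0)
ref_logits = fake_predictions(200, 4.0, rng)   # mostly confident predictions
test_logits = fake_predictions(200, 0.5, rng)  # mostly uncertain predictions

h_ref = [prediction_entropy(r, "logits") for r in ref_logits]
h_test = [prediction_entropy(r, "logits") for r in test_logits]

d_same, p_same = ks_two_sample(h_ref, h_ref)     # identical samples
d_drift, p_drift = ks_two_sample(h_ref, h_test)  # shifted uncertainty
print("drift detected:", p_drift < 0.05)
```

Because the test compares one scalar (entropy) per instance, it reacts to changes in how uncertain the model is, not to changes in the raw inputs themselves.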

Parameters:
  • x_ref (ArrayLike) – Data used as reference distribution.

  • model (Callable) – Classification model that outputs class probabilities (or logits).

  • p_val (float, default 0.05) – p-value threshold used to decide whether the test result is significant.

  • x_ref_preprocessed (bool, default False) – Whether the given reference data x_ref has already been preprocessed. If True, only the test data x will be preprocessed at prediction time. If False, the reference data will also be preprocessed.

  • update_x_ref (UpdateStrategy | None, default None) – Reference data can optionally be updated using an UpdateStrategy class. Update using the last n instances seen by the detector with LastSeenUpdateStrategy or via reservoir sampling with ReservoirSamplingUpdateStrategy.

  • preds_type ("probs" | "logits", default "probs") – Type of prediction output by the model. Options are ‘probs’ (in [0,1]) or ‘logits’ (in [-inf,inf]).

  • batch_size (int, default 32) – Batch size used to evaluate model. Only relevant when backend has been specified for batch prediction.

  • preprocess_batch_fn (Callable | None, default None) – Optional batch preprocessing function. For example, it can convert a list of objects into a batch that the model can process.

  • device (str | None, default None) – Device type used. The default None tries to use the GPU and falls back on CPU if needed. Can be specified by passing either ‘cuda’, ‘gpu’ or ‘cpu’.
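The two reference-update policies named under update_x_ref can be illustrated with a short sketch. The functions below, `last_seen_update` and `reservoir_update`, are hypothetical helpers written for this illustration and are not the library's LastSeenUpdateStrategy or ReservoirSamplingUpdateStrategy classes; the reservoir variant assumes the classic "Algorithm R" scheme.

```python
import random


def last_seen_update(x_ref, x_new, n):
    """Keep only the n most recent instances."""
    return (list(x_ref) + list(x_new))[-n:]


def reservoir_update(x_ref, x_new, n, seen, rng):
    """Algorithm R: keep a uniform random sample of size n of all data seen.

    `seen` is the number of instances observed before this batch.
    Returns the new reservoir and the updated count.
    """
    reservoir = list(x_ref)
    for item in x_new:
        seen += 1
        if len(reservoir) < n:
            reservoir.append(item)
        else:
            j = rng.randrange(seen)  # uniform integer in [0, seen)
            if j < n:
                reservoir[j] = item
    return reservoir, seen


ref = list(range(5))
print(last_seen_update(ref, [5, 6, 7], n=5))  # [3, 4, 5, 6, 7]

res, seen = reservoir_update(ref, range(5, 100), n=5, seen=len(ref),
                             rng=random.Random(0))
print(len(res), seen)  # 5 100
```

A last-seen window tracks recent data closely, so gradual drift can be absorbed into the reference set; a reservoir keeps a sample representative of everything observed so far.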

predict(x: ArrayLike) → DriftOutput

Predict whether a batch of data has drifted from the reference data.

Parameters:

x (ArrayLike) – Batch of instances.

Returns:

Dictionary containing the drift prediction, p-value, and threshold statistics.

Return type:

DriftOutput
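How a drift prediction typically follows from a p-value and the p_val threshold can be sketched as below. `SketchDriftResult` and `decide` are hypothetical names for this illustration only; the field names are assumptions and do not describe the actual attributes of the returned output object.

```python
from dataclasses import dataclass


@dataclass
class SketchDriftResult:
    """Hypothetical stand-in for a drift result; not dataeval's output class."""
    is_drift: bool
    threshold: float
    p_val: float
    distance: float


def decide(p_value, distance, threshold=0.05):
    """Flag drift when the test's p-value falls below the threshold."""
    return SketchDriftResult(
        is_drift=p_value < threshold,
        threshold=threshold,
        p_val=p_value,
        distance=distance,
    )


print(decide(p_value=0.01, distance=0.40).is_drift)  # True
print(decide(p_value=0.20, distance=0.10).is_drift)  # False
```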