dataeval.detectors.ood.OOD_AE

class dataeval.detectors.ood.OOD_AE(model, device=None)

Autoencoder based out-of-distribution detector.

Parameters:
model : torch.nn.Module

An autoencoder model to use for encoding and reconstruction of images for detection of out-of-distribution samples.

device : str or torch.device or None, default None

The device to use for the detector. None will default to the global configuration selection if set, otherwise “cuda” then “cpu” by availability.

Example

Perform out-of-distribution detection on test data.

>>> from dataeval.detectors.ood import OOD_AE
>>> from dataeval.utils.torch.models import AE
>>> input_shape = train_images[0].shape
>>> ood = OOD_AE(AE(input_shape))

Train the autoencoder using the training data.

>>> ood.fit(train_images, threshold_perc=99, epochs=20)

Test for out-of-distribution samples on the test data.

>>> output = ood.predict(test_images)
>>> output.is_ood
array([ True,  True, False,  True,  True,  True,  True,  True])

fit(x_ref, threshold_perc, loss_fn=None, optimizer=None, epochs=20, batch_size=64, verbose=False)

Train the model and infer the threshold value.

Parameters:
x_ref : ArrayLike

Training data.

threshold_perc : float

Percentage of the reference data that is considered normal (in-distribution). The threshold value is inferred from this percentage.

loss_fn : Callable | None, default None

Loss function used for training.

optimizer : torch.optim.Optimizer | None, default None

Optimizer used for training.

epochs : int, default 20

Number of training epochs.

batch_size : int, default 64

Batch size used for training.

verbose : bool, default False

Whether to print training progress.

Return type:

None
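
The role of threshold_perc can be illustrated with a small NumPy sketch. It assumes, rather than documents, that the threshold is taken as the given percentile of the scores computed on the reference data. The `infer_threshold` helper and all score values are hypothetical and are not part of dataeval.

```python
import numpy as np


def infer_threshold(ref_scores: np.ndarray, threshold_perc: float) -> float:
    """Return the score below which `threshold_perc` percent of reference scores fall.

    Hypothetical helper for illustration only; not part of the dataeval API.
    """
    return float(np.percentile(ref_scores, threshold_perc))


# Made-up reconstruction-error scores for ten reference images.
ref_scores = np.array([0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.05, 0.05, 0.06, 0.50])

# Treat 90% of the reference data as normal (assumed percentile rule).
threshold = infer_threshold(ref_scores, threshold_perc=90)

# Under this assumption, samples scoring above the threshold are flagged.
new_scores = np.array([0.02, 0.30, 0.70])
is_ood = new_scores > threshold
```

Under this reading, a lower threshold_perc gives a stricter threshold and flags more samples.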

predict(X, batch_size=int(10000000000.0), ood_type='instance')

Predict whether instances are out-of-distribution.

Parameters:
X : ArrayLike

Input data for out-of-distribution prediction.

batch_size : int, default 1e10

Number of instances to process in each batch.

ood_type : "feature" | "instance", default "instance"

Whether to predict out-of-distribution at the "feature" or "instance" level.

Raises:

ValueError – If the X input data is not on the unit interval [0, 1].

Returns:

An object containing the outlier predictions for the selected level and the OOD scores for the data, including both "instance" and "feature" (if present) level scores.

Return type:

dataeval.outputs.OODOutput
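
Because predict raises a ValueError for data outside the unit interval, inputs stored as 8-bit integers need rescaling first. The following is a minimal sketch of one common way to do that. The `to_unit_interval` helper and the sample array are hypothetical, and the division by 255 assumes 8-bit image data.

```python
import numpy as np


def to_unit_interval(images: np.ndarray) -> np.ndarray:
    """Scale 8-bit image data to float32 values in [0, 1].

    Hypothetical helper for illustration only; not part of the dataeval API.
    """
    return images.astype(np.float32) / 255.0


# Made-up batch of two single-channel 2x2 images stored as 8-bit integers.
raw = np.array([[[[0, 128], [255, 64]]], [[[32, 16], [8, 4]]]], dtype=np.uint8)

scaled = to_unit_interval(raw)

# Check the range before handing the data to a detector.
in_range = bool(scaled.min() >= 0.0 and scaled.max() <= 1.0)
```

Data with a different native range, such as 16-bit images, would need a different divisor.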

score(X, batch_size=int(10000000000.0))

Compute the out-of-distribution scores for a given dataset.

Parameters:
X : ArrayLike

Input data to score.

batch_size : int, default 1e10

Number of instances to process in each batch. Use a smaller batch size if your dataset is large or if you encounter memory issues.

Raises:

ValueError – If the X input data is not on the unit interval [0, 1].

Returns:

An object containing the instance-level and feature-level OOD scores.

Return type:

OODScoreOutput
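
The difference between instance-level and feature-level scores can be sketched with plain NumPy. This sketch assumes the score is a squared reconstruction error, which the text above does not state, and it uses made-up inputs and reconstructions instead of a trained autoencoder.

```python
import numpy as np

# Made-up inputs and reconstructions for two single-channel 2x2 images in [0, 1].
x = np.array([[[[0.0, 0.5], [1.0, 0.5]]], [[[0.2, 0.2], [0.2, 0.2]]]])
x_recon = np.array([[[[0.0, 0.5], [1.0, 0.5]]], [[[0.2, 0.2], [0.2, 0.6]]]])

# Feature-level score (assumed): one squared error per pixel, same shape as x.
feature_score = (x - x_recon) ** 2

# Instance-level score (assumed): one value per image, averaged over its pixels.
instance_score = feature_score.reshape(len(x), -1).mean(axis=1)
```

Here the first image is reconstructed exactly and scores zero, while the second has a single badly reconstructed pixel that raises both its feature-level and instance-level scores.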