dataeval.detectors.ood.OOD_AE

class dataeval.detectors.ood.OOD_AE(model, device=None)

Autoencoder-based out-of-distribution detector.

Parameters:
model : torch.nn.Module

An autoencoder model to use for encoding and reconstruction of images for detection of out-of-distribution samples.

device : DeviceLike or None, default None

The hardware device to use. If None, the DataEval default or torch default is used.

Example

Perform out-of-distribution detection on test data.

>>> from dataeval.detectors.ood import OOD_AE
>>> from dataeval.utils.torch.models import AE
>>> input_shape = train_images[0].shape
>>> ood = OOD_AE(AE(input_shape))

Train the autoencoder using the training data.

>>> ood.fit(train_images, threshold_perc=99, epochs=20)

Test for out-of-distribution samples on the test data.

>>> output = ood.predict(test_images)
>>> output.is_ood
array([ True,  True, False,  True,  True,  True,  True,  True])

fit(x_ref, threshold_perc, loss_fn=None, optimizer=None, epochs=20, batch_size=64, verbose=False)

Train the model and infer the threshold value.

Parameters:
x_ref : ArrayLike

Training data.

threshold_perc : float

Percentage of the reference data considered normal (in-distribution); used to infer the threshold value.

loss_fn : Callable | None, default None

Loss function used for training.

optimizer : Optimizer | None, default None

Optimizer used for training.

epochs : int, default 20

Number of training epochs.

batch_size : int, default 64

Batch size used for training.

verbose : bool, default False

Whether to print training progress.

Return type:

None
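
The relationship between threshold_perc and the inferred threshold can be illustrated with a small sketch. It assumes the threshold is taken as a percentile of the scores computed on the reference data, which this page does not state explicitly. The names ref_scores and infer_threshold are hypothetical and are not part of the DataEval API.

```python
import numpy as np

# Hypothetical instance-level scores computed on the reference (training) data.
ref_scores = np.array(
    [0.010, 0.012, 0.015, 0.020, 0.022, 0.030, 0.045, 0.050, 0.080, 0.400]
)


def infer_threshold(scores, threshold_perc):
    """Hypothetical helper: score at the given percentile of the reference scores."""
    return float(np.percentile(scores, threshold_perc))


# With 90, the highest-scoring part of the reference data lies above the threshold.
threshold_90 = infer_threshold(ref_scores, 90.0)
flagged_90 = ref_scores > threshold_90

# With 100, the threshold equals the largest reference score, so no reference
# sample exceeds it.
threshold_100 = infer_threshold(ref_scores, 100.0)
flagged_100 = ref_scores > threshold_100
```

Under this assumption, a lower threshold_perc makes the detector stricter, because more of the reference data itself falls above the threshold.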

predict(X, batch_size=int(10000000000.0), ood_type='instance')

Predict whether instances are out of distribution or not.

Parameters:
X : ArrayLike

Input data for out-of-distribution prediction.

batch_size : int, default 1e10

Number of instances to process in each batch.

ood_type : "feature" | "instance", default "instance"

Predict out-of-distribution at the "feature" or "instance" level.

Raises:

ValueError – If the X input data is not on the unit interval [0, 1].

Returns:

Output containing the outlier predictions for the selected level and the OOD scores for the data, including both "instance" and "feature" (if present) level scores.

Return type:

dataeval.outputs.OODOutput
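
The difference between the "feature" and "instance" levels can be sketched with plain NumPy. The sketch assumes that feature-level scores are per-pixel squared reconstruction errors and that the instance-level score is their mean per image. This is a common choice for autoencoder detectors, but this page does not confirm it for OOD_AE. The reconstruction and the threshold below are stand-ins; no real model is involved.

```python
import numpy as np

# Hypothetical batch of 4 single-channel 8x8 images with values in [0, 1].
images = np.linspace(0.0, 1.0, 4 * 64).reshape(4, 1, 8, 8)

# Stand-in for an autoencoder's output: the first three images are
# reconstructed perfectly, the last one is reconstructed poorly.
recon = images.copy()
recon[3] = 1.0 - images[3]

# Feature level: one score per pixel (assumed: squared reconstruction error).
feature_score = (images - recon) ** 2

# Instance level: one score per image (assumed: mean of its feature scores).
instance_score = feature_score.reshape(len(images), -1).mean(axis=1)

threshold = 0.05  # hypothetical value; fit() would infer this from reference data
is_ood_instance = instance_score > threshold  # one flag per image
is_ood_feature = feature_score > threshold  # one flag per pixel
```

In this sketch, an instance-level prediction yields one boolean per image, while a feature-level prediction keeps the shape of the input and flags individual pixels.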

score(X, batch_size=int(10000000000.0))

Compute the out-of-distribution scores for a given dataset.

Parameters:
X : ArrayLike

Input data to score.

batch_size : int, default 1e10

Number of instances to process in each batch. Use a smaller batch size if your dataset is large or if you encounter memory issues.

Raises:

ValueError – If the X input data is not on the unit interval [0, 1].

Returns:

An object containing the instance-level and feature-level OOD scores.

Return type:

OODScoreOutput
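
Because input outside the unit interval raises a ValueError, image data usually needs rescaling before it is scored. The sketch below shows one way to do that for 8-bit images. The helper to_unit_interval is hypothetical and is not part of DataEval.

```python
import numpy as np


def to_unit_interval(images):
    """Hypothetical helper: return float32 data with values in [0, 1]."""
    arr = np.asarray(images)
    if arr.dtype == np.uint8:
        # 8-bit images: map 0..255 onto 0.0..1.0.
        return arr.astype(np.float32) / 255.0
    arr = arr.astype(np.float32)
    # Other dtypes are only checked, not rescaled, since the right scale is unknown.
    if arr.min() < 0.0 or arr.max() > 1.0:
        raise ValueError("Input data must be on the unit interval [0, 1].")
    return arr


raw = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = to_unit_interval(raw)
```

The helper rescales only 8-bit data and rejects anything else that is out of range, so an unexpected input scale surfaces as an error rather than being silently normalized.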