dataeval.detectors.ood.OOD_AE

class dataeval.detectors.ood.OOD_AE(model, device=None)

Autoencoder based out-of-distribution detector.

Parameters:

model : torch.nn.Module: An autoencoder model used to encode and reconstruct images for detecting out-of-distribution samples.
device : DeviceLike or None, default None: The hardware device to use. If None, the DataEval default or the torch default is used.

Example

Perform out-of-distribution detection on test data.

>>> from dataeval.detectors.ood import OOD_AE
>>> from dataeval.utils.torch.models import AE

>>> input_shape = train_images[0].shape
>>> ood = OOD_AE(AE(input_shape))

Train the autoencoder using the training data.

>>> ood.fit(train_images, threshold_perc=99, epochs=20)

Test for out-of-distribution samples on the test data.

>>> output = ood.predict(test_images)
>>> output.is_ood
array([ True,  True, False,  True,  True,  True,  True,  True])

fit(x_ref, threshold_perc, loss_fn=None, optimizer=None, epochs=20, batch_size=64, verbose=False)

Train the model and infer the threshold value.

Parameters:

x_ref : ArrayLike: Training data.
threshold_perc : float: Percentage of reference data that is considered normal (in-distribution).
loss_fn : Callable | None, default None: Loss function used for training.
optimizer : Optimizer | None, default None: Optimizer used for training.
epochs : int, default 20: Number of training epochs.
batch_size : int, default 64: Batch size used for training.
verbose : bool, default False: Whether to print training progress.

Return type:

None
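
As an illustration of how a threshold can be inferred from threshold_perc, the sketch below takes the given percentile of a set of reference scores and flags anything above it. It is a minimal sketch under the assumption that the threshold is a percentile of the reference-data scores; the helper percentile_threshold and the score values are hypothetical, and the library's internal computation may differ.

```python
import math


def percentile_threshold(scores, threshold_perc):
    """Return the value below which roughly `threshold_perc` percent of scores fall.

    Uses linear interpolation between the two nearest ranks.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 <= threshold_perc <= 100.0:
        raise ValueError("threshold_perc must be in [0, 100]")
    ordered = sorted(scores)
    # Fractional rank of the requested percentile in the sorted scores.
    rank = (len(ordered) - 1) * threshold_perc / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


# Hypothetical reconstruction-error scores for ten reference images.
reference_scores = [0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.05, 0.05, 0.06, 0.30]
threshold = percentile_threshold(reference_scores, threshold_perc=90)

# A new sample is flagged as OOD when its score exceeds the threshold.
is_ood = [score > threshold for score in [0.02, 0.50]]
```

With threshold_perc=100 the threshold equals the largest reference score, so under this assumption no reference sample would itself be flagged.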

predict(X, batch_size=int(10000000000.0), ood_type='instance')

Predict whether instances are out of distribution or not.

Parameters:

X : ArrayLike: Input data for out-of-distribution prediction.
batch_size : int, default int(1e10): Number of instances to process in each batch.
ood_type : "feature" | "instance", default "instance": Predict out-of-distribution at the ‘feature’ or ‘instance’ level.

Raises:

ValueError – If the X input data is not in the unit interval [0, 1].

Returns:

An object containing the outlier predictions for the selected level,
and the OOD scores for the data, including both ‘instance’ and ‘feature’ (if present) level scores.

Return type:

dataeval.outputs.OODOutput
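
Because input outside the unit interval raises a ValueError, raw pixel data usually needs rescaling before it is passed in. The sketch below shows one way to do that for 8-bit intensities using plain Python lists. The helpers to_unit_interval and in_unit_interval are hypothetical and not part of DataEval; with array inputs the same effect is typically achieved by dividing the array by its maximum possible value.

```python
def to_unit_interval(pixels, max_value=255):
    """Scale raw pixel intensities (e.g. 0-255) to floats in [0, 1]."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    scaled = [p / max_value for p in pixels]
    if any(v < 0.0 or v > 1.0 for v in scaled):
        raise ValueError("pixel values fall outside [0, max_value]")
    return scaled


def in_unit_interval(values):
    """Return True when every value lies in [0, 1]."""
    return all(0.0 <= v <= 1.0 for v in values)


raw = [0, 64, 128, 255]         # hypothetical 8-bit pixel values
scaled = to_unit_interval(raw)  # now within [0, 1]
```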

score(X, batch_size=int(10000000000.0))

Compute the out of distribution scores for a given dataset.

Parameters:

X : ArrayLike: Input data to score.
batch_size : int, default int(1e10): Number of instances to process in each batch. Use a smaller batch size if your dataset is large or if you encounter memory issues.

Raises:

ValueError – If the X input data is not in the unit interval [0, 1].

Returns:

An object containing the instance-level and feature-level OOD scores.

Return type:

OODScoreOutput
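
To make the distinction between feature-level and instance-level scores concrete, the sketch below computes both from an image and its reconstruction. It assumes the feature score is the squared reconstruction error of each value and the instance score is the mean of those errors, which is a common choice for autoencoder-based detectors but is not confirmed here as the library's exact formula. The functions feature_scores and instance_score and the sample data are hypothetical.

```python
def feature_scores(original, reconstruction):
    """Per-feature score: squared reconstruction error for each value."""
    if len(original) != len(reconstruction):
        raise ValueError("original and reconstruction must have the same length")
    return [(x - r) ** 2 for x, r in zip(original, reconstruction)]


def instance_score(original, reconstruction):
    """Instance score: mean of the per-feature scores."""
    per_feature = feature_scores(original, reconstruction)
    return sum(per_feature) / len(per_feature)


# Hypothetical flattened image and two reconstructions of it.
image = [0.0, 0.5, 1.0, 0.5]
good_recon = [0.0, 0.5, 1.0, 0.5]  # autoencoder reproduces the input exactly
poor_recon = [0.5, 0.5, 0.5, 0.5]  # autoencoder fails on the extreme values

good = instance_score(image, good_recon)
poor = instance_score(image, poor_recon)
```

A poorly reconstructed image receives a higher instance score, and its feature scores show which values contributed to it.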