Out-of-Distribution (OOD) Detection Tutorial

Problem Statement

For most computer vision tasks, such as image classification and object detection, out-of-distribution (OOD) detection can provide insight into operational drift or training problems. One way to identify OOD images is through the reconstruction error of an autoencoder: images that the autoencoder reconstructs poorly are likely to differ from the data it was trained on.

To help with this, DataEval provides OOD detectors that allow a user to identify such images.
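The reconstruction-error idea can be sketched without any deep learning library. The example below is only an illustration, not DataEval's implementation: it uses a linear projection (PCA computed with NumPy) as a stand-in for a trained autoencoder, and the toy data, `reconstruction_error`, and all other names are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "in-distribution" data: 200 samples lying near a 2-D subspace of a 10-D space.
basis = rng.normal(size=(2, 10))
train = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 10))

# Stand-in for a trained autoencoder: project onto the top-2 principal directions.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:2]  # shape (2, 10)


def reconstruction_error(x):
    """Mean squared error between each sample and its reconstruction."""
    encoded = (x - mean) @ components.T    # "encode" down to 2 dimensions
    decoded = encoded @ components + mean  # "decode" back to 10 dimensions
    return np.mean((x - decoded) ** 2, axis=1)


in_dist = rng.normal(size=(5, 2)) @ basis  # lies in the same subspace as the training data
out_dist = rng.normal(size=(5, 10))        # arbitrary directions, mostly off the subspace

print(reconstruction_error(in_dist).max())
print(reconstruction_error(out_dist).min())
```

Samples that resemble the training data are reconstructed almost perfectly, while samples from elsewhere are not. A detector turns that gap into an OOD decision by thresholding the error.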

When to use

Use the OOD_AE class and similar detectors when you want to find the individual images in a dataset that differ most from the rest of the provided set.

What you will need

  1. A training image dataset with the approximate percentage of known OOD images.

  2. A test image dataset to evaluate for OOD images.

  3. A Python environment with the following packages installed:

    • dataeval[tensorflow] or dataeval[all]

    • tensorflow-datasets

Setting up

Let’s import the libraries needed to set up a minimal working example.

import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

from dataeval.detectors import OOD_AE, OOD_VAEGMM
from dataeval.models.tensorflow import AE, VAEGMM, create_model

tf.random.set_seed(108)
tf.keras.utils.set_random_seed(408)

Load the data

We will use the MNIST dataset from TensorFlow Datasets for this tutorial.

# Load in the mnist dataset from tensorflow datasets
(images, ds_info) = tfds.load(
    "mnist",
    split="train[:2000]",
    with_info=True,
)  # type: ignore

images = images.shuffle(images.cardinality())
tfds.visualization.show_examples(images, ds_info)
images = np.array([i["image"] for i in images], dtype=np.float32) / 255.0
input_shape = images[0].shape
[Figure: sample images from the MNIST training subset]

Initialize the model

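Now, let’s look at how to use DataEval’s OOD detection methods.
We will use two detectors from our Alibi Detect provider: a simple autoencoder (OOD_AE) and, for comparison, a variational autoencoder with a Gaussian mixture model (OOD_VAEGMM).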

detectors = [
    OOD_AE(create_model(AE, input_shape)),
    OOD_VAEGMM(create_model(VAEGMM, input_shape)),
]

Train the model

Next we will train each detector on the dataset. Increasing the number of epochs can give better results. Setting threshold_perc=99 places the threshold so that the most extreme 1% of the training data is flagged as out-of-distribution.

for detector in detectors:
    print(f"Training {detector.__class__.__name__}...")
    detector.fit(images, threshold_perc=99, epochs=20, verbose=False)
Training OOD_AE...
Training OOD_VAEGMM...
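As a rough sketch of what a percentile threshold means (an illustration with NumPy and made-up scores, not DataEval's internal code), the threshold can be taken as the 99th percentile of the training scores, so that about 1% of the training images lie above it. The names `train_scores` and `threshold` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-image scores (for example, reconstruction errors) for 2000 training images.
train_scores = rng.exponential(scale=1.0, size=2000)

# Place the threshold at the 99th percentile of the training scores.
threshold_perc = 99
threshold = np.percentile(train_scores, threshold_perc)

# Anything scoring above the threshold is flagged as out-of-distribution.
is_ood = train_scores > threshold
print(is_ood.sum(), is_ood.mean())
```

With 2000 training scores, this flags the 20 highest-scoring images, which is 1% of the set.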

Test for OOD

We have trained our detectors on a dataset of digits.
What happens when we give them corrupted images of digits, here the translated variant of MNIST (which we expect to be “OOD”)?

corr_images, ds_info = tfds.load(
    "mnist_corrupted/translate",
    split="train[:2000]",
    with_info=True,
)  # type: ignore

corr_images = corr_images.shuffle(corr_images.cardinality())
tfds.visualization.show_examples(corr_images, ds_info)
corr_images = np.array([i["image"] for i in corr_images], dtype=np.float32) / 255.0
# corr_images = corr_images.ravel().reshape((corr_images.shape[0], -1))
print(corr_images.shape)
[Figure: sample images from the translated (corrupted) MNIST subset]
(2000, 28, 28, 1)
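To give a feel for what a translation does at the pixel level, here is a small NumPy sketch. It is an assumption-laden toy and does not reproduce how the mnist_corrupted dataset was generated: it shifts a synthetic 28x28 image with `np.roll` and compares it with the original.

```python
import numpy as np

# Toy 28x28 "digit": a vertical bar near the centre of the image.
image = np.zeros((28, 28), dtype=np.float32)
image[6:22, 13:15] = 1.0

# Shift the content 5 pixels down and 5 pixels right. np.roll wraps around,
# which is harmless here because the bar is far from the image borders.
shifted = np.roll(image, shift=(5, 5), axis=(0, 1))

# The amount of "ink" is unchanged, but the pixels no longer line up.
same_content = image.sum() == shifted.sum()
pixel_mse = np.mean((image - shifted) ** 2)

print(same_content)
print(pixel_mse)
```

The shifted image contains exactly the same stroke, yet its pixelwise difference from the original is far from zero. A model that has only seen centred digits may therefore treat shifted ones as unfamiliar.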

Now we evaluate both datasets using the trained detectors and report the fraction of images each one flags as OOD.

[(type(detector).__name__, np.mean(detector.predict(images)["is_ood"])) for detector in detectors]
[('OOD_AE', 0.01), ('OOD_VAEGMM', 0.0115)]
[(type(detector).__name__, np.mean(detector.predict(corr_images)["is_ood"])) for detector in detectors]
[('OOD_AE', 0.995), ('OOD_VAEGMM', 0.007)]
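The expressions above take the mean of a boolean array, which gives the fraction of flagged images. A minimal sketch with hypothetical values (the `scores` and `threshold` below are made up and do not come from a DataEval detector) shows the same computation, along with how the flagged images could be located individually.

```python
import numpy as np

# Hypothetical per-image scores and threshold, standing in for a detector's output.
scores = np.array([0.02, 0.91, 0.05, 0.03, 0.77, 0.04])
threshold = 0.5

is_ood = scores > threshold                    # boolean mask, one entry per image
fraction_flagged = is_ood.mean()               # True counts as 1, False as 0
flagged_indices = np.flatnonzero(is_ood)       # positions of the flagged images
most_extreme_first = np.argsort(scores)[::-1]  # image indices, highest score first

print(fraction_flagged)
print(flagged_indices)
print(most_extreme_first[:2])
```

Indexing the image array with `flagged_indices` would then pull out the flagged images for visual inspection.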

Results

We can see that the autoencoder-based OOD detector identified most of the translated images (99.5%) as outliers, while the VAEGMM-based detector flagged only 0.7% of them and was largely unaffected by the perturbation.

Depending on your needs, certain outlier detectors will work better under specific conditions.