TensorFlow Models

The TensorFlow models provided are tailored for use with the outlier detection metrics. DataEval provides basic default models through the utility function create_model, as well as constructors that allow customization of the encoder, decoder, and any other applicable layers used by the model.

How does it work?

The encoder is trained to create dense embeddings of the images, while the decoder is trained to reconstruct the original input image from its embedding. Outliers are detected by measuring how far a test image is from its reconstruction, using either the reconstruction distance or the difference between probability distributions.
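The reconstruction-distance idea can be illustrated with a small NumPy sketch. It is not DataEval's implementation: the helper names reconstruction_scores and flag_outliers, the mean-squared-error score, and the percentile threshold are all assumptions chosen for illustration, and the "reconstructions" are simulated rather than produced by a trained model.

```python
import numpy as np


def reconstruction_scores(images: np.ndarray, reconstructions: np.ndarray) -> np.ndarray:
    """Mean squared reconstruction error per image (higher means more unusual)."""
    diff = images.astype(np.float64) - reconstructions.astype(np.float64)
    # Average over every axis except the batch axis.
    return (diff ** 2).reshape(len(images), -1).mean(axis=1)


def flag_outliers(scores: np.ndarray, reference_scores: np.ndarray, percentile: float = 95.0) -> np.ndarray:
    """Flag scores above a percentile of the scores observed on reference data."""
    threshold = np.percentile(reference_scores, percentile)
    return scores > threshold


# Simulated data: reference images are reconstructed almost perfectly.
rng = np.random.default_rng(0)
reference = rng.random((100, 8, 8, 1))
reference_recon = reference + rng.normal(0.0, 0.01, reference.shape)

test = rng.random((5, 8, 8, 1))
test_recon = test + rng.normal(0.0, 0.01, test.shape)
test_recon[0] = 0.0  # simulate one badly reconstructed image

ref_scores = reconstruction_scores(reference, reference_recon)
test_scores = reconstruction_scores(test, test_recon)
is_outlier = flag_outliers(test_scores, ref_scores)
```

The badly reconstructed image receives a score several orders of magnitude above the reference scores, so it is flagged regardless of where exactly the percentile threshold falls.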

Tutorials

There are no tutorials for TensorFlow models yet, but we will be adding one soon.

How To Guides

There are currently no how-to guides for TensorFlow models. If there are scenarios that you would like us to explain, contact us!

DataEval API

Models

class dataeval.models.tensorflow.AE(encoder_net: keras.Model, decoder_net: keras.Model)

Autoencoder (AE) that combines an encoder and a decoder.

Parameters:

encoder_net – Layers for the encoder wrapped in a keras.Sequential class.
decoder_net – Layers for the decoder wrapped in a keras.Sequential class.

class dataeval.models.tensorflow.AEGMM(encoder_net: keras.Model, decoder_net: keras.Model, gmm_density_net: keras.Model, n_gmm: int, recon_features: Callable = eucl_cosim_features)

Deep Autoencoding Gaussian Mixture Model.

Parameters:

encoder_net – Layers for the encoder wrapped in a keras.Sequential class.
decoder_net – Layers for the decoder wrapped in a keras.Sequential class.
gmm_density_net – Layers for the GMM network wrapped in a keras.Sequential class.
n_gmm – Number of components in GMM.
recon_features – Function to extract features from the reconstructed instance by the decoder.
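As a rough illustration of how these pieces relate in the general Deep Autoencoding Gaussian Mixture Model formulation, the sketch below concatenates an encoding with reconstruction features to form the density-network input and turns the network's logits into soft memberships over n_gmm components. All arrays are hypothetical stand-ins for network outputs, and whether this class wires its layers exactly this way is an assumption.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax, shifted for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


n_gmm = 2  # number of mixture components

# Hypothetical encoder output for a batch of 4 samples.
encoding = np.array([[0.1, 0.2], [0.0, 0.3], [2.0, 1.9], [2.1, 2.2]])
# Hypothetical output of the recon_features function for the same batch.
recon_feats = np.array([[0.05, 0.99], [0.04, 0.98], [0.40, 0.70], [0.45, 0.65]])

# Density-network input: encoding concatenated with reconstruction features.
z = np.concatenate([encoding, recon_feats], axis=-1)  # shape (4, 4)

# Hypothetical density-network logits, one column per mixture component.
logits = np.array([[2.0, -1.0], [1.5, -0.5], [-1.0, 2.0], [-0.8, 1.7]])
gamma = softmax(logits)    # soft membership per sample, shape (4, n_gmm)
phi = gamma.mean(axis=0)   # estimated mixture weights, shape (n_gmm,)
```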

class dataeval.models.tensorflow.PixelCNN(image_shape: tuple, conditional_shape: tuple | None = None, num_resnet: int = 5, num_hierarchies: int = 3, num_filters: int = 160, num_logistic_mix: int = 10, receptive_field_dims: tuple = (3, 3), dropout_p: float = 0.5, resnet_activation: str = 'concat_elu', l2_weight: float = 0.0, use_weight_norm: bool = True, use_data_init: bool = True, high: int = 255, low: int = 0, dtype=tf.float32, name: str = 'PixelCNN')

Construct Pixel CNN++ distribution.

Parameters:

image_shape – 3D TensorShape or tuple for the [height, width, channels] dimensions of the image.
conditional_shape – TensorShape or tuple for the shape of the conditional input, or None if there is no conditional input.
num_resnet – The number of layers (shown in Figure 2 of [2]) within each highest-level block of Figure 2 of [1].
num_hierarchies – The number of highest-level blocks (separated by expansions/contractions of dimensions in Figure 2 of [1]).
num_filters – The number of convolutional filters.
num_logistic_mix – Number of components in the logistic mixture distribution.
receptive_field_dims – Height and width in pixels of the receptive field of the convolutional layers above and to the left of a given pixel. The width (second element of the tuple) should be odd. Figure 1 (middle) of [2] shows a receptive field of (3, 5) (the row containing the current pixel is included in the height). The default of (3, 3) was used to produce the results in [1].
dropout_p – The dropout probability. Should be between 0 and 1.
resnet_activation – The type of activation to use in the resnet blocks. May be ‘concat_elu’, ‘elu’, or ‘relu’.
l2_weight – The L2 regularization weight.
use_weight_norm – If True then use weight normalization (works only in Eager mode).
use_data_init – If True then use data-dependent initialization (has no effect if use_weight_norm is False).
high – The maximum value of the input data (255 for an 8-bit image).
low – The minimum value of the input data.
dtype – Data type of the Distribution.
name – The name of the Distribution.
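The high and low parameters describe the value range of the input data. PixelCNN++-style implementations typically use that range to rescale inputs to [-1, 1] before modeling them. The sketch below shows that mapping in NumPy; the helper name rescale_to_unit_range is hypothetical, and whether and where this class applies the same transformation is an assumption.

```python
import numpy as np


def rescale_to_unit_range(images: np.ndarray, low: float = 0.0, high: float = 255.0) -> np.ndarray:
    """Linearly map values from [low, high] to [-1, 1]."""
    images = images.astype(np.float32)
    return 2.0 * (images - low) / (high - low) - 1.0


# 8-bit pixel values: the minimum, a mid-range value, and the maximum.
pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = rescale_to_unit_range(pixels)
```

With the defaults, the minimum pixel value maps to -1 and the maximum to 1. Data already normalized to [0, 1] would need low=0 and high=1 for the same mapping.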

class dataeval.models.tensorflow.VAE(encoder_net: keras.Model, decoder_net: keras.Model, latent_dim: int, beta: float = 1.0)

Variational autoencoder (VAE) that combines an encoder and a decoder.

Parameters:

encoder_net – Layers for the encoder wrapped in a keras.Sequential class.
decoder_net – Layers for the decoder wrapped in a keras.Sequential class.
latent_dim – Dimensionality of the latent space.
beta – Beta parameter for KL-divergence loss term.
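To show what the beta parameter controls, here is a minimal NumPy sketch of a beta-weighted VAE loss. It assumes a diagonal Gaussian posterior and a standard normal prior, which is the common textbook setup. The function names are hypothetical, and the loss this class actually computes may be parameterized differently.

```python
import numpy as np


def kl_to_standard_normal(mean: np.ndarray, log_var: np.ndarray) -> np.ndarray:
    """KL divergence from N(mean, diag(exp(log_var))) to N(0, I), per sample."""
    return -0.5 * np.sum(1.0 + log_var - mean ** 2 - np.exp(log_var), axis=-1)


def beta_vae_loss(recon_loss: np.ndarray, mean: np.ndarray, log_var: np.ndarray, beta: float = 1.0) -> np.ndarray:
    """Reconstruction loss plus the beta-weighted KL term, per sample."""
    return recon_loss + beta * kl_to_standard_normal(mean, log_var)


latent_dim = 3
mean = np.zeros((2, latent_dim))
log_var = np.zeros((2, latent_dim))
mean[1] = 1.0  # the second sample's posterior is shifted away from the prior

recon = np.array([0.5, 0.5])  # hypothetical reconstruction losses
loss_beta_1 = beta_vae_loss(recon, mean, log_var, beta=1.0)
loss_beta_4 = beta_vae_loss(recon, mean, log_var, beta=4.0)
```

The first sample matches the prior, so its KL term is zero and beta has no effect on it. The second sample has a KL term of 1.5, so raising beta from 1 to 4 increases its loss from 2.0 to 6.5.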

class dataeval.models.tensorflow.VAEGMM(encoder_net: keras.Model, decoder_net: keras.Model, gmm_density_net: keras.Model, n_gmm: int, latent_dim: int, recon_features: Callable = eucl_cosim_features, beta: float = 1.0)

Variational Autoencoding Gaussian Mixture Model.

Parameters:

encoder_net – Layers for the encoder wrapped in a keras.Sequential class.
decoder_net – Layers for the decoder wrapped in a keras.Sequential class.
gmm_density_net – Layers for the GMM network wrapped in a keras.Sequential class.
n_gmm – Number of components in GMM.
latent_dim – Dimensionality of the latent space.
recon_features – Function to extract features from the reconstructed instance by the decoder.
beta – Beta parameter for KL-divergence loss term.

Reconstruction Functions

dataeval.models.tensorflow.eucl_cosim_features(x: Tensor, y: Tensor, max_eucl: float = 100.0) → Tensor

Compute features extracted from the reconstructed instance using the relative Euclidean distance and cosine similarity between 2 tensors.

Parameters:

x – Tensor used in feature computation.
y – Tensor used in feature computation.
max_eucl – Maximum value to clip relative Euclidean distance by.

Returns:

Tensor concatenating the relative Euclidean distance and cosine similarity features.
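The two features can be sketched in NumPy as follows. This is an illustration of the quantities described above, not the library's code: the function name eucl_cosim_sketch is hypothetical, and the column order, sign conventions, and handling of zero-norm inputs in eucl_cosim_features may differ.

```python
import numpy as np


def eucl_cosim_sketch(x: np.ndarray, y: np.ndarray, max_eucl: float = 100.0) -> np.ndarray:
    """Return an (N, 2) array of [relative Euclidean distance, cosine similarity]."""
    # Flatten everything but the batch axis so each instance becomes a vector.
    x = x.reshape(len(x), -1).astype(np.float64)
    y = y.reshape(len(y), -1).astype(np.float64)

    # Note: zero-norm rows are not guarded against in this sketch.
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)
    y_norm = np.linalg.norm(y, axis=1, keepdims=True)

    # Distance between x and y relative to the magnitude of x, clipped at max_eucl.
    rel_eucl = np.linalg.norm(x - y, axis=1, keepdims=True) / x_norm
    rel_eucl = np.clip(rel_eucl, 0.0, max_eucl)

    # Cosine of the angle between x and y.
    cosim = np.sum(x * y, axis=1, keepdims=True) / (x_norm * y_norm)

    return np.concatenate([rel_eucl, cosim], axis=1)


x = np.array([[3.0, 4.0], [1.0, 0.0]])
y = np.array([[3.0, 4.0], [0.0, 1.0]])
features = eucl_cosim_sketch(x, y)
```

In this toy input the first pair is identical (distance 0, similarity 1), while the second pair is orthogonal (similarity 0) with a relative distance of sqrt(2).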

Loss Function Classes

class dataeval.models.tensorflow.LossGMM(w_recon: float = 1e-07, w_energy: float = 0.1, w_cov_diag: float = 0.005, elbo: Elbo | None = None)

Loss function used for the AE and VAE models that include a GMM (AEGMM and VAEGMM).

Parameters:

w_recon – Weight on the ELBO loss term.
w_energy – Weight on the sample energy loss term.
w_cov_diag – Weight on the covariance regularizing loss term.
elbo – ELBO loss function used to calculate the loss term weighted by w_recon.
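The three weights suggest a weighted sum of three loss terms. The sketch below shows that combination with the default weights from the signature above. The function name weighted_gmm_loss and the input values are hypothetical, and how LossGMM computes each individual term is not shown here.

```python
def weighted_gmm_loss(
    recon_loss: float,
    sample_energy: float,
    cov_diag_penalty: float,
    w_recon: float = 1e-7,
    w_energy: float = 0.1,
    w_cov_diag: float = 0.005,
) -> float:
    """Combine the three loss terms using their respective weights."""
    return w_recon * recon_loss + w_energy * sample_energy + w_cov_diag * cov_diag_penalty


# Hypothetical term values, chosen so each weighted term is easy to read off.
total = weighted_gmm_loss(recon_loss=2.0e6, sample_energy=3.0, cov_diag_penalty=40.0)
```

The small default for w_recon reflects that the reconstruction term can be orders of magnitude larger than the other two. In this example the weighted terms contribute roughly 0.2, 0.3, and 0.2.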

class dataeval.models.tensorflow.Elbo(cov_type: Literal['cov_full', 'cov_diag'] | float = 1.0, x: Tensor | ndarray | None = None)

Compute ELBO loss. The covariance matrix can be specified by passing the full covariance matrix, the matrix diagonal, or a scale identity multiplier. Only one of these should be specified. If none are specified, the identity matrix is used.

Parameters:

cov_type – Full covariance matrix, diagonal variance matrix, or scale identity multiplier.
x – Dataset used to calculate the covariance matrix. Required for full and diagonal covariance matrix types.
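The three covariance options can be pictured with the NumPy sketch below. It only illustrates what each option means. The helper build_covariance and its n_features argument are hypothetical, and how Elbo estimates and uses the covariance internally is an assumption.

```python
import numpy as np


def build_covariance(cov_type, n_features: int, x=None) -> np.ndarray:
    """Build a covariance matrix from a dataset or a scale identity multiplier."""
    if isinstance(cov_type, str):
        if x is None:
            raise ValueError("a dataset is required for 'cov_full' and 'cov_diag'")
        # Flatten each instance to a vector: shape (n_samples, n_features).
        flat = np.asarray(x, dtype=np.float64).reshape(len(x), -1)
        if cov_type == "cov_full":
            return np.cov(flat, rowvar=False)
        if cov_type == "cov_diag":
            # Keep only the per-feature variances on the diagonal.
            return np.diag(flat.var(axis=0, ddof=1))
        raise ValueError(f"unknown cov_type: {cov_type!r}")
    # A number is treated as a multiplier on the identity matrix.
    return float(cov_type) * np.eye(n_features)


rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))

full = build_covariance("cov_full", 3, data)
diag = build_covariance("cov_diag", 3, data)
scaled = build_covariance(2.0, 3)
```

The diagonal option discards the off-diagonal entries of the full matrix, and the default multiplier of 1.0 yields the identity matrix, which matches the behavior described above when nothing is specified.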

Utility Functions

The create_model utility function creates a default model for the specified model type.

Parameters:

model_type – The model type to create.
input_shape – The input shape of the data used.
encoding_dim – The target encoding dimensionality.
n_gmm – Number of components used in the GMM layer.
gmm_latent_dim – Latent dimensionality of the GMM layer.