dataeval.utils.data.Embeddings

class dataeval.utils.data.Embeddings(dataset, batch_size, model=None, device=None, verbose=False)

Collection of image embeddings from a dataset.

Embeddings are accessed by index or slice and are only loaded on-demand.
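The on-demand access pattern described above can be illustrated with a small, dependency-free sketch. This is not the dataeval implementation: `LazyEmbeddings`, `toy_encode`, and the `encode_calls` counter are hypothetical names introduced only for illustration, and plain Python lists stand in for images and tensors. It shows that nothing is encoded at construction time and that an index or slice encodes only the requested items, in batches of `batch_size`.

```python
from typing import Callable, List, Sequence, Union

Vector = List[float]


class LazyEmbeddings:
    """Hypothetical stand-in: encodes items only when they are requested."""

    def __init__(
        self,
        dataset: Sequence[Sequence[float]],
        batch_size: int,
        encode: Callable[[list], List[Vector]],
    ) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be a positive integer")
        self._dataset = dataset
        self._batch_size = batch_size
        self._encode = encode
        self.encode_calls = 0  # number of batches encoded so far

    def __len__(self) -> int:
        return len(self._dataset)

    def _encode_indices(self, indices: Sequence[int]) -> List[Vector]:
        # Encode the requested indices in chunks of at most batch_size.
        out: List[Vector] = []
        for start in range(0, len(indices), self._batch_size):
            chunk = indices[start:start + self._batch_size]
            batch = [self._dataset[i] for i in chunk]
            out.extend(self._encode(batch))
            self.encode_calls += 1
        return out

    def __getitem__(self, key: Union[int, slice]):
        if isinstance(key, slice):
            # range(...)[slice] resolves start/stop/step against the length.
            return self._encode_indices(range(len(self))[key])
        if isinstance(key, int):
            # range(...)[int] handles negatives and raises IndexError if out of range.
            index = range(len(self))[key]
            return self._encode_indices([index])[0]
        raise TypeError("index must be an int or a slice")


def toy_encode(batch: list) -> List[Vector]:
    """Hypothetical encoder: maps each 'image' to [mean, max]."""
    return [[sum(image) / len(image), max(image)] for image in batch]


dataset = [[float(i), float(i + 1)] for i in range(10)]
embeddings = LazyEmbeddings(dataset, batch_size=4, encode=toy_encode)

single = embeddings[3]    # encodes one item, in one batch
window = embeddings[2:9]  # encodes seven items, in two batches (4 + 3)
```

In this sketch, repeated access re-encodes the same items; whether the real class caches results is not stated here.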

Parameters:
dataset : ImageClassificationDataset or ObjectDetectionDataset

Dataset to access original images from.

batch_size : int

Batch size to use when encoding images.

model : torch.nn.Module, optional

Model to use for encoding images.

device : torch.device, optional

Device to use for encoding images.

verbose : bool, optional

Whether to display a progress bar while encoding images.

to_tensor()

Converts the entire dataset to embeddings.

Warning

Processes the entire dataset in batches and returns all embeddings as a single Tensor held in memory.

Return type:

torch.Tensor
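Because the returned Tensor holds every embedding at once, it can help to estimate its size before calling `to_tensor()`. The helper below is a rough sketch, not part of dataeval: `estimate_embedding_bytes` is a hypothetical name, and the default of 4 bytes per element assumes float32 embeddings, which depends on the model actually used.

```python
def estimate_embedding_bytes(
    num_items: int,
    embedding_dim: int,
    bytes_per_element: int = 4,  # assumption: float32 embeddings
) -> int:
    """Approximate size in bytes of a dense (num_items, embedding_dim) tensor."""
    if num_items < 0 or embedding_dim < 0 or bytes_per_element < 1:
        raise ValueError("sizes must be non-negative and element size positive")
    return num_items * embedding_dim * bytes_per_element


# Example: one million images with 512-dimensional float32 embeddings.
size = estimate_embedding_bytes(1_000_000, 512)
print(f"{size / 2**30:.2f} GiB")  # prints "1.91 GiB"
```

If the estimate exceeds the available memory, accessing the collection by slice processes only the requested part of the dataset at a time.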