dataeval.utils.data.Embeddings

class dataeval.utils.data.Embeddings(dataset, batch_size, model=None, device=None, verbose=False)

Collection of image embeddings from a dataset.

Embeddings are accessed by index or slice and are loaded only on demand.
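The on-demand access pattern described above can be illustrated with a minimal, dependency-free sketch. `LazyEmbeddings`, `mean_encoder`, and the `encode_calls` counter are hypothetical names introduced for illustration only. They are not part of DataEval and do not reflect how the library is implemented. The sketch assumes an encoder that maps a batch of images to a batch of vectors.

```python
from typing import Callable, List, Sequence, Union

Vector = List[float]


class LazyEmbeddings:
    """Hypothetical stand-in (not DataEval code): encodes items only when indexed."""

    def __init__(
        self,
        images: Sequence[Vector],
        encode: Callable[[List[Vector]], List[Vector]],
        batch_size: int,
    ) -> None:
        self._images = images
        self._encode = encode
        self._batch_size = batch_size
        self.encode_calls = 0  # number of batches actually encoded so far

    def __len__(self) -> int:
        return len(self._images)

    def _encode_indices(self, indices: List[int]) -> List[Vector]:
        # Encode only the requested items, batch_size items at a time.
        out: List[Vector] = []
        for start in range(0, len(indices), self._batch_size):
            batch = [self._images[i] for i in indices[start:start + self._batch_size]]
            self.encode_calls += 1
            out.extend(self._encode(batch))
        return out

    def __getitem__(self, key: Union[int, slice]) -> Union[Vector, List[Vector]]:
        if isinstance(key, slice):
            return self._encode_indices(list(range(*key.indices(len(self)))))
        if isinstance(key, int):
            if key < 0:
                key += len(self)
            if not 0 <= key < len(self):
                raise IndexError("index out of range")
            return self._encode_indices([key])[0]
        raise TypeError("indices must be int or slice")


def mean_encoder(batch: List[Vector]) -> List[Vector]:
    # Toy encoder (assumption): a one-dimensional "embedding" holding the pixel mean.
    return [[sum(image) / len(image)] for image in batch]


images = [[float(i), float(i + 2)] for i in range(10)]
emb = LazyEmbeddings(images, mean_encoder, batch_size=4)

calls_before = emb.encode_calls  # nothing has been encoded at construction time
single = emb[3]                  # encodes one item in one batch
first_six = emb[0:6]             # encodes six items in two batches of at most four
```

In this sketch no work happens until an index or slice is requested, and a slice is encoded in chunks of `batch_size`. Results are not cached here; whether the real class caches is not stated in this reference.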

Parameters:
dataset : ImageClassificationDataset or ObjectDetectionDataset

Dataset to access original images from.

batch_size : int

Batch size to use when encoding images.

model : torch.nn.Module or None, default None

Model to use for encoding images.

device : DeviceLike or None, default None

The hardware device to use. If None, the DataEval default or the torch default is used.

verbose : bool, default False

Whether to print a progress bar when encoding images.

to_tensor()

Converts the entire dataset to embeddings.

Warning

Processes the entire dataset in batches and returns the embeddings as a single Tensor held in memory.

Return type:

torch.Tensor
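Because to_tensor() materializes every embedding at once, it can help to estimate the size of the result beforehand. The following is a rough back-of-the-envelope sketch. `estimate_tensor_bytes` is a hypothetical helper, not a DataEval function, and the embedding width of 512 and the 4 bytes per element (float32) are assumptions that depend on the model actually used.

```python
def estimate_tensor_bytes(num_images: int, embedding_dim: int, bytes_per_element: int = 4) -> int:
    """Rough size of a dense (num_images, embedding_dim) tensor, ignoring overhead."""
    return num_images * embedding_dim * bytes_per_element


# Assumed example: one million images, 512-dimensional float32 embeddings.
size_bytes = estimate_tensor_bytes(1_000_000, 512)
size_gib = size_bytes / 2**30  # 2,048,000,000 bytes, about 1.91 GiB
```

If the estimate is too large for the available memory, accessing embeddings by index or slice avoids building the full tensor at once.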