dataeval.utils.data.Embeddings

class dataeval.utils.data.Embeddings(dataset, batch_size, model=None, device=None, verbose=False)

Collection of image embeddings from a dataset.

Embeddings are accessed by index or slice and are loaded only on demand.

Parameters:
dataset : ImageClassificationDataset or ObjectDetectionDataset

Dataset to access original images from.

batch_size : int

Batch size to use when encoding images.

model : torch.nn.Module or None, default None

Model to use for encoding images.

device : DeviceLike or None, default None

The hardware device to use. If not specified, the DataEval default or the torch default is used.

verbose : bool, default False

Whether to print a progress bar when encoding images.
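
The on-demand, batched access described above can be illustrated with a small pure-Python sketch. This is a toy stand-in, not DataEval's implementation: the class `LazyEmbeddings`, the `encode` callable, the `encode_calls` counter, and the `toy_encode` function are all hypothetical names introduced for illustration, and plain lists take the place of images and tensors.

```python
from typing import Callable, List, Sequence, Union

Vector = List[float]
Image = Sequence[float]


class LazyEmbeddings:
    """Toy stand-in (hypothetical, not DataEval's code) for on-demand embeddings."""

    def __init__(
        self,
        images: Sequence[Image],
        batch_size: int,
        encode: Callable[[Sequence[Image]], List[Vector]],
    ) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be a positive integer")
        self._images = images
        self._batch_size = batch_size
        self._encode = encode
        self.encode_calls = 0  # number of batches encoded so far

    def __len__(self) -> int:
        return len(self._images)

    def _embed(self, indices: Sequence[int]) -> List[Vector]:
        # Encode only the requested items, at most batch_size at a time.
        out: List[Vector] = []
        for start in range(0, len(indices), self._batch_size):
            batch = [self._images[i] for i in indices[start:start + self._batch_size]]
            out.extend(self._encode(batch))
            self.encode_calls += 1
        return out

    def __getitem__(self, key: Union[int, slice]) -> Union[Vector, List[Vector]]:
        if isinstance(key, slice):
            # range(...)[slice] resolves the slice into concrete indices.
            return self._embed(range(len(self))[key])
        if isinstance(key, int):
            # range(...)[int] handles negative indices and raises IndexError.
            return self._embed([range(len(self))[key]])[0]
        raise TypeError("index must be an int or a slice")


def toy_encode(batch: Sequence[Image]) -> List[Vector]:
    # Hypothetical encoder: maps each flat "image" to [mean, max].
    return [[sum(img) / len(img), float(max(img))] for img in batch]


images = [[float(i), float(i + 2)] for i in range(10)]
emb = LazyEmbeddings(images, batch_size=4, encode=toy_encode)

single = emb[3]    # encodes one item in one batch
window = emb[0:6]  # encodes six items in two batches of at most four
```

In this sketch nothing is encoded when the object is constructed; work happens only when an index or slice is requested, and `batch_size` bounds how many items are passed to the encoder at once.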

to_tensor(indices=None)

Converts the dataset to embeddings.

Parameters:
indices : Sequence[int] or None, default None

The indices to convert to embeddings.

Return type:

torch.Tensor

Warning

Processing large quantities of data can be resource intensive.
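
One way to limit resource use, sketched below under stated assumptions, is to pass indices in fixed-size chunks and reduce each chunk before requesting the next. The helpers `chunked` and `mean_embedding` are hypothetical, and `fake_embed` is a stand-in for a call that returns embeddings for a list of indices (for example, something like `to_tensor(indices=chunk)`); plain lists are used instead of tensors so the sketch runs without dependencies.

```python
from typing import Callable, Iterator, List, Optional, Sequence

Vector = List[float]


def chunked(indices: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive chunks of at most `size` indices."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(indices), size):
        yield list(indices[start:start + size])


def mean_embedding(
    indices: Sequence[int],
    size: int,
    embed: Callable[[List[int]], List[Vector]],
) -> Vector:
    """Mean embedding over `indices`, holding one chunk of embeddings at a time."""
    total: Optional[Vector] = None
    count = 0
    for chunk in chunked(indices, size):
        for vec in embed(chunk):
            if total is None:
                total = [0.0] * len(vec)
            total = [t + v for t, v in zip(total, vec)]
            count += 1
    if total is None:
        raise ValueError("no indices given")
    return [t / count for t in total]


def fake_embed(indices: List[int]) -> List[Vector]:
    # Hypothetical stand-in for an embedding call on a list of indices.
    return [[float(i), float(2 * i)] for i in indices]


mean = mean_embedding(list(range(10)), 4, fake_embed)
```

The chunk size trades peak memory against the number of embedding calls; a suitable value depends on the model and the available hardware.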