dataeval.utils.data.Embeddings
class dataeval.utils.data.Embeddings(dataset, batch_size, transforms=None, model=None, device=None, cache=False, verbose=False)
Collection of image embeddings from a dataset.
Embeddings are accessed by index or slice and are only loaded on-demand.
- Parameters:
- dataset : ImageClassificationDataset or ObjectDetectionDataset
Dataset from which the original images are read.
- batch_size : int
Batch size to use when encoding images.
- transforms : Transform or Sequence[Transform] or None, default None
Transforms to apply to images before encoding.
- model : torch.nn.Module or None, default None
Model to use for encoding images.
- device : DeviceLike or None, default None
Hardware device to use. If not specified, the DataEval default or torch default is used.
- cache : bool, default False
Whether to cache the embeddings in memory.
- verbose : bool, default False
Whether to print a progress bar when encoding images.
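The on-demand behavior described above (embeddings computed only when an index or slice is requested, encoded in batches of batch_size, and optionally kept in memory when cache is True) can be illustrated with a small pure-Python sketch. This is not DataEval's implementation: the LazyEmbeddings class, the toy_encoder function, and the encoder_calls counter are hypothetical names introduced only for illustration, and the "images" are plain lists of floats rather than tensors.

```python
from typing import Callable, Dict, List, Optional, Sequence, Union

Vector = List[float]
Image = Sequence[float]


def toy_encoder(batch: List[Image]) -> List[Vector]:
    """Hypothetical stand-in for a model: maps each image to [mean, max]."""
    return [[sum(img) / len(img), max(img)] for img in batch]


class LazyEmbeddings:
    """Illustrative sketch only; not the DataEval Embeddings class."""

    def __init__(
        self,
        dataset: Sequence[Image],
        batch_size: int,
        encoder: Callable[[List[Image]], List[Vector]] = toy_encoder,
        cache: bool = False,
    ) -> None:
        self._dataset = dataset
        self._batch_size = batch_size
        self._encoder = encoder
        # A dict keyed by dataset index when caching is on, otherwise None.
        self._cache: Optional[Dict[int, Vector]] = {} if cache else None
        self.encoder_calls = 0  # number of batches sent to the encoder

    def __len__(self) -> int:
        return len(self._dataset)

    def _encode(self, indices: List[int]) -> List[Vector]:
        fresh: Dict[int, Vector] = {}
        # Only indices that are not already cached need to be encoded.
        todo = [i for i in indices if self._cache is None or i not in self._cache]
        for start in range(0, len(todo), self._batch_size):
            batch_idx = todo[start:start + self._batch_size]
            vectors = self._encoder([self._dataset[i] for i in batch_idx])
            self.encoder_calls += 1
            for i, vec in zip(batch_idx, vectors):
                fresh[i] = vec
                if self._cache is not None:
                    self._cache[i] = vec
        # Anything not freshly encoded must have come from the cache.
        return [fresh[i] if i in fresh else self._cache[i] for i in indices]

    def __getitem__(self, key: Union[int, slice]) -> Union[Vector, List[Vector]]:
        if isinstance(key, slice):
            return self._encode(list(range(*key.indices(len(self)))))
        if key < 0:
            key += len(self)
        if not 0 <= key < len(self):
            raise IndexError("embedding index out of range")
        return self._encode([key])[0]


# Ten toy "images", each a list of three floats.
images = [[float(i), float(i + 1), float(i + 2)] for i in range(10)]
emb = LazyEmbeddings(images, batch_size=4, cache=True)

first = emb[0]    # encodes a single image
chunk = emb[2:8]  # encodes six images in two batches (4 + 2)
again = emb[2:8]  # answered from the cache, no further encoder calls
```

In this sketch nothing is encoded at construction time; work happens only inside __getitem__, and with cache=True a repeated request reuses the stored vectors instead of calling the encoder again.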
classmethod from_array(array, device=None)
Instantiates a shallow Embeddings object using an array.
Example
>>> import numpy as np
>>> from dataeval.utils.data._embeddings import Embeddings
>>> array = np.random.randn(100, 3, 224, 224)
>>> embeddings = Embeddings.from_array(array)
>>> print(embeddings.to_tensor().shape)
torch.Size([100, 3, 224, 224])