dataeval.data.Embeddings
- class dataeval.data.Embeddings(dataset, batch_size, transforms=None, model=None, device=None, cache=False, verbose=False)
Collection of image embeddings from a dataset.
Embeddings are accessed by index or slice and are only loaded on demand.
- Parameters:
- dataset : ImageClassificationDataset or ObjectDetectionDataset
Dataset to access original images from.
- batch_size : int
Batch size to use when encoding images.
- transforms : Transform or Sequence[Transform] or None, default None
Transforms to apply to images before encoding.
- model : torch.nn.Module or None, default None
Model to use for encoding images.
- device : DeviceLike or None, default None
Hardware device to use. If not specified, the DataEval default or torch default is used.
- cache : Path, str, or bool, default False
Whether to cache the embeddings to a file or in memory. When a Path or string is provided, embeddings are cached to disk at that location; when True, they are cached in memory.
- verbose : bool, default False
Whether to print a progress bar when encoding images.
- cache
Path of the file the embeddings are cached to, or True if caching in memory.
- device
Hardware device to use. If not specified, the DataEval default or torch default is used.
- Type:
torch.device
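The on-demand access and caching described above can be pictured with a small pure-Python sketch. This is not DataEval's implementation: `LazyEmbeddings`, `encode_fn`, `encode_calls`, and the dictionary cache are hypothetical names chosen for illustration, and the "model" is a toy function that sums each item. The sketch only shows the general idea that items are encoded in batches when an index or slice is requested, and that cached results are not encoded again.

```python
class LazyEmbeddings:
    """Hypothetical stand-in: encodes items only when they are requested."""

    def __init__(self, dataset, batch_size, encode_fn, cache=False):
        self._dataset = dataset
        self._batch_size = batch_size
        self._encode_fn = encode_fn              # maps a list of items to a list of embeddings
        self._cache = {} if cache else None      # in-memory cache, keyed by dataset index
        self.encode_calls = 0                    # counts batches sent to encode_fn

    def __len__(self):
        return len(self._dataset)

    def _encode(self, indices):
        # Reuse anything already cached, then encode the rest batch by batch.
        results = {}
        if self._cache is not None:
            results = {i: self._cache[i] for i in indices if i in self._cache}
        missing = [i for i in indices if i not in results]
        for start in range(0, len(missing), self._batch_size):
            batch_indices = missing[start:start + self._batch_size]
            batch = [self._dataset[i] for i in batch_indices]
            self.encode_calls += 1
            for i, embedding in zip(batch_indices, self._encode_fn(batch)):
                results[i] = embedding
                if self._cache is not None:
                    self._cache[i] = embedding
        return [results[i] for i in indices]

    def __getitem__(self, key):
        if isinstance(key, slice):
            return self._encode(list(range(*key.indices(len(self)))))
        if isinstance(key, int):
            index = key if key >= 0 else key + len(self)
            if not 0 <= index < len(self):
                raise IndexError("index out of range")
            return self._encode([index])[0]
        raise TypeError("index must be an int or a slice")


def sum_encoder(batch):
    """Toy 'model': the embedding of an item is the sum of its values."""
    return [float(sum(item)) for item in batch]


dataset = [[i, i + 1] for i in range(10)]
emb = LazyEmbeddings(dataset, batch_size=4, encode_fn=sum_encoder, cache=True)

first = emb[0]            # encodes only item 0
chunk = emb[0:6]          # item 0 comes from the cache; items 1-5 need two batches
print(first)              # 1.0
print(emb.encode_calls)   # 3
```

Nothing is encoded when the object is constructed; work happens only inside `__getitem__`. Repeating `emb[0:6]` afterwards leaves `encode_calls` unchanged because every requested index is already cached.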
- classmethod from_array(array, device=None)
Instantiates a shallow Embeddings object using an array.
Example
>>> import numpy as np
>>> from dataeval.data import Embeddings
>>> array = np.random.randn(100, 3, 224, 224)
>>> embeddings = Embeddings.from_array(array)
>>> print(embeddings.to_tensor().shape)
torch.Size([100, 3, 224, 224])
- classmethod load(path)
Loads the embeddings from disk.
- new(dataset)
Creates a new Embeddings object with the same parameters but a different dataset.
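The "same parameters, different dataset" pattern behind `new` can be sketched with the standard library alone. The `EmbeddingConfig` class below is a hypothetical illustration, not part of DataEval; it only shows how a copy can keep every setting while swapping the dataset, leaving the original object untouched.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EmbeddingConfig:
    """Hypothetical container for the settings an embeddings object carries."""

    dataset: tuple
    batch_size: int
    verbose: bool = False

    def new(self, dataset):
        # Copy every field unchanged except the dataset.
        return replace(self, dataset=dataset)


train = EmbeddingConfig(dataset=("a", "b"), batch_size=32, verbose=True)
test = train.new(("c",))

print(test.batch_size, test.verbose)  # 32 True
print(train.dataset)                  # ('a', 'b')
```

A typical use of this pattern is to configure the encoder settings once and then apply them to several dataset splits.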