dataeval.utils.data.selections.Prioritize
class dataeval.utils.data.selections.Prioritize(model: torch.nn.Module, batch_size: int, device: dataeval.config.DeviceLike | None, method: 'knn', *, k: int | None = None)
class Prioritize(model, batch_size, device, method, *, c=None)
Prioritizes the dataset by sort order in the embedding space.
- Parameters:
- model : torch.nn.Module
Model used to encode images.
- batch_size : int
Batch size used when encoding images.
- device : DeviceLike or None
Device used to encode images.
- method : Literal["knn", "kmeans_distance", "kmeans_complexity"]
Prioritization method.
- k : int or None, default None
Number of nearest neighbors to use for prioritization (method "knn" only).
- c : int or None, default None
Number of clusters to use for prioritization (methods "kmeans_distance" and "kmeans_complexity" only).
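As a rough illustration of what ranking by nearest neighbors in an embedding space can look like, the pure-Python sketch below scores each sample by the distance to its k-th nearest neighbor and sorts on that score. It is not DataEval's implementation. The function name `knn_priority_order`, the Euclidean metric, and the ascending sort direction are all assumptions made for this example.

```python
import math


def knn_priority_order(embeddings, k):
    """Hypothetical helper: return sample indices sorted by k-th neighbor distance.

    Assumptions (not taken from DataEval): Euclidean distance, ascending order.
    """
    n = len(embeddings)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of samples")

    scores = []
    for i, point in enumerate(embeddings):
        # Distances from this sample to every other sample, smallest first.
        dists = sorted(
            math.dist(point, other)
            for j, other in enumerate(embeddings)
            if j != i
        )
        # Score is the distance to the k-th nearest neighbor.
        scores.append(dists[k - 1])

    # Samples in dense regions (small score) come first in this sketch.
    return sorted(range(n), key=lambda i: scores[i])


# Three tightly grouped points and one isolated point.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
order = knn_priority_order(embeddings, k=1)
```

In this toy input the isolated point at (5.0, 5.0) has the largest neighbor distance, so it lands last in `order`. Reversing the sort would put such outlying samples first instead.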
classmethod using(method: 'knn', *, k: int | None = None, embeddings: dataeval.utils.data.Embeddings | None = None, reference: dataeval.utils.data.Embeddings | None = None) -> Prioritize
classmethod using(method: 'kmeans_distance' | 'kmeans_complexity', *, c: int | None = None, embeddings: dataeval.utils.data.Embeddings | None = None, reference: dataeval.utils.data.Embeddings | None = None) -> Prioritize
Prioritizes the dataset by sort order in the embedding space, using existing embeddings and/or reference dataset embeddings.
- Parameters:
- method : Literal["knn", "kmeans_distance", "kmeans_complexity"]
Prioritization method.
- embeddings : Embeddings or None, default None
Embeddings to use for prioritization.
- reference : Embeddings or None, default None
Reference embeddings to prioritize relative to.
- k : int or None, default None
Number of nearest neighbors to use for prioritization (method "knn" only).
- c : int or None, default None
Number of clusters to use for prioritization (methods "kmeans_distance" and "kmeans_complexity" only).
Notes
At least one of embeddings or reference must be provided.
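The constraint in the note can be pictured as a simple precondition check. The sketch below is illustrative only: `check_using_inputs` is a hypothetical helper, and how DataEval actually validates these arguments (including the exception type it raises) is not stated here.

```python
def check_using_inputs(embeddings=None, reference=None):
    """Hypothetical helper mirroring the documented constraint on `using`."""
    # Assumption: a ValueError is raised when both arguments are missing.
    if embeddings is None and reference is None:
        raise ValueError("At least one of embeddings or reference must be provided.")
    return embeddings, reference


# Passing only one of the two satisfies the constraint.
accepted = check_using_inputs(embeddings=[[0.0, 1.0]])

# Passing neither is rejected.
try:
    check_using_inputs()
    rejected = False
except ValueError:
    rejected = True
```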