dataeval.scope.Prioritize.Config

class dataeval.scope.Prioritize.Config

Configuration for the Prioritize evaluator.
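As a quick overview, the attributes documented on this page can be mirrored in a small stand-in dataclass. This is only a sketch: `PrioritizeConfigSketch` is a hypothetical name, not the dataeval class, and the `None` default for `extractor` is an assumption because this page lists no default for it. The validation in `__post_init__` is likewise illustrative and checks only the choice sets listed here.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union

# Choice sets copied from this reference page.
METHODS = ("knn", "kmeans_distance", "kmeans_complexity",
           "hdbscan_distance", "hdbscan_complexity")
ORDERS = ("easy_first", "hard_first")
POLICIES = ("difficulty", "stratified", "class_balanced")


@dataclass
class PrioritizeConfigSketch:
    """Hypothetical mirror of the documented fields; NOT the library class."""

    extractor: Optional[Any] = None  # assumption: no default is documented
    batch_size: Optional[int] = None
    method: str = "knn"
    k: Optional[int] = None
    c: Optional[int] = None
    n_init: Union[int, str] = "auto"
    max_cluster_size: Optional[int] = None
    order: str = "easy_first"
    policy: str = "difficulty"
    num_bins: int = 50

    def __post_init__(self) -> None:
        # Reject values outside the documented choice sets.
        if self.method not in METHODS:
            raise ValueError(f"unknown method: {self.method!r}")
        if self.order not in ORDERS:
            raise ValueError(f"unknown order: {self.order!r}")
        if self.policy not in POLICIES:
            raise ValueError(f"unknown policy: {self.policy!r}")
        if not (self.n_init == "auto" or isinstance(self.n_init, int)):
            raise ValueError('n_init must be an int or "auto"')
```

Constructing `PrioritizeConfigSketch()` with no arguments yields the defaults listed on this page; passing, say, `method="bogus"` raises `ValueError` in this sketch.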

extractor

Feature extractor instance used to compute embeddings from the data.

Type:

FeatureExtractor or None

batch_size

Batch size for embedding computation. When None, uses the global batch size from get_batch_size().

Type:

int or None, default None
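The fallback described for `batch_size` can be sketched as follows. The `get_batch_size()` defined here is a local stand-in for the library function of the same name, and the value 32 is made up for illustration; `resolve_batch_size` is a hypothetical helper.

```python
from typing import Optional

_GLOBAL_BATCH_SIZE = 32  # hypothetical global setting for this sketch


def get_batch_size() -> int:
    """Local stand-in for the library's get_batch_size()."""
    return _GLOBAL_BATCH_SIZE


def resolve_batch_size(batch_size: Optional[int]) -> int:
    # An explicit value wins; None falls back to the global batch size.
    return get_batch_size() if batch_size is None else batch_size
```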

method

Ranking method to use.

Type:

{"knn", "kmeans_distance", "kmeans_complexity", "hdbscan_distance", "hdbscan_complexity"}, default "knn"

k

Number of nearest neighbors for the "knn" method.

Type:

int or None, default None

c

Number of clusters for clustering methods.

Type:

int or None, default None

n_init

Number of k-means initializations; applies to the kmeans methods only.

Type:

int or "auto", default "auto"

max_cluster_size

Maximum cluster size for HDBSCAN methods.

Type:

int or None, default None
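Several of the attributes above apply only to certain values of `method`. The sketch below records that mapping as read from this page and reports parameters that were set but do not apply to the chosen method. It assumes that "clustering methods" means both the kmeans and hdbscan variants; `APPLICABLE` and `unused_params` are hypothetical names, not part of the library.

```python
from typing import Dict, List, Tuple

# Which method-specific parameters each method uses, per this page.
# Assumption: "clustering methods" covers the kmeans and hdbscan variants.
APPLICABLE: Dict[str, Tuple[str, ...]] = {
    "knn": ("k",),
    "kmeans_distance": ("c", "n_init"),
    "kmeans_complexity": ("c", "n_init"),
    "hdbscan_distance": ("c", "max_cluster_size"),
    "hdbscan_complexity": ("c", "max_cluster_size"),
}


def unused_params(method: str, given: Dict[str, object]) -> List[str]:
    """Return the names in `given` that do not apply to `method`."""
    applicable = APPLICABLE[method]
    return sorted(name for name in given if name not in applicable)
```

For example, `unused_params("knn", {"k": 5, "c": 3})` flags `c`, since this page ties `c` to the clustering methods.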

order

Sort direction for output indices.

Type:

{"easy_first", "hard_first"}, default "easy_first"
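One way to picture the sort direction: given one difficulty score per sample, `order` decides whether indices come back in ascending or descending score order. The sketch assumes that a lower score means an easier sample, which this page does not state; `order_indices` is a hypothetical helper.

```python
from typing import List, Sequence


def order_indices(scores: Sequence[float], order: str = "easy_first") -> List[int]:
    """Return sample indices sorted by score in the requested direction."""
    if order not in ("easy_first", "hard_first"):
        raise ValueError(f"unknown order: {order!r}")
    # Assumption: lower score = easier sample.
    return sorted(
        range(len(scores)),
        key=lambda i: scores[i],
        reverse=(order == "hard_first"),
    )
```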

policy

Selection policy to apply after ranking.

Type:

{"difficulty", "stratified", "class_balanced"}, default "difficulty"

num_bins

Number of bins for the "stratified" policy.

Type:

int, default 50
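To give a rough intuition for what binning under a "stratified" policy could look like, the sketch below splits the score range into `num_bins` equal-width bins and then draws one index from each non-empty bin in turn. This is one plausible interpretation for illustration, not necessarily how the library implements the policy; `stratified_order` is a hypothetical helper.

```python
from typing import List, Sequence


def stratified_order(scores: Sequence[float], num_bins: int = 50) -> List[int]:
    """Interleave indices across equal-width score bins (illustrative only)."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    # Fall back to a width of 1.0 when all scores are identical.
    width = (hi - lo) / num_bins or 1.0
    bins: List[List[int]] = [[] for _ in range(num_bins)]
    for idx, score in enumerate(scores):
        # Clamp so the maximum score lands in the last bin.
        b = min(int((score - lo) / width), num_bins - 1)
        bins[b].append(idx)
    ordered: List[int] = []
    # Round-robin: take one index from each non-empty bin per pass.
    while any(bins):
        for members in bins:
            if members:
                ordered.append(members.pop(0))
    return ordered
```

With many bins, consecutive picks in this sketch come from different parts of the score range, so a prefix of the output covers easy and hard samples alike rather than only one end.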