dataeval.core.compute_cluster_stats

dataeval.core.compute_cluster_stats(embeddings, cluster_labels)

Compute cluster centers and distance statistics for adaptive outlier detection.

Parameters:
embeddings : NDArray[np.floating]

The embedding vectors, with shape (n_samples, n_features).

cluster_labels : NDArray[np.int64] | _Clusters

Cluster labels returned by a clustering algorithm (-1 marks outliers), or an internal _Clusters object.

Returns:

Pre-calculated cluster statistics. The arrays are empty if no valid clusters are found.

Return type:

ClusterStats
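
Example (illustrative sketch):

The following NumPy sketch illustrates the kind of computation described above: one center per cluster, plus the mean and standard deviation of member-to-center distances, with outliers (label -1) ignored. It is not dataeval's implementation. The function name `sketch_cluster_stats`, the dictionary keys, and the choice of Euclidean distance are all assumptions made for illustration; the actual fields of ClusterStats may differ.

```python
import numpy as np


def sketch_cluster_stats(embeddings, cluster_labels):
    """Hypothetical illustration only; not the dataeval implementation."""
    embeddings = np.asarray(embeddings, dtype=float)
    cluster_labels = np.asarray(cluster_labels)
    n_features = embeddings.shape[1]

    # Label -1 marks outliers, so only non-negative labels count as clusters.
    cluster_ids = np.unique(cluster_labels[cluster_labels >= 0])

    # No valid clusters: return empty arrays (key names are assumed).
    if cluster_ids.size == 0:
        return {
            "cluster_ids": cluster_ids,
            "centers": np.empty((0, n_features)),
            "distance_mean": np.empty(0),
            "distance_std": np.empty(0),
        }

    centers, means, stds = [], [], []
    for cid in cluster_ids:
        members = embeddings[cluster_labels == cid]
        center = members.mean(axis=0)
        # Euclidean distance from each member to its cluster center (assumed metric).
        dists = np.linalg.norm(members - center, axis=1)
        centers.append(center)
        means.append(dists.mean())
        stds.append(dists.std())

    return {
        "cluster_ids": cluster_ids,
        "centers": np.vstack(centers),
        "distance_mean": np.array(means),
        "distance_std": np.array(stds),
    }


# Two small clusters and one outlier (label -1).
embeddings = np.array(
    [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0], [50.0, 50.0]]
)
labels = np.array([0, 0, 1, 1, -1], dtype=np.int64)

stats = sketch_cluster_stats(embeddings, labels)
empty = sketch_cluster_stats(embeddings, np.full(5, -1, dtype=np.int64))
```

Per-cluster means and standard deviations of this kind let an outlier threshold adapt to how tightly each cluster is packed, rather than applying one global distance cutoff.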