dataeval.core.compute_cluster_stats

dataeval.core.compute_cluster_stats(embeddings, clusters)

Compute cluster centers and distance statistics for adaptive outlier detection.

Parameters:
embeddings : NDArray[np.floating]

The embedding vectors, with shape (n_samples, n_features).

clusters : NDArray[np.int64]

Cluster labels from HDBSCAN. A label of -1 marks a sample that HDBSCAN classified as an outlier.

Returns:

Pre-calculated cluster statistics. The arrays are empty if no valid clusters are found.

Return type:

ClusterStats
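
The following is a minimal NumPy sketch of the general idea described above: one center per cluster plus simple distance statistics, with HDBSCAN outliers (label -1) excluded. It is not the dataeval implementation. The function name `sketch_cluster_stats`, the dictionary keys, the choice of Euclidean distance, and the choice of mean and standard deviation as the statistics are all assumptions made for illustration, and the fields of the real `ClusterStats` return type may differ.

```python
import numpy as np


def sketch_cluster_stats(embeddings, clusters):
    """Illustrative only: per-cluster centers and distance statistics.

    Hypothetical helper, not part of dataeval. The returned keys are
    assumptions and do not mirror the real ClusterStats type.
    """
    embeddings = np.asarray(embeddings, dtype=np.float64)
    clusters = np.asarray(clusters, dtype=np.int64)
    n_features = embeddings.shape[1]

    # Keep only non-negative labels; -1 marks HDBSCAN outliers.
    cluster_ids = np.unique(clusters[clusters >= 0])

    # Arrays are allocated per cluster, so they are empty (length 0)
    # when no valid clusters exist.
    centers = np.empty((cluster_ids.size, n_features))
    mean_distances = np.empty(cluster_ids.size)
    std_distances = np.empty(cluster_ids.size)

    for i, cid in enumerate(cluster_ids):
        members = embeddings[clusters == cid]
        # Cluster center: mean of the member embeddings (assumption).
        centers[i] = members.mean(axis=0)
        # Euclidean distance of each member to its center (assumption).
        dists = np.linalg.norm(members - centers[i], axis=1)
        mean_distances[i] = dists.mean()
        std_distances[i] = dists.std()

    return {
        "cluster_ids": cluster_ids,
        "centers": centers,
        "mean_distances": mean_distances,
        "std_distances": std_distances,
    }


# Two small clusters and one HDBSCAN outlier (label -1).
embeddings = np.array(
    [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0], [50.0, 50.0]]
)
clusters = np.array([0, 0, 1, 1, -1])
stats = sketch_cluster_stats(embeddings, clusters)
print(stats["centers"])
print(stats["mean_distances"])
```

Statistics of this kind allow an outlier threshold to adapt to each cluster, for example by flagging a sample whose distance to its nearest center exceeds that cluster's mean distance by some multiple of its standard deviation. Whether dataeval applies exactly this rule is not stated in this page.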