dataeval.core.cluster

dataeval.core.cluster(data)

Uses hierarchical clustering on the flattened data and returns clustering information.

Parameters:
data : ArrayLike, shape - (N, ...)

A dataset in an ArrayLike format. The function expects the data to have 2 or more dimensions, which are flattened to (N, P), where N is the number of observations in a P-dimensional space.
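As a rough illustration of the flattening described above, the sketch below reshapes a batch of images to (N, P) with NumPy. The array `images` and its shape are hypothetical stand-ins, and this is not necessarily how the library performs the flattening internally.

```python
import numpy as np

# Hypothetical batch: 50 images, 3 channels, 16 x 16 pixels.
rng = np.random.default_rng(0)
images = rng.random((50, 3, 16, 16))

# Keep the first axis (N observations) and collapse the rest into P features.
flattened = images.reshape(images.shape[0], -1)

n_observations, n_features = flattened.shape
print(n_observations, n_features)  # 50 768
```

Here P = 3 * 16 * 16 = 768, so each image becomes one row of 768 features.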

Return type:

ClusterData

Notes

The cluster function works best when the length of the feature dimension, P, is less than 500. If flattening a CxHxW image results in more than 500 features, it is recommended to reduce the dimensionality of the data before clustering.
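One possible way to bring P below 500 is a principal component projection. The sketch below uses plain NumPy; the helper `reduce_features`, its `n_components` default, and the random `images` array are all hypothetical, and PCA is only one of several reasonable reduction techniques, not one prescribed by this documentation.

```python
import numpy as np


def reduce_features(data, n_components=64):
    """Project flattened data onto its top principal components (PCA via SVD).

    Hypothetical helper for illustration; not part of dataeval.
    """
    flat = np.asarray(data, dtype=np.float64).reshape(len(data), -1)
    centered = flat - flat.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, vt.shape[0])
    return centered @ vt[:k].T


rng = np.random.default_rng(0)
images = rng.random((50, 3, 16, 16))  # flattens to P = 768 features
reduced = reduce_features(images, n_components=32)
print(reduced.shape)  # (50, 32)
```

The reduced array has shape (N, 32) and could then be passed to `cluster` in place of the raw images.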

Example

>>> cluster(clusterer_images).clusters
array([ 2,  0,  0,  0,  0,  0,  4,  0,  3,  1,  1,  0,  2,  0,  0,  0,  0,
        4,  2,  0,  0,  1,  2,  0,  1,  3,  0,  3,  3,  4,  0,  0,  3,  0,
        3, -1,  0,  0,  2,  4,  3,  4,  0,  1,  0, -1,  3,  0,  0,  0])
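To summarize a label array like the one above, the members of each cluster can be counted with NumPy. The sketch below copies the labels from the example output. This page does not state what the label -1 means; reading it as "not assigned to any cluster" is an assumption based on the common convention in density-based clustering libraries.

```python
import numpy as np

# Labels copied from the example output above.
labels = np.array([
    2, 0, 0, 0, 0, 0, 4, 0, 3, 1, 1, 0, 2, 0, 0, 0, 0,
    4, 2, 0, 0, 1, 2, 0, 1, 3, 0, 3, 3, 4, 0, 0, 3, 0,
    3, -1, 0, 0, 2, 4, 3, 4, 0, 1, 0, -1, 3, 0, 0, 0,
])

# Count how many observations carry each label.
unique_labels, counts = np.unique(labels, return_counts=True)
label_counts = dict(zip(unique_labels.tolist(), counts.tolist()))

# Assumption: -1 marks observations that were not assigned to a cluster.
unassigned_indices = np.flatnonzero(labels == -1)

for label, count in label_counts.items():
    print(f"label {label}: {count} observations")
print("indices with label -1:", unassigned_indices.tolist())
```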