dataeval.outputs.ClustererOutput

class dataeval.outputs.ClustererOutput

Output class for clusterer().

clusters

The cluster label assigned to each data point

Type:

NDArray[int]

mst

The minimum spanning tree of the data

Type:

NDArray[int]

linkage_tree

The linkage array of the data

Type:

NDArray[float]

condensed_tree

The condensed tree of the data

Type:

NDArray[float]

membership_strengths

The strength with which each data point belongs to its assigned cluster

Type:

NDArray[float]
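
As a rough illustration of how clusters and membership_strengths can be read together, the sketch below groups sample indices by cluster label and averages the strength within each cluster. It assumes the two arrays are aligned by sample index. Plain lists with made-up values stand in for a real ClustererOutput, and the helper name summarize_clusters is hypothetical, not part of dataeval.

```python
from collections import defaultdict


def summarize_clusters(clusters, membership_strengths):
    """Group sample indices by cluster label and average their strengths.

    Hypothetical helper; assumes both sequences are indexed by sample.
    """
    members = defaultdict(list)
    for index, label in enumerate(clusters):
        members[label].append(index)

    summary = {}
    for label, indices in members.items():
        mean_strength = sum(membership_strengths[i] for i in indices) / len(indices)
        summary[label] = (indices, mean_strength)
    return summary


# Made-up stand-ins for output.clusters and output.membership_strengths.
clusters = [0, 0, 1, 1, 1]
membership_strengths = [1.0, 0.5, 0.25, 0.75, 0.5]

summary = summarize_clusters(clusters, membership_strengths)
for label, (indices, mean_strength) in summary.items():
    print(label, indices, mean_strength)
```

A low average strength for a cluster may suggest that its members are only loosely attached, which can be worth inspecting before relying on the assignment.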

data()

The output data as a dictionary.

Return type:

dict[str, Any]

find_duplicates()

Finds duplicates and near duplicates based on cluster average distance.

Returns:

The exact duplicates and near duplicates as lists of related indices

Return type:

Tuple[List[List[int]], List[List[int]]]
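
One possible way to act on the returned groups is sketched below: keep a single representative per group and collect the remaining indices for removal. The group values are made up to match the documented return shape, and indices_to_drop is a hypothetical helper, not part of dataeval. Keeping the lowest index of each group is an arbitrary choice made for this example.

```python
def indices_to_drop(groups):
    """Keep the lowest index of each group and return the rest, sorted.

    Hypothetical helper operating on a list of lists of related indices.
    """
    drop = set()
    for group in groups:
        # Everything after the smallest index is treated as redundant.
        drop.update(sorted(group)[1:])
    return sorted(drop)


# Made-up stand-in for: exact, near = output.find_duplicates()
exact = [[0, 4], [2, 7, 9]]
near = [[1, 3]]

redundant_exact = indices_to_drop(exact)
redundant_near = indices_to_drop(near)
print(redundant_exact, redundant_near)
```

Near duplicates are often worth reviewing by hand before removal, since they are related rather than identical samples.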

find_outliers()

Retrieves outliers based on when each sample was added to its cluster and how far it was from the cluster at that time.

Returns:

A NumPy array of the outlier indices

Return type:

NDArray[int]
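
The returned indices can be used to filter a dataset. The sketch below assumes the indices refer to positions in the same data that was clustered. A plain list stands in for the NumPy array, and drop_indices is a hypothetical helper, not part of dataeval.

```python
def drop_indices(samples, outlier_indices):
    """Return the samples whose positions are not listed as outliers.

    Hypothetical helper; accepts any iterable of integer-like indices.
    """
    excluded = {int(i) for i in outlier_indices}
    return [sample for i, sample in enumerate(samples) if i not in excluded]


# Made-up stand-ins for the clustered data and output.find_outliers().
samples = ["a", "b", "c", "d", "e"]
outlier_indices = [1, 4]

kept = drop_indices(samples, outlier_indices)
print(kept)
```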

meta()

Metadata about the execution of the function or method for the Output class.

Return type:

ExecutionMetadata