dataeval.core.compute_neighbors
dataeval.core.compute_neighbors(data_fit, data_query=None, k=1, algorithm='auto')
For each sample in data_query, compute the k nearest neighbors in data_fit.
- Parameters:
- data_fit : ArrayND[float]
Reference points to search, with shape (n_samples_fit, n_features). Can be an N-dimensional list or array-like object. This is the dataset that is indexed for the neighbor search.
- data_query : ArrayND[float]
Query points, with shape (n_samples_query, n_features). Can be an N-dimensional list or array-like object. For each of these points, the k nearest neighbors in data_fit are found.
- k : int, default=1
The number of neighbors to find.
- algorithm : {"auto", "ball_tree", "kd_tree"}, default="auto"
Tree method used for the nearest neighbor computation.
- Returns:
Indices of the k nearest neighbors in data_fit for each point in data_query. Shape is (n_samples_query,) if k=1, otherwise (n_samples_query, k).
- Return type:
NDArray[np.int64]
- Raises:
ValueError – If k < 1 or if algorithm is not “auto”, “ball_tree”, or “kd_tree”.
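The return-shape and error rules above can be illustrated with a small brute-force sketch in NumPy. This is not the dataeval implementation: `brute_force_neighbors` is a hypothetical helper, it ignores the `algorithm` parameter, and it assumes Euclidean distance, which this page does not state. It also assumes k does not exceed n_samples_fit.

```python
import numpy as np


def brute_force_neighbors(data_fit, data_query, k=1):
    """Hypothetical reference helper (not part of dataeval).

    Assumes Euclidean distance and k <= n_samples_fit.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    fit = np.asarray(data_fit, dtype=float)
    query = np.asarray(data_query, dtype=float)
    # Pairwise squared distances, shape (n_samples_query, n_samples_fit).
    diff = query[:, None, :] - fit[None, :, :]
    dist_sq = np.einsum("qfd,qfd->qf", diff, diff)
    # Sort each row and keep the indices of the k closest reference points.
    idx = np.argsort(dist_sq, axis=1, kind="stable")[:, :k].astype(np.int64)
    # Mirror the documented return shape: 1-D when k == 1, 2-D otherwise.
    return idx[:, 0] if k == 1 else idx


rng = np.random.default_rng(0)
reference_data = rng.random((100, 5))  # 100 reference points
query_data = rng.random((10, 5))       # 10 query points

single = brute_force_neighbors(reference_data, query_data, k=1)
triple = brute_force_neighbors(reference_data, query_data, k=3)
```

Here `single` is one index per query point, while `triple` holds one row of three indices per query point, ordered from nearest to farthest.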
Notes
Do not use kd_tree if n_features > 20. KD-tree search loses most of its advantage over brute-force search as dimensionality grows.
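One way to apply this note is to choose the `algorithm` argument from the feature count before calling the function. The helper below is a minimal sketch: `choose_tree_algorithm` and its `kd_tree_max_features` parameter are hypothetical names, and falling back to "ball_tree" rather than "auto" is an assumption made for illustration.

```python
def choose_tree_algorithm(n_features, kd_tree_max_features=20):
    """Hypothetical helper (not part of dataeval).

    Returns "kd_tree" up to the feature-count limit from the note above,
    and "ball_tree" beyond it (an illustrative choice, not a library rule).
    """
    if n_features < 1:
        raise ValueError("n_features must be >= 1")
    if n_features <= kd_tree_max_features:
        return "kd_tree"
    return "ball_tree"


low_dim_choice = choose_tree_algorithm(5)     # 5 features: within the limit
high_dim_choice = choose_tree_algorithm(128)  # 128 features: above the limit
```

The returned string can then be passed as `algorithm=` to compute_neighbors, for example with `n_features` taken from the second dimension of data_fit.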
Examples
>>> import numpy as np
>>> from dataeval.core import compute_neighbors
>>> reference_data = np.random.rand(100, 5)  # 100 reference points
>>> query_data = np.random.rand(10, 5)  # 10 query points
>>> neighbors = compute_neighbors(reference_data, query_data, k=3)
>>> neighbors.shape
(10, 3)
See also
- sklearn.neighbors.NearestNeighbors
Similar sklearn interface
- compute_neighbor_distances
For self-query (single dataset)