dataeval.protocols.Chunker
- class dataeval.protocols.Chunker
Protocol for chunking datasets into subsets by returning index arrays.
Implementations must provide a __call__ method that takes the number of samples and returns a list of index arrays representing the chunks.
Examples
Creating a simple chunker that splits the dataset into equal (or, when the sample count is not evenly divisible, near-equal) parts:
>>> import numpy as np
>>> from numpy.typing import NDArray
>>> from dataeval.protocols import Chunker
>>>
>>> class EqualChunker:
...     def __init__(self, n_chunks: int):
...         self.n_chunks = n_chunks
...
...     def __call__(self, n: int) -> list[NDArray[np.intp]]:
...         return [idx.astype(np.intp) for idx in np.array_split(np.arange(n), self.n_chunks)]
>>>
>>> chunker = EqualChunker(n_chunks=5)
>>> isinstance(chunker, Chunker)
True
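To illustrate what a chunker returns when called, here is a minimal sketch of a different chunking strategy: fixed-size chunks rather than a fixed number of chunks. The class name `SizeChunker` and its `chunk_size` parameter are hypothetical and not part of dataeval; the sketch assumes only the contract described above, namely a `__call__` that takes the number of samples and returns a list of index arrays.

```python
import numpy as np
from numpy.typing import NDArray


class SizeChunker:
    """Hypothetical chunker: consecutive chunks of at most `chunk_size` samples."""

    def __init__(self, chunk_size: int) -> None:
        if chunk_size < 1:
            raise ValueError("chunk_size must be at least 1")
        self.chunk_size = chunk_size

    def __call__(self, n: int) -> list[NDArray[np.intp]]:
        indices = np.arange(n, dtype=np.intp)
        # Slice the index range into consecutive windows; the last one may be shorter.
        return [
            indices[start : start + self.chunk_size]
            for start in range(0, n, self.chunk_size)
        ]


# Chunk a 10-sample dataset into windows of 4 indices.
chunks = SizeChunker(chunk_size=4)(10)
sizes = [len(chunk) for chunk in chunks]
```

Each returned array holds positions into the dataset, so the chunks together cover every index from 0 to n - 1 exactly once, and only the final chunk can be smaller than `chunk_size`.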