dataeval.protocols.EvaluationSchedule

class dataeval.protocols.EvaluationSchedule

Protocol for determining evaluation points in sufficiency analysis.

Implementations determine at which dataset sizes to train and evaluate the model during sufficiency analysis.

Examples

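Custom schedule evaluating at a single sample, half of the dataset, and the full dataset:

>>> import numpy as np
>>> class MidpointSchedule:
...     def get_steps(self, dataset_length: int) -> np.typing.NDArray[np.intp]:
...         steps = [1, max(1, dataset_length // 2), dataset_length]
...         # np.unique sorts the sizes and drops duplicates for very small datasets
...         return np.unique(np.array(steps, dtype=np.intp))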
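
Because the protocol only asks for a `get_steps` method, other spacing strategies can be plugged in the same way. The sketch below spaces the evaluation sizes geometrically, so that small subsets are sampled more densely than large ones. `LogSpacedSchedule` and its `num_steps` parameter are hypothetical names introduced for illustration; they are not part of dataeval.

```python
import numpy as np
from numpy.typing import NDArray


class LogSpacedSchedule:
    """Hypothetical schedule, not part of dataeval; shown for illustration only."""

    def __init__(self, num_steps: int = 5) -> None:
        if num_steps < 1:
            raise ValueError("num_steps must be at least 1")
        self.num_steps = num_steps

    def get_steps(self, dataset_length: int) -> NDArray[np.intp]:
        if dataset_length < 1:
            raise ValueError("dataset_length must be at least 1")
        # Geometrically spaced sizes from 1 up to dataset_length (inclusive).
        raw = np.geomspace(1, dataset_length, num=self.num_steps)
        # Round to integers and clamp into [1, dataset_length].
        sizes = np.clip(np.round(raw), 1, dataset_length).astype(np.intp)
        # np.unique returns the sizes sorted with duplicates removed.
        return np.unique(sizes)


steps = LogSpacedSchedule(num_steps=4).get_steps(1000)
```

Rounding can map neighbouring points to the same integer when the dataset is small, which is why the result is de-duplicated; the schedule may therefore return fewer than `num_steps` sizes.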

get_steps(dataset_length)

Compute the evaluation points for a given dataset length.

Parameters:

dataset_length : int

Total length of the training dataset.

Returns:

Array of dataset sizes at which to evaluate. Values must be monotonically increasing and within [1, dataset_length].

Return type:

NDArray[np.intp]
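
The constraints on the return value can be checked before a custom schedule is used. The following is a minimal sketch of such a check, assuming that "monotonically increasing" means strictly increasing and that the array is non-empty and one-dimensional. `check_steps` is a hypothetical helper written for this example, not a dataeval function.

```python
import numpy as np
from numpy.typing import NDArray


def check_steps(steps: NDArray[np.intp], dataset_length: int) -> None:
    """Hypothetical helper: raise ValueError if `steps` breaks the documented contract."""
    steps = np.asarray(steps)
    # Assumption: a usable schedule is a non-empty 1-D integer array.
    if steps.ndim != 1 or steps.size == 0:
        raise ValueError("steps must be a non-empty 1-D array")
    if not np.issubdtype(steps.dtype, np.integer):
        raise ValueError("steps must have an integer dtype")
    # Documented range: every size lies within [1, dataset_length].
    if steps.min() < 1 or steps.max() > dataset_length:
        raise ValueError(f"steps must lie within [1, {dataset_length}]")
    # Assumption: "monotonically increasing" is read as strictly increasing.
    if np.any(np.diff(steps) <= 0):
        raise ValueError("steps must be strictly increasing")


# A valid schedule passes silently.
check_steps(np.array([1, 50, 100], dtype=np.intp), dataset_length=100)
```

A schedule that includes 0 or a size larger than `dataset_length` fails the range check, and repeated or unsorted sizes fail the ordering check.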