dataeval.protocols.EvaluationSchedule

class dataeval.protocols.EvaluationSchedule

Protocol for determining evaluation points in sufficiency analysis.

Implementations determine at which dataset sizes to train and evaluate the model during sufficiency analysis.

Examples

Custom schedule evaluating at roughly 0%, 50%, and 100% of the dataset:

>>> import numpy as np
>>> class MidpointSchedule:
...     def get_steps(self, dataset_length: int) -> np.typing.NDArray[np.intp]:
...         return np.array([1, dataset_length // 2, dataset_length], dtype=np.intp)
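
As a further illustration, a schedule might space its evaluation points geometrically so that small dataset sizes are sampled more densely. The sketch below is not part of DataEval: the class name `GeometricSchedule` and its `num_steps` parameter are hypothetical, and only the `get_steps` method name and signature are taken from the protocol.

```python
import numpy as np
from numpy.typing import NDArray


class GeometricSchedule:
    """Hypothetical schedule: geometrically spaced dataset sizes."""

    def __init__(self, num_steps: int = 5) -> None:
        # Hypothetical parameter; the protocol itself only requires get_steps.
        self.num_steps = num_steps

    def get_steps(self, dataset_length: int) -> NDArray[np.intp]:
        # Spread candidate sizes between 1 and dataset_length on a log scale.
        raw = np.geomspace(1, dataset_length, num=self.num_steps)
        # Round to whole sample counts and keep them inside [1, dataset_length].
        sizes = np.clip(np.round(raw).astype(np.intp), 1, dataset_length)
        # np.unique drops repeated sizes and returns them in ascending order.
        return np.unique(sizes)
```

For small datasets, rounding can make several candidate sizes coincide, so the returned array may contain fewer than `num_steps` entries.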

get_steps(dataset_length)

Calculate evaluation points for given dataset length.

Parameters:
dataset_length : int

Total length of the training dataset.

Returns:

Array of dataset sizes at which to evaluate. Values must be monotonically increasing and lie within [1, dataset_length].

Return type:

NDArray[np.intp]
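
The return contract described above can be checked mechanically before a schedule is used. The helper below is a minimal sketch under stated assumptions: `check_steps` is a hypothetical name rather than a DataEval function, the non-empty 1-D requirement is an assumption, and "monotonically increasing" is read as non-decreasing.

```python
import numpy as np
from numpy.typing import NDArray


def check_steps(steps: NDArray[np.intp], dataset_length: int) -> None:
    """Hypothetical helper: raise ValueError if steps break the stated contract."""
    steps = np.asarray(steps)
    # Assumption: a usable schedule is a non-empty, one-dimensional array.
    if steps.ndim != 1 or steps.size == 0:
        raise ValueError("steps must be a non-empty 1-D array")
    if not np.issubdtype(steps.dtype, np.integer):
        raise ValueError("steps must have an integer dtype")
    # Every evaluation size must lie within [1, dataset_length].
    if steps.min() < 1 or steps.max() > dataset_length:
        raise ValueError(f"steps must lie within [1, {dataset_length}]")
    # Assumption: non-decreasing order is accepted; compare against `<= 0`
    # instead to reject repeated sizes as well.
    if np.any(np.diff(steps) < 0):
        raise ValueError("steps must be monotonically increasing")
```

Under these assumptions, an array such as `[1, 50, 100]` passes for `dataset_length=100`, while an array containing `0` is rejected because it falls outside [1, dataset_length].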