dataeval.workflows.SufficiencyOutput

class dataeval.workflows.SufficiencyOutput

Output class for Sufficiency workflow.

steps

Array of sample sizes

Type:

NDArray

measures

3D array [runs, substep, classes] of values for all runs observed for each sample size step for each measure

Type:

dict[str, NDArray]

averaged_measures

Average of values for all runs observed for each sample size step for each measure

Type:

dict[str, NDArray]
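As a rough illustration of how measures and averaged_measures relate, the sketch below averages a small made-up measure over its runs. It assumes a single-value metric stored as [runs][substep] nested lists standing in for the NDArray. The values and the `average_over_runs` helper are hypothetical and are not part of dataeval.

```python
from statistics import fmean


def average_over_runs(values):
    """Average a [runs][substep] nested list over the runs axis."""
    # zip(*values) regroups the data so each tuple holds one substep across all runs.
    return [fmean(step_values) for step_values in zip(*values)]


# Hypothetical accuracy values from 2 runs at 3 sample-size steps.
measures = {"accuracy": [[0.50, 0.70, 0.80], [0.60, 0.70, 0.90]]}

# One averaged value per sample-size step, keyed by measure name.
averaged_measures = {name: average_over_runs(vals) for name, vals in measures.items()}
print(averaged_measures)
```

The result keeps one entry per sample size step, so it lines up with steps.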

n_iter

Number of iterations to perform in the basin-hopping curve-fit process

Type:

int, default 1000

unit_interval

Constrains the power law to the interval [0, 1]. Set to True (default) for metrics such as accuracy, precision, and recall, which take values on [0, 1]. Set to False for metrics that are not on the unit interval.

Type:

bool, default True

data()

The output data as a dictionary.

Return type:

dict[str, Any]

inv_project(targets, n_iter=None)

Calculate training samples needed to achieve target model metric values.

Parameters:
targets : Mapping[str, ArrayLike]

Mapping of target metric scores (from 0.0 to 1.0) that we want to achieve, where the key is the name of the metric.

n_iter : int or None, default None

Number of iterations to use when calculating the inverse power curve. If None, defaults to 1000.

Returns:

Mapping from each metric name to the number of training samples needed to achieve each corresponding entry in targets

Return type:

Mapping[str, NDArray]
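To make the inversion concrete, here is a minimal sketch. It assumes the fitted curve has the three-parameter power-law form error(n) = a * n**(-b) + c, where error = 1 - metric. That form, the coefficient values, and the `samples_needed` helper are illustrative assumptions and are not taken from dataeval's implementation.

```python
import math


def samples_needed(target, a, b, c):
    """Invert error(n) = a * n**(-b) + c for a target metric score in [0, 1]."""
    target_error = 1.0 - target
    if target_error <= c:
        # The curve only approaches its asymptote c, so this target is unreachable.
        return math.inf
    return ((target_error - c) / a) ** (-1.0 / b)


# Hypothetical fitted coefficients (assumed, not produced by a real fit).
a, b, c = 1.0, 0.5, 0.05

n = samples_needed(0.85, a, b, c)
print(math.ceil(n))
```

Under this assumed form, any target above 1 - c can never be reached, which is why the sketch returns infinity for such targets instead of raising.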

meta()

Metadata about the execution of the function or method for the Output class.

Return type:

ExecutionMetadata

plot(class_names=None, error_bars=True, asymptote=True)

Plotting function for data sufficiency tasks.

Parameters:
class_names : Sequence[str] | None, default None

List of class names

error_bars : bool, default True

True if error bars should be plotted, False if not

asymptote : bool, default True

True if asymptote should be plotted, False if not

Returns:

List of Figures for each measure

Return type:

Sequence[Figure]

Raises:

ValueError – If the length of data points in the measures does not match

Notes

This method requires matplotlib to be installed.

project(projection)

Projects the measures for each step.

Parameters:
projection : int | Iterable[int]

Step or steps to project

Returns:

Dataclass containing the projected measures per projection

Return type:

SufficiencyOutput

Raises:

ValueError – If the length of data points in the measures does not match, or if projection is not numerical
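The idea behind projecting a measure to new sample sizes can be sketched as evaluating a fitted curve at those sizes. The example assumes a power law of the form metric(n) = 1 - (a * n**(-b) + c). The functional form, the coefficients, and the `projected_metric` helper are hypothetical stand-ins, not dataeval's actual code.

```python
def projected_metric(n_samples, a, b, c):
    """Evaluate metric(n) = 1 - (a * n**(-b) + c) at a given sample count."""
    return 1.0 - (a * n_samples ** (-b) + c)


# Hypothetical fitted coefficients (assumed, not produced by a real fit).
a, b, c = 1.0, 0.5, 0.05

# Sample sizes to project to, analogous to the projection argument.
projection = (100, 400, 10_000)
projected = [projected_metric(n, a, b, c) for n in projection]
print(projected)
```

With these assumed coefficients the projected metric rises with the sample size and levels off toward 1 - c.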