dataeval.utils.dataset.datasets.MNIST
- class dataeval.utils.dataset.datasets.MNIST(root, train=True, download=False, size=-1, unit_interval=False, dtype=None, channels=None, flatten=False, normalize=None, corruption=None, classes=None, balance=True, randomize=True, slice_back=False, verbose=True)
MNIST Dataset and Corruptions.
- Parameters:
- root : str | pathlib.Path
str | pathlib.Path. Root directory of the dataset, where the mnist_c/ folder exists.
- train : bool
bool, default True. If True, creates the dataset from train_images.npy and train_labels.npy.
- download : bool
bool, default False. If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
- size : int
int, default -1. Limit the dataset size. Must be greater than 0 to take effect.
- unit_interval : bool
bool, default False. Shift the data values to the unit interval [0, 1].
- dtype : type | None
type | None, default None. Change the NumPy dtype. Data is loaded as np.uint8.
- channels : Literal['channels_first', 'channels_last'] | None
Literal['channels_first', 'channels_last'] | None, default None. Location of the channel axis, if desired. By default the data has no channel axis: (N, 28, 28).
- flatten : bool
bool, default False. Flatten the data into a single dimension, (N, 784). channels and flatten cannot both be used; channels takes priority over flatten.
- normalize : tuple[float, float] | None
tuple[mean, std] | None, default None. Normalize images according to the provided mean and standard deviation.
- corruption : CorruptionStringMap | None
Literal['identity', 'shot_noise', 'impulse_noise', 'glass_blur', 'motion_blur', 'shear', 'scale', 'rotate', 'brightness', 'translate', 'stripe', 'fog', 'spatter', 'dotted_line', 'zigzag', 'canny_edges'] | None, default None. The desired corruption style or None.
- classes : TClassMap | None
Literal["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"] | int | list[int] | list[Literal["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]] | None, default None. Option to select specific classes from the dataset.
- balance : bool
bool, default True. If True, returns an equal number of samples for each class.
- randomize : bool
bool, default True. If True, shuffles the data prior to selection; uses a set seed for reproducibility.
- slice_back : bool
bool, default False. If True and size is greater than 0, takes the selection starting at the last image.
- verbose : bool
bool, default True. If True, outputs print statements.
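
The array-shaping options above (unit_interval, dtype, channels, flatten, normalize) can be illustrated with plain NumPy. The following is only a sketch under stated assumptions, not dataeval's implementation: the helper name `preprocess` is hypothetical, scaling by 255 for the unit interval and the order in which the steps are applied are guesses, and the input is synthetic data standing in for the (N, 28, 28) uint8 images.

```python
import numpy as np


def preprocess(images, unit_interval=False, dtype=None, channels=None,
               flatten=False, normalize=None):
    """Hypothetical stand-in for the array options; not dataeval's code."""
    data = images  # assumed to be a uint8 array of shape (N, 28, 28)

    if unit_interval:
        # Assumption: uint8 pixels are divided by 255 to land in [0, 1].
        data = data.astype(np.float32) / 255.0

    if normalize is not None:
        mean, std = normalize
        # Assumption: the usual (x - mean) / std normalization.
        data = (data.astype(np.float32) - mean) / std

    if dtype is not None:
        data = data.astype(dtype)

    if channels == "channels_first":
        data = data[:, np.newaxis, :, :]        # (N, 1, 28, 28)
    elif channels == "channels_last":
        data = data[..., np.newaxis]            # (N, 28, 28, 1)
    elif flatten:
        # Only reached when channels is None, so channels takes priority.
        data = data.reshape(data.shape[0], -1)  # (N, 784)

    return data


# Synthetic stand-in for MNIST images so the sketch runs without a download.
rng = np.random.default_rng(0)
fake = rng.integers(0, 256, size=(8, 28, 28), dtype=np.uint8)

scaled = preprocess(fake, unit_interval=True)
first = preprocess(fake, channels="channels_first", flatten=True)
flat = preprocess(fake, flatten=True)
print(scaled.dtype, first.shape, flat.shape)
```

When both channels and flatten are passed, the sketch keeps the channel layout and ignores flatten, mirroring the documented priority. With the class itself, the corresponding request would presumably be expressed through the constructor arguments listed above, for example unit_interval=True together with channels='channels_first'.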