dataeval.utils.data.datasets.MNIST

class dataeval.utils.data.datasets.MNIST(root, train=True, download=False, size=-1, unit_interval=False, dtype=None, channels=None, flatten=False, normalize=None, corruption=None, classes=None, balance=True, randomize=True, slice_back=False, verbose=True)

MNIST Dataset and Corruptions.

Parameters:
root : str or pathlib.Path

Root directory of dataset where the mnist_c/ folder exists.

train : bool, default True

If True, creates dataset from train_images.npy and train_labels.npy.

download : bool, default False

If True, downloads the dataset from the internet and places it in the root directory. If the dataset is already downloaded, it is not downloaded again.

size : int, default -1

Limit the dataset size. The value must be greater than 0 to take effect.

unit_interval : bool, default False

Shift the data values to the unit interval [0, 1].

dtype : type | None, default None

Change the NumPy dtype. Data is loaded as np.uint8.

channels : "channels_first", "channels_last" or None, default None

Location of the channel axis, if desired. By default the data has no channel axis: (N, 28, 28).

flatten : bool, default False

Flatten each image into a single dimension, giving shape (N, 784). channels and flatten cannot be used together; channels takes priority over flatten.

normalize : tuple[mean, std] or None, default None

Normalize images according to the provided mean and standard deviation.

corruption : "identity", "shot_noise", "impulse_noise", "glass_blur", "motion_blur", "shear", "scale", "rotate", "brightness", "translate", "stripe", "fog", "spatter", "dotted_line", "zigzag", "canny_edges" or None, default None

The desired corruption style or None.

classes : "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", int, list, or None, default None

Option to select specific classes from dataset.

balance : bool, default True

If True, returns an equal number of samples for each class.

randomize : bool, default True

If True, shuffles the data prior to selection, using a fixed seed for reproducibility.

slice_back : bool, default False

If True and size is greater than 0, the selection starts at the last image.

verbose : bool, default True

If True, prints status messages.
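
The array shapes and value ranges described for unit_interval, normalize, channels and flatten can be illustrated with plain NumPy. This is a minimal sketch on random stand-in data, not the class's actual implementation. It assumes that unit_interval divides the 8-bit values by 255 and that the channels options add a length-1 axis; the mean and standard deviation shown are the commonly quoted MNIST statistics and serve only as example inputs.

```python
import numpy as np

# Stand-in for the raw data: 4 grayscale images, shape (N, 28, 28), dtype uint8.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# unit_interval=True (assumption: 8-bit values are divided by 255).
unit = images.astype(np.float32) / 255.0

# normalize=(mean, std): subtract the mean, then divide by the standard deviation.
mean, std = 0.1307, 0.3081  # example values only
normalized = (unit - mean) / std

# channels="channels_first" or "channels_last" (assumption: a length-1 axis is added).
channels_first = images[:, np.newaxis, :, :]  # shape (N, 1, 28, 28)
channels_last = images[..., np.newaxis]       # shape (N, 28, 28, 1)

# flatten=True: one row of 784 values per image.
flat = images.reshape(len(images), -1)        # shape (N, 784)
```

Because channels and flatten produce incompatible shapes, only one of them can apply to a given array, which is consistent with channels taking priority when both are set.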