dataeval.utils.datasets.MNIST

class dataeval.utils.datasets.MNIST(root, image_set='train', corruption=None, transforms=None, download=False, verbose=False)

MNIST Dataset and Corruptions.

There are 15 different styles of corruptions. How much this class downloads depends on whether you need only the original dataset or the corruptions as well. If you need both a corrupted version and the original, choose corruption="identity": this downloads all of the corrupted datasets and provides the original as identity. If you need only the original, corruption=None downloads just the original dataset, which saves time and space.
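
The two download modes described above can be sketched as follows. This is a minimal illustration that uses only the constructor arguments documented on this page. The helper name `load_mnist` is hypothetical, and the import is deferred so that nothing is imported or downloaded until the helper is called.

```python
from pathlib import Path

# Original dataset only: the smaller download.
ORIGINAL_ONLY = {"image_set": "train", "corruption": None, "download": True}

# Original plus corruptions: "identity" triggers the full corruption download
# and serves the uncorrupted images.
ORIGINAL_WITH_CORRUPTIONS = {
    "image_set": "train",
    "corruption": "identity",
    "download": True,
}


def load_mnist(root, **kwargs):
    """Hypothetical helper: build the dataset from documented keyword arguments."""
    # Deferred import so this module can be loaded without dataeval installed.
    from dataeval.utils.datasets import MNIST

    return MNIST(root=Path(root), **kwargs)
```

Calling `load_mnist("./data", **ORIGINAL_ONLY)` would then fetch only the original training images, assuming dataeval is installed and network access is available. The `./data` path is a placeholder.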

Parameters:
root : str or pathlib.Path

Root directory where the data should be downloaded to, or the mnist folder of the already downloaded data.

image_set : "train", "test" or "base", default "train"

If "base", returns all of the data to allow the user to create their own splits.
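
As a rough sketch of creating custom splits from the "base" set, the function below shuffles item indices and cuts them into two lists. The function name, the default 80/20 ratio, and the seed are assumptions for illustration and are not part of this class.

```python
import random


def split_indices(n_items, train_fraction=0.8, seed=0):
    """Shuffle range(n_items) and cut it into train and validation index lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    indices = list(range(n_items))
    # A dedicated, seeded generator keeps the split reproducible.
    random.Random(seed).shuffle(indices)
    cut = int(n_items * train_fraction)
    return indices[:cut], indices[cut:]
```

With a dataset constructed using image_set="base", `n_items` would be its `size` attribute. How the resulting index lists are used to subset the data depends on the training framework.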

corruption : "identity", "shot_noise", "impulse_noise", "glass_blur", "motion_blur", "shear", "scale", "rotate", "brightness", "translate", "stripe", "fog", "spatter", "dotted_line", "zigzag", "canny_edges" or None, default None

Corruption to apply to the data.
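
Because the corruption is selected by name, it can help to validate the string before constructing the dataset. The sketch below copies the names from the parameter list above. The `check_corruption` helper is hypothetical and not part of the library.

```python
# Names copied from the corruption parameter documented above.
CORRUPTIONS = (
    "identity", "shot_noise", "impulse_noise", "glass_blur", "motion_blur",
    "shear", "scale", "rotate", "brightness", "translate", "stripe", "fog",
    "spatter", "dotted_line", "zigzag", "canny_edges",
)


def check_corruption(name):
    """Return name unchanged if it is None or a documented corruption, else raise."""
    if name is None or name in CORRUPTIONS:
        return name
    raise ValueError(
        f"unknown corruption {name!r}; expected None or one of {CORRUPTIONS}"
    )
```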

transforms : Transform, Sequence[Transform] or None, default None

Transform(s) to apply to the data.

download : bool, default False

If True, downloads the dataset from the internet and places it in the root directory. The class first checks whether the data is already downloaded so that it does not download a duplicate copy.

verbose : bool, default False

If True, outputs print statements.

path

Location of the folder containing the data.

Type:

pathlib.Path

image_set

The selected image set from the dataset.

Type:

"train", "test" or "base"

index2label

Dictionary which translates from class integers to the associated class strings.

Type:

dict[int, str]

label2index

Dictionary which translates from class strings to the associated class integers.

Type:

dict[str, int]
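
The relationship between index2label and label2index can be illustrated with plain dictionaries. The class strings below are placeholders chosen for the example; the actual strings are whatever the dataset instance provides. The `decode` helper is likewise hypothetical.

```python
# Illustrative mapping only; the real class strings come from the dataset.
index2label = {0: "zero", 1: "one", 2: "two"}

# label2index is the inverse mapping.
label2index = {label: index for index, label in index2label.items()}


def decode(predictions, mapping):
    """Translate a sequence of class integers into class strings."""
    return [mapping[i] for i in predictions]
```

For example, `decode([2, 0], index2label)` translates the integer predictions into their class strings using the placeholder mapping.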

metadata

Typed dictionary containing dataset metadata, such as id, which holds the dataset class name.

Type:

DatasetMetadata

corruption

Corruption applied to the data.

Type:

str or None

transforms

The transforms to be applied to the data.

Type:

Sequence[Transform]

size

The size of the dataset.

Type:

int

Note

Data License: CC BY 4.0 for corruption dataset