Image Statistics
The basic ImageStats class assists with understanding the dataset.
This class can be used in conjunction with the Linter class to determine
if there are any issues with any of the images in the dataset.
This class can also be used to get a big-picture view of the dataset and its underlying distribution.
The statistics delivered by the class are broken down into three main categories:
statistics covering image properties,
statistics covering the visual aspect of images,
and standard statistics about pixel values.
The statistics calculated in each category are listed below.
Image Properties
height
width
size
aspect ratio
number of channels
pixel value range
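To make the property list concrete, here is a minimal sketch in plain Python. It is not DataEval's implementation: the function name `image_properties`, the channels-first nested-list layout, the definition of size as height times width, and the definition of aspect ratio as width over height are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumed layout: image[channel][row][column], i.e. channels-first nested lists.
Image = List[List[List[float]]]


def image_properties(image: Image) -> Dict[str, float]:
    """Hypothetical helper that derives basic properties of one image."""
    channels = len(image)
    height = len(image[0])
    width = len(image[0][0])
    # Flatten all channels to find the spread of pixel values.
    pixels = [value for channel in image for row in channel for value in row]
    return {
        "height": height,
        "width": width,
        "size": height * width,          # assumption: pixel count per channel
        "aspect_ratio": width / height,  # assumption: width over height
        "channels": channels,
        "pixel_value_range": max(pixels) - min(pixels),
    }


# One channel, two rows, three columns.
example = [[[0, 64, 128], [255, 32, 16]]]
props = image_properties(example)
```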
Image Visuals
image brightness
image blurriness
missing values (NaNs)
number of zero-valued pixels
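The visual checks can be illustrated with a short sketch in plain Python. This page does not say how DataEval defines brightness or blurriness, so the choices here are assumptions: brightness is taken as the mean of the valid pixels, and blurriness is approximated by the variance of a 4-neighbour Laplacian, a common proxy where low values suggest a blurry image. The helper names are hypothetical.

```python
import math
from typing import Dict, List

Channel = List[List[float]]  # a single channel as rows of pixel values


def visual_checks(channel: Channel) -> Dict[str, float]:
    """Hypothetical helper: count NaNs and zeros, and estimate brightness."""
    pixels = [value for row in channel for value in row]
    valid = [value for value in pixels if not math.isnan(value)]
    return {
        "missing": len(pixels) - len(valid),
        "zeros": sum(1 for value in valid if value == 0),
        # Assumption: brightness is the mean of the non-missing pixels.
        "brightness": sum(valid) / len(valid) if valid else math.nan,
    }


def laplacian_variance(channel: Channel) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Requires at least a 3x3 channel. Low values suggest little edge
    content, which is often read as blur.
    """
    height, width = len(channel), len(channel[0])
    responses = [
        channel[r - 1][c] + channel[r + 1][c]
        + channel[r][c - 1] + channel[r][c + 1]
        - 4 * channel[r][c]
        for r in range(1, height - 1)
        for c in range(1, width - 1)
    ]
    mean = sum(responses) / len(responses)
    return sum((x - mean) ** 2 for x in responses) / len(responses)


checks = visual_checks([[0.0, float("nan")], [0.5, 1.0]])
flat = [[1.0] * 4 for _ in range(4)]                                  # no edges
checker = [[float((r + c) % 2) for c in range(4)] for r in range(4)]  # sharp edges
```

A uniform channel produces a Laplacian variance of zero, while the checkerboard produces a much larger value, which is the contrast a blur check relies on.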
Pixel Statistics
mean pixel value
pixel value standard deviation
pixel value variance
pixel value skew
pixel value kurtosis
entropy of the image
pixel percentiles (min, max, 25th, 50th, and 75th percentile values)
histogram of pixel values
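The pixel statistics listed above can be sketched with the Python standard library. This is an illustration under stated assumptions, not the library's code: pixels are assumed to be a flat list of 8-bit integers, variance is the population variance, kurtosis is reported as excess kurtosis, entropy is Shannon entropy in bits, and percentiles use the inclusive interpolation method. The function name `pixel_statistics` is hypothetical.

```python
import math
import statistics
from collections import Counter
from typing import Any, Dict, List


def pixel_statistics(pixels: List[int], levels: int = 256) -> Dict[str, Any]:
    """Hypothetical helper: summary statistics for a flat list of integer pixels."""
    n = len(pixels)
    mean = statistics.fmean(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / n  # population variance
    std = math.sqrt(var)
    # Third and fourth central moments for skew and kurtosis.
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    # Both are undefined for a constant image, so report NaN in that case.
    skew = m3 / std**3 if std > 0 else math.nan
    kurtosis = m4 / var**2 - 3.0 if std > 0 else math.nan  # excess kurtosis
    counts = Counter(pixels)
    # One bin per intensity level; assumes values lie in [0, levels).
    histogram = [counts.get(level, 0) for level in range(levels)]
    # Shannon entropy in bits over the observed intensity distribution.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    q25, q50, q75 = statistics.quantiles(pixels, n=4, method="inclusive")
    return {
        "mean": mean,
        "std": std,
        "var": var,
        "skew": skew,
        "kurtosis": kurtosis,
        "entropy": entropy,
        "percentiles": [min(pixels), q25, q50, q75, max(pixels)],
        "histogram": histogram,
    }


# Half black, half white: symmetric, two equally likely values.
two_level = [0, 0, 255, 255]
stats = pixel_statistics(two_level)
```

In practice an array library would compute these far faster; the plain-Python version is only meant to show what each entry in the list measures.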
In addition to the above stats, the ImageStats class also defines a hash for each image to be used
in conjunction with the Duplicates class in order to identify duplicate images.
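The idea of hashing images to find duplicates can be sketched as follows. The hash algorithm that ImageStats uses is not specified on this page, so this example substitutes SHA-256 over the raw pixel bytes, which only catches exact duplicates. The function names and the flat list-of-8-bit-integers image representation are assumptions for illustration.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Sequence


def exact_hash(pixels: Sequence[int]) -> str:
    """SHA-256 digest of the raw 8-bit pixel values (stand-in hash)."""
    return hashlib.sha256(bytes(pixels)).hexdigest()


def group_duplicates(images: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return groups of image indices that share the same hash."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for index, pixels in enumerate(images):
        groups[exact_hash(pixels)].append(index)
    # Keep only hashes seen more than once.
    return [indices for indices in groups.values() if len(indices) > 1]


images = [[0, 10, 20, 30], [5, 5, 5, 5], [0, 10, 20, 30]]
duplicates = group_duplicates(images)
```

Because the hash covers only the flattened pixel values, two images with different shapes but identical byte content would collide; a fuller version would include the shape in the hashed data.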
Tutorials
To see how the ImageStats class can be used while doing exploratory data analysis, check out the EDA Part 1 tutorial.
How To Guides
There is a how-to guide that applies to the ImageStats class.
DataEval API
- class dataeval.metrics.ImageStats(flags: ImageHash | ImageProperty | ImageVisuals | ImageStatistics | Sequence[ImageHash | ImageProperty | ImageVisuals | ImageStatistics] | None = None)
Calculates various image property statistics
- Parameters:
flags ([ImageHash | ImageProperty | ImageStatistics | ImageVisuals], default None) – Metric(s) to calculate for each image per channel - calculates all metrics if None
- compute() → Dict[str, Any]
Computes the specified measures on the cached values
- Returns:
Dictionary results of the specified measures
- Return type:
Dict[str, Any]
- evaluate(images: TBatch) → Dict[str, Any]
Calculate metric results given a single batch of images
- reset()
Resets the internal metric cache
- update(images: Iterable[_SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]]) → None
Updates internal metric cache for later calculation
- Parameters:
images (Sequence) – Sequence of images to be processed
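The four methods above follow a cache-then-compute pattern: update accumulates per-image values, compute reports on whatever is cached, reset clears the cache, and evaluate handles a single batch in one call. The sketch below mimics that flow with a hypothetical stand-in class, `MeanStats`, that tracks only the mean of each image. It is not DataEval code, and having evaluate start from an empty cache is an assumption.

```python
from typing import Any, Dict, Iterable, List


class MeanStats:
    """Hypothetical stand-in that mirrors the update/compute/reset/evaluate flow."""

    def __init__(self) -> None:
        self._cache: List[float] = []  # one mean per image seen so far

    def update(self, images: Iterable[List[float]]) -> None:
        """Add the mean of each image in the batch to the cache."""
        for pixels in images:
            self._cache.append(sum(pixels) / len(pixels))

    def compute(self) -> Dict[str, Any]:
        """Report results for everything cached so far."""
        return {"mean": list(self._cache)}

    def reset(self) -> None:
        """Clear the cache."""
        self._cache.clear()

    def evaluate(self, images: Iterable[List[float]]) -> Dict[str, Any]:
        """Process one batch in isolation (assumption: starts from an empty cache)."""
        self.reset()
        self.update(images)
        return self.compute()


metric = MeanStats()
metric.update([[0, 2], [4, 4]])  # first batch
metric.update([[10, 10]])        # second batch accumulates
accumulated = metric.compute()
single_batch = metric.evaluate([[1, 3]])
```

Calling update repeatedly suits datasets that arrive in batches, while evaluate is the shortcut when all images fit in a single batch.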