dataeval.core.compute_stats

dataeval.core.compute_stats(data, *, boxes=None, stats=ImageStats.ALL, per_image=True, per_target=True, per_channel=False, progress_callback=None)

Compute specified statistics on a set of images, optionally within bounding boxes.

Parameters:

data : Iterable[ArrayLike] | Dataset[ArrayLike] | Dataset[tuple[ArrayLike, Any, Any]]: An iterable of images, or a Dataset, to compute statistics on.
boxes : Iterable[Iterable[BoxLike] | None] | None: Optional bounding boxes for each image. If None, any boxes supplied by the data itself are used.
stats : ImageStats, default ImageStats.ALL: Flags indicating which statistics to compute. Multiple flags can be combined with bitwise OR (|). Dependencies between statistics are resolved automatically.
per_image : bool, default True: If True, compute statistics for entire images. When boxes are provided and per_image=True, statistics are computed for the full image and, if per_target=True, for each box as well.
per_target : bool, default True: If True and boxes are provided, compute statistics for each bounding box. Has no effect when boxes is None. At least one of per_image or per_target must be True.
per_channel : bool, default False: If True, compute statistics separately for each channel. If False, statistics are aggregated across all channels.
progress_callback : ProgressCallback or None, default None: Callback used to report progress during the calculation. It is called after each image is processed, with the current image count and the total number of images (if known).
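
As a rough illustration of the progress_callback parameter, the sketch below defines a callable that accepts a current count and an optional total. The exact ProgressCallback signature is not spelled out in this section, so the parameter names (step, total) and the calling convention shown here are assumptions, and the loop only simulates what a caller might do after each image.

```python
from typing import Optional


def print_progress(step: int, total: Optional[int] = None) -> str:
    """Format and print one progress line.

    `total` may be None when the number of images is not known up front.
    The (step, total) signature is an assumption, not the documented protocol.
    """
    if total:
        line = f"{step}/{total} images ({100 * step / total:.0f}%)"
    else:
        line = f"{step} images"
    print(line)
    return line


# Simulate a caller reporting progress after each of four images.
lines = [print_progress(i, 4) for i in range(1, 5)]
```

If the assumed signature matches, a function like this could be passed as progress_callback=print_progress.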

Returns:

Mapping containing computed statistics and metadata:

source_index: Sequence[SourceIndex] - SourceIndex objects with image/box/channel info
object_count: Sequence[int] - Object counts per image
invalid_box_count: Sequence[int] - Invalid box counts per image
image_count: int - Total number of images processed
stats: Mapping[str, Sequence[Any]] - Mapping of statistic names to sequences of computed values

Output is sorted by (item_index, box_index, channel_index) ascending, with None values appearing before 0.
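
The ordering rule above can be reproduced with a small sort key. This is only a sketch of the stated rule using plain tuples rather than real SourceIndex objects. Reading a None box index as "the full-image row" is an assumption. The sample entries are made up for illustration.

```python
from typing import List, Optional, Tuple

# (item_index, box_index, channel_index); None means "not applicable".
Entry = Tuple[int, Optional[int], Optional[int]]


def sort_key(entry: Entry) -> Tuple[int, int, int]:
    """Map None to -1 so it sorts before index 0."""
    item, box, channel = entry
    return (
        item,
        -1 if box is None else box,
        -1 if channel is None else channel,
    )


# Made-up entries in arbitrary order.
entries: List[Entry] = [
    (1, 0, None),
    (0, 1, None),
    (0, None, None),
    (0, 0, None),
    (1, None, None),
]

ordered = sorted(entries, key=sort_key)
for entry in ordered:
    print(entry)
```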

Return type:

StatsResult

Examples

Compute all statistics:

>>> from dataeval.core import compute_stats
>>> from dataeval.flags import ImageStats
>>> stats = compute_stats(images, boxes=boxes)

Compute specific statistics:

>>> stats = compute_stats(images, boxes=boxes, stats=ImageStats.PIXEL_MEAN | ImageStats.VISUAL_BRIGHTNESS)

Use convenience groups:

>>> stats = compute_stats(images, boxes=boxes, stats=ImageStats.PIXEL | ImageStats.VISUAL)
>>> stats = compute_stats(images, boxes=boxes, stats=ImageStats.PIXEL_BASIC, per_channel=True)
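
The flag combinations in the examples above follow the usual bitwise-flag pattern. The sketch below uses Python's standard enum.Flag with a hypothetical DemoStats class to show how individual flags and convenience groups compose. It is not DataEval's ImageStats definition. The member names PIXEL_OTHER and VISUAL_OTHER are invented for illustration.

```python
from enum import Flag, auto


class DemoStats(Flag):
    """Hypothetical stand-in for a statistics flag enum (not ImageStats)."""

    PIXEL_MEAN = auto()
    PIXEL_OTHER = auto()  # invented member, for illustration only
    VISUAL_BRIGHTNESS = auto()
    VISUAL_OTHER = auto()  # invented member, for illustration only

    # A convenience group is simply the union of its member flags.
    PIXEL = PIXEL_MEAN | PIXEL_OTHER
    VISUAL = VISUAL_BRIGHTNESS | VISUAL_OTHER


# Select two individual statistics with bitwise OR.
selected = DemoStats.PIXEL_MEAN | DemoStats.VISUAL_BRIGHTNESS
has_mean = DemoStats.PIXEL_MEAN in selected
has_other = DemoStats.PIXEL_OTHER in selected

# Combining groups selects every member of both groups.
group = DemoStats.PIXEL | DemoStats.VISUAL

print(has_mean, has_other, DemoStats.VISUAL_OTHER in group)
```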

Compute statistics only for bounding boxes (not full images):

>>> stats = compute_stats(images, boxes=boxes, per_image=False, per_target=True)

Compute statistics for full images only (ignore boxes):

>>> stats = compute_stats(images, boxes=boxes, per_image=True, per_target=False)

Compute statistics for both full images and boxes with per-channel breakdown:

>>> stats = compute_stats(images, boxes=boxes, per_image=True, per_target=True, per_channel=True)