dataeval.quality.OutliersOutput

class dataeval.quality.OutliersOutput(data, *, calculation_results=None, outlier_threshold=None, cluster_stats=None, cluster_threshold=None, dataset_steps=None)

Output class for the Outliers lint detector.

DataFrame of outlier issues with columns:

  • dataset_index: int - Index of the originating dataset (only present for multi-dataset output)

  • item_index: int - Index of the outlier item (local to each dataset)

  • target_index: int | None - Index of the target/detection within the item (None for item-level outliers). This column is omitted when all outliers are item-level (all target_index values would be None).

  • channel_index: int | None - Index of the image channel (None for aggregated stats). This column is omitted when all stats are aggregated across channels.

  • metric_name: str - Name of the metric that flagged this image/target

  • metric_value: float - Value of the metric for this image/target

calculation_results

The original calculation result(s) from compute_stats(). Used internally for re-detection via classwise(), itemwise(), and with_threshold().

Type:

StatsResult or Sequence[StatsResult] or None

outlier_threshold

Threshold configuration used to detect outliers. Preserved across re-detection calls unless overridden via with_threshold().

Type:

ThresholdLike or dict or None

cluster_stats

Pre-computed cluster statistics for cluster-based outlier re-detection via with_threshold().

Type:

ClusterStats or None

cluster_threshold

Threshold configuration used for cluster-based outlier detection. Preserved across re-detection calls unless overridden via with_threshold().

Type:

ThresholdLike or None

dataset_steps

Cumulative dataset boundaries for multi-dataset index remapping. None for single-dataset output.

Type:

list[int] or None
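
As an illustration of how cumulative boundaries can drive index remapping, the sketch below converts a global index into a (dataset_index, item_index) pair. It assumes dataset_steps holds the cumulative end offset of each dataset (for example [100, 250] for datasets of 100 and 150 items). That layout and the helper name split_global_index are assumptions made for this example, not confirmed library behavior.

```python
from bisect import bisect_right


def split_global_index(global_index, dataset_steps):
    """Map a global index to (dataset_index, local item_index).

    Assumes dataset_steps contains cumulative end offsets (hypothetical layout).
    """
    if not 0 <= global_index < dataset_steps[-1]:
        raise IndexError(f"index {global_index} is outside the combined datasets")
    # The first boundary strictly greater than the index identifies the dataset.
    dataset_index = bisect_right(dataset_steps, global_index)
    start = dataset_steps[dataset_index - 1] if dataset_index > 0 else 0
    return dataset_index, global_index - start


steps = [100, 250]  # dataset 0 has 100 items, dataset 1 has 150 items
print(split_global_index(99, steps))   # (0, 99)
print(split_global_index(100, steps))  # (1, 0)
print(split_global_index(249, steps))  # (1, 149)
```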

aggregate_by_class(metadata)

Return a Polars DataFrame summarizing outliers per class and metric.

Creates a pivot table showing the count of outlier images for each combination of class and metric. Includes a Total row showing the total number of outliers per metric across all classes, and a Total column showing the total number of outliers per class across all metrics.
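
The shape of this pivot can be sketched in plain Python. The example below is only an approximation of the described layout (per-class counts, a Total column, rows sorted by Total, and a trailing Total row). The helper pivot_by_class and its input format of (class_name, metric_name) pairs are hypothetical and do not reflect how the library computes the table.

```python
from collections import Counter


def pivot_by_class(flags):
    """Build pivot rows from (class_name, metric_name) pairs, one pair per outlier."""
    counts = Counter(flags)
    metrics = sorted({metric for _, metric in counts})
    classes = {cls for cls, _ in counts}

    rows = []
    for cls in classes:
        row = {"class_name": cls}
        for metric in metrics:
            row[metric] = counts[(cls, metric)]  # Counter returns 0 when absent
        row["Total"] = sum(row[metric] for metric in metrics)
        rows.append(row)

    # Sort classes by Total descending (ties broken by name), then append the Total row.
    rows.sort(key=lambda r: (-r["Total"], r["class_name"]))
    total = {"class_name": "Total"}
    for metric in metrics:
        total[metric] = sum(r[metric] for r in rows)
    total["Total"] = sum(r["Total"] for r in rows)
    rows.append(total)
    return rows


example = [("plane", "brightness"), ("plane", "contrast"), ("boat", "brightness")]
for row in pivot_by_class(example):
    print(row)
```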

Parameters:
metadata : MetadataLike

Metadata object containing class labels and image-to-class mappings for the dataset.

Returns:

DataFrame with columns:

  • class_name: cat - Name of the class

  • <metric_name>: int - Count of outliers for each metric (one column per metric)

  • Total: int - Total outlier count for the class across all metrics

The last row is “Total” showing the sum across all classes for each metric. Rows are sorted by Total in descending order (excluding the Total row).

Return type:

pl.DataFrame

Raises:

ValueError – If the issues contain multiple DataFrames (from multiple datasets).

Examples

>>> from dataeval import Metadata
>>> from dataeval.flags import ImageStats
>>> from dataeval.quality import Outliers
>>> outliers = Outliers(flags=ImageStats.VISUAL, outlier_threshold="modzscore")
>>> results = outliers.evaluate(dataset)
>>> metadata = Metadata(dataset)
>>> summary = results.aggregate_by_class(metadata)
>>> summary
shape: (4, 6)
┌────────────┬────────────┬──────────┬──────────┬───────────┬───────┐
│ class_name ┆ brightness ┆ contrast ┆ darkness ┆ sharpness ┆ Total │
│ ---        ┆ ---        ┆ ---      ┆ ---      ┆ ---       ┆ ---   │
│ cat        ┆ u32        ┆ u32      ┆ u32      ┆ u32       ┆ u32   │
╞════════════╪════════════╪══════════╪══════════╪═══════════╪═══════╡
│ person     ┆ 2          ┆ 2        ┆ 2        ┆ 2         ┆ 8     │
│ plane      ┆ 2          ┆ 2        ┆ 2        ┆ 2         ┆ 8     │
│ boat       ┆ 1          ┆ 1        ┆ 1        ┆ 1         ┆ 4     │
│ Total      ┆ 5          ┆ 5        ┆ 5        ┆ 5         ┆ 20    │
└────────────┴────────────┴──────────┴──────────┴───────────┴───────┘

aggregate_by_item()

Return a Polars DataFrame summarizing outliers per item (item_index, target_index pair) and metric.

Creates a pivot table showing whether each item is flagged by each metric (1 if flagged, 0 if not). Includes a Total column showing the total number of metrics that flagged each item.
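
A minimal sketch of the binary-indicator layout, written in plain Python: each (item_index, target_index) pair gets a 1 or 0 per metric plus a Total. The helper pivot_by_item and its tuple input are hypothetical and serve only to illustrate the table shape.

```python
def pivot_by_item(flags):
    """Build rows from (item_index, target_index, metric_name) tuples."""
    flagged = set(flags)
    metrics = sorted({metric for _, _, metric in flagged})
    # Order by item, placing the item-level row (target_index None) before targets.
    keys = sorted(
        {(item, target) for item, target, _ in flagged},
        key=lambda k: (k[0], k[1] is not None, k[1] or 0),
    )
    rows = []
    for item, target in keys:
        row = {"item_index": item, "target_index": target}
        for metric in metrics:
            row[metric] = int((item, target, metric) in flagged)
        row["Total"] = sum(row[metric] for metric in metrics)
        rows.append(row)
    return rows


example = [
    (7, None, "brightness"),
    (7, None, "contrast"),
    (7, 0, "brightness"),
    (19, 0, "zeros"),
]
for row in pivot_by_item(example):
    print(row)
```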

Returns:

DataFrame with columns:

  • item_index: int - Item identifier

  • target_index: int or None - Target identifier (only present when outliers are detected per target)

  • <metric_name>: int - Binary indicator (1 or 0) for each metric

  • Total: int - Total number of metrics that flagged this item

Return type:

pl.DataFrame

Raises:

ValueError – If the issues contain multiple DataFrames (from multiple datasets).

Examples

>>> outliers = Outliers(outlier_threshold=("modzscore", 3.0))
>>> results = outliers.evaluate(dataset, per_target=True)
>>> summary = results.aggregate_by_item()
>>> summary.head(10)
shape: (10, 14)
┌────────────┬──────────────┬────────────┬──────────┬───┬─────┬─────┬───────┬───────┐
│ item_index ┆ target_index ┆ brightness ┆ contrast ┆ … ┆ std ┆ var ┆ zeros ┆ Total │
│ ---        ┆ ---          ┆ ---        ┆ ---      ┆   ┆ --- ┆ --- ┆ ---   ┆ ---   │
│ i64        ┆ i64          ┆ u32        ┆ u32      ┆   ┆ u32 ┆ u32 ┆ u32   ┆ u32   │
╞════════════╪══════════════╪════════════╪══════════╪═══╪═════╪═════╪═══════╪═══════╡
│ 0          ┆ null         ┆ 0          ┆ 0        ┆ … ┆ 0   ┆ 0   ┆ 1     ┆ 1     │
│ 2          ┆ null         ┆ 0          ┆ 0        ┆ … ┆ 0   ┆ 0   ┆ 1     ┆ 1     │
│ 7          ┆ null         ┆ 1          ┆ 1        ┆ … ┆ 1   ┆ 1   ┆ 0     ┆ 8     │
│ 7          ┆ 0            ┆ 1          ┆ 1        ┆ … ┆ 1   ┆ 1   ┆ 0     ┆ 8     │
│ 11         ┆ null         ┆ 1          ┆ 1        ┆ … ┆ 1   ┆ 1   ┆ 0     ┆ 8     │
│ 11         ┆ 0            ┆ 1          ┆ 1        ┆ … ┆ 1   ┆ 1   ┆ 0     ┆ 8     │
│ 18         ┆ null         ┆ 1          ┆ 1        ┆ … ┆ 1   ┆ 1   ┆ 0     ┆ 8     │
│ 18         ┆ 0            ┆ 1          ┆ 1        ┆ … ┆ 1   ┆ 1   ┆ 0     ┆ 8     │
│ 18         ┆ 1            ┆ 1          ┆ 1        ┆ … ┆ 1   ┆ 1   ┆ 0     ┆ 8     │
│ 19         ┆ 0            ┆ 0          ┆ 0        ┆ … ┆ 0   ┆ 0   ┆ 0     ┆ 2     │
└────────────┴──────────────┴────────────┴──────────┴───┴─────┴─────┴───────┴───────┘

aggregate_by_metric()

Return a Polars DataFrame summarizing outlier counts per metric.

Returns:

DataFrame with columns:

  • metric_name: cat - Name of the metric

  • Total: int - Number of images flagged by this metric

Return type:

pl.DataFrame

Examples

>>> outliers = Outliers(flags=ImageStats.PIXEL, outlier_threshold="zscore")
>>> results = outliers.evaluate(dataset)
>>> summary = results.aggregate_by_metric()
>>> summary
shape: (4, 2)
┌─────────────┬───────┐
│ metric_name ┆ Total │
│ ---         ┆ ---   │
│ cat         ┆ u32   │
╞═════════════╪═══════╡
│ entropy     ┆ 4     │
│ mean        ┆ 4     │
│ std         ┆ 4     │
│ var         ┆ 4     │
└─────────────┴───────┘

classwise(metadata)

Re-detect outliers using per-class thresholds.

Computes outlier thresholds within each class separately rather than globally. This catches within-class anomalies that global detection misses, and avoids false positives where a sample is only unusual because its class is inherently different.

For image classification datasets, each image is assigned to its class. For object detection datasets, target-level stats use the target’s class, while image-level stats for images with multiple classes fall back to global detection.
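
The difference between global and per-class thresholds can be seen in a small stand-alone sketch. It uses a symmetric modified z-score with a cutoff of 3.5 as a stand-in detector. This is an assumption made for illustration and is not the library's threshold implementation; the helper names are hypothetical as well.

```python
from collections import defaultdict
from statistics import median


def modzscore_outliers(values, threshold=3.5):
    """Return positions whose absolute modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:  # no spread, so nothing can be scored
        return []
    return [i for i, v in enumerate(values) if abs(0.6745 * (v - med) / mad) > threshold]


def classwise_outliers(values, labels, threshold=3.5):
    """Run the same detector separately within each class."""
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    flagged = []
    for indices in groups.values():
        local = modzscore_outliers([values[i] for i in indices], threshold)
        flagged.extend(indices[j] for j in local)
    return sorted(flagged)


# Two classes with very different typical values; item 4 is unusual only for "boat".
values = [10, 11, 12, 13, 30, 80, 81, 82, 83, 84]
labels = ["boat"] * 5 + ["plane"] * 5

print(modzscore_outliers(values))          # []  (the global spread hides item 4)
print(classwise_outliers(values, labels))  # [4]
```

Globally, the two classes widen the spread so much that no value stands out, while the per-class pass flags the value of 30 among boats that otherwise sit between 10 and 13.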

Parameters:
metadata : MetadataLike

Metadata object containing class labels.

Returns:

New output with per-class detected outliers. Supports all the same aggregation methods (aggregate_by_class, aggregate_by_metric, etc.).

Return type:

OutliersOutput

Raises:

ValueError – If this output was not created from an evaluation with stored statistics.

Examples

>>> outliers = Outliers(flags=ImageStats.PIXEL, outlier_threshold="modzscore")
>>> result = outliers.evaluate(dataset)
>>> classwise_result = result.classwise(metadata)
>>> classwise_result.aggregate_by_class(metadata)
shape: (5, 9)
┌────────────┬─────────┬──────────┬──────┬───┬─────┬─────┬───────┬───────┐
│ class_name ┆ entropy ┆ kurtosis ┆ mean ┆ … ┆ std ┆ var ┆ zeros ┆ Total │
│ ---        ┆ ---     ┆ ---      ┆ ---  ┆   ┆ --- ┆ --- ┆ ---   ┆ ---   │
│ cat        ┆ u32     ┆ u32      ┆ u32  ┆   ┆ u32 ┆ u32 ┆ u32   ┆ u32   │
╞════════════╪═════════╪══════════╪══════╪═══╪═════╪═════╪═══════╪═══════╡
│ plane      ┆ 2       ┆ 0        ┆ 3    ┆ … ┆ 2   ┆ 2   ┆ 4     ┆ 13    │
│ person     ┆ 2       ┆ 1        ┆ 2    ┆ … ┆ 2   ┆ 2   ┆ 1     ┆ 11    │
│ boat       ┆ 1       ┆ 0        ┆ 1    ┆ … ┆ 1   ┆ 1   ┆ 4     ┆ 8     │
│ car        ┆ 0       ┆ 0        ┆ 0    ┆ … ┆ 0   ┆ 0   ┆ 4     ┆ 4     │
│ Total      ┆ 5       ┆ 1        ┆ 6    ┆ … ┆ 5   ┆ 5   ┆ 13    ┆ 36    │
└────────────┴─────────┴──────────┴──────┴───┴─────┴─────┴───────┴───────┘

data()

Return the output data as a Polars DataFrame.

itemwise()

Re-detect outliers using global thresholds (across all items).

This is the inverse of classwise() – it re-runs detection without per-class grouping, using the full dataset distribution for thresholds.

Returns:

New output with globally detected outliers.

Return type:

OutliersOutput

Raises:

ValueError – If this output was not created from an evaluation with stored statistics.

Examples

>>> outliers = Outliers(flags=ImageStats.PIXEL)
>>> result = outliers.evaluate(dataset, per_class=True, metadata=metadata)
>>> global_result = result.itemwise()

meta()

Metadata about the execution of the function or method for the Output class.

Return type:

ExecutionMetadata

with_threshold(outlier_threshold=_UNSET, cluster_threshold=_UNSET)

Re-detect outliers using a different threshold configuration.

Re-runs detection on the stored statistics with the new threshold, without recomputing the underlying image statistics. This enables quick sensitivity analysis and threshold experimentation.

Can be chained with classwise() for per-class detection with a different threshold.
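
The idea of re-detecting from stored statistics can be sketched independently of the library: scores are computed once, and each new threshold is only a cheap comparison against them. The example assumes a symmetric modified z-score and hypothetical helper names (modified_z_scores, redetect); it is not the library's internal procedure.

```python
from statistics import median


def modified_z_scores(values):
    """Compute modified z-scores once (assumes the MAD is non-zero)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return [0.6745 * (v - med) / mad for v in values]


def redetect(scores, threshold):
    """Apply a new threshold to already-computed scores."""
    return [i for i, s in enumerate(scores) if abs(s) > threshold]


# The expensive step (scoring) happens once; the sweep reuses the result.
stored_scores = modified_z_scores([10, 11, 12, 13, 16, 30])
sweep = {t: redetect(stored_scores, t) for t in (1.0, 3.5, 6.0)}
print(sweep)  # {1.0: [4, 5], 3.5: [5], 6.0: []}
```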

Parameters:
outlier_threshold : ThresholdLike, dict, or None

New threshold configuration for stats-based outliers. Accepts the same formats as Outliers:

  • None: AdaptiveThreshold(3.5) (Double-MAD with asymmetric bounds)

  • float: symmetric multiplier for modified z-score

  • str: named threshold type ("zscore", "iqr", etc.)

  • tuple: named threshold with bounds, e.g. ("zscore", 2.5)

  • Threshold: fully configured threshold

  • Mapping[str, ThresholdLike]: per-metric thresholds

cluster_threshold : ThresholdLike or None

New threshold configuration for cluster-based outlier detection. Accepts the same formats as outlier_threshold. Only applies when cluster statistics are stored from evaluate() or from_clusters().

Returns:

New output with re-detected outliers using the new threshold.

Return type:

OutliersOutput

Raises:

ValueError – If no arguments are provided, or if this output was not created from an evaluation with stored statistics or cluster stats.

Examples

>>> outliers = Outliers(flags=ImageStats.PIXEL)
>>> result = outliers.evaluate(dataset)

Loosen the threshold:

>>> lenient = result.with_threshold(4.0)

Switch to IQR method:

>>> iqr_result = result.with_threshold("iqr")

Per-metric overrides:

>>> custom = result.with_threshold({"mean": 2.0, "std": ("zscore", 3.0)})

Chain with classwise:

>>> per_class = result.classwise(metadata).with_threshold(2.0)

Adjust cluster threshold:

>>> strict_clusters = result.with_threshold(cluster_threshold=1.5)

property outliers : TOutliers

Outlier items as a mapping of index to flagged metric names.

When per_target=False:

  • Single-dataset: dict[int, list[str]] keyed by item_index

  • Multi-dataset: dict[int, dict[int, list[str]]], where the outer key is dataset_index

When per_target=True:

  • Single-dataset: dict[SourceIndex, list[str]] keyed by SourceIndex

  • Multi-dataset: dict[int, dict[SourceIndex, list[str]]], where the outer key is dataset_index
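
For the per_target=False case, the relationship between the long-format issue rows and these mappings can be sketched as follows. The rows are plain dictionaries using the documented column names; the helpers group_single and group_multi are hypothetical and only illustrate the resulting shapes.

```python
from collections import defaultdict


def group_single(rows):
    """Collapse rows into {item_index: [metric_name, ...]}."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["item_index"]].append(row["metric_name"])
    return dict(grouped)


def group_multi(rows):
    """Collapse rows into {dataset_index: {item_index: [metric_name, ...]}}."""
    by_dataset = defaultdict(list)
    for row in rows:
        by_dataset[row["dataset_index"]].append(row)
    return {ds: group_single(ds_rows) for ds, ds_rows in by_dataset.items()}


rows = [
    {"dataset_index": 0, "item_index": 7, "metric_name": "brightness"},
    {"dataset_index": 0, "item_index": 7, "metric_name": "contrast"},
    {"dataset_index": 1, "item_index": 2, "metric_name": "zeros"},
]
print(group_single(rows[:2]))  # {7: ['brightness', 'contrast']}
print(group_multi(rows))       # {0: {7: ['brightness', 'contrast']}, 1: {2: ['zeros']}}
```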