dataeval.extractors.ClasswiseUncertaintyExtractor

class dataeval.extractors.ClasswiseUncertaintyExtractor(scores, preds_type='logits', normalize=True, threshold=0.99)

Per-class prediction entropy distributions for detection models.

Groups detections by predicted class and returns one uncertainty array per class. A detection is assigned to every class whose rescaled (sigmoid) confidence is at least threshold times that detection's maximum confidence, so a single detection may contribute to multiple classes.
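As a rough illustration of the assignment rule described above, the following NumPy sketch builds a boolean detection-by-class mask. The helper name assign_classes is hypothetical and not part of dataeval, and the library's internal implementation may differ; the sketch only assumes logits are rescaled with a sigmoid and compared against threshold times the per-detection maximum.

```python
import numpy as np

def assign_classes(logits, threshold=0.99):
    """Hypothetical helper: boolean (n_detections, n_classes) assignment mask."""
    # Rescale raw logits to (0, 1) confidences with a sigmoid (assumed rescaling).
    conf = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    # Highest confidence of each detection, kept as a column for broadcasting.
    top = conf.max(axis=1, keepdims=True)
    # A class is assigned when its confidence reaches threshold * maximum.
    return conf >= threshold * top

logits = np.array([[4.0, 3.9, -2.0],   # two near-tied classes
                   [5.0, 0.0, -1.0]])  # one clear winner
mask = assign_classes(logits, threshold=0.99)
winner_only = assign_classes(logits, threshold=1.0)
```

In this sketch the first detection is assigned to classes 0 and 1 at threshold=0.99, while threshold=1.0 keeps only the top class for each detection.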

__call__ returns a dict, so this is not a drift feature extractor: pick a single class’s array out of the dict and feed that to a detector. (It will still pass isinstance(x, FeatureExtractor) at runtime, which only checks for __call__; do not pass it to a drift detector directly.)

To run both per-instance (UncertaintyExtractor) and per-class uncertainty on the same data without paying for inference twice, wrap the scores extractor in a caching Embeddings and share that one instance between both extractors.

Parameters:
scores : FeatureExtractor

Produces per-detection class scores of shape (n_detections, n_classes).

preds_type : "probs" or "logits", default "logits"

Format of the scores: probabilities ("probs") or raw logits ("logits").

normalize : bool, default True

If True, divide the Shannon entropy by the maximum possible entropy.

threshold : float, default 0.99

Confidence ratio cutoff for class assignment. 1.0 enforces single-class (winner-take-all) assignment; lower values allow more classes per detection.
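To make the normalize option concrete, here is a minimal sketch of normalized Shannon entropy over a probability vector. It assumes the maximum possible entropy is log(n_classes), which holds for a categorical distribution; the function name normalized_entropy is hypothetical, and the exact formula the library applies to detection scores is not stated here.

```python
import numpy as np

def normalized_entropy(probs):
    """Hypothetical helper: Shannon entropy divided by log(n_classes)."""
    p = np.asarray(probs, dtype=float)
    # Treat 0 * log(0) as 0 so one-hot rows do not produce NaN.
    safe_p = np.where(p > 0, p, 1.0)
    terms = np.where(p > 0, p * np.log(safe_p), 0.0)
    entropy = -terms.sum(axis=-1)
    # Maximum entropy of a categorical distribution over n classes is log(n).
    return entropy / np.log(p.shape[-1])

uniform = normalized_entropy([0.25, 0.25, 0.25, 0.25])  # maximally uncertain
one_hot = normalized_entropy([1.0, 0.0, 0.0, 0.0])      # fully certain
```

Under this assumption a uniform prediction scores 1 and a one-hot prediction scores 0, which makes values comparable across models with different numbers of classes.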

Example

>>> import numpy as np
>>> import torch.nn as nn
>>> from dataeval.extractors import TorchExtractor, ClasswiseUncertaintyExtractor
>>>
>>> model = nn.Linear(16, 10)
>>> scores = TorchExtractor(model, device="cpu", batch_size=8)
>>> extractor = ClasswiseUncertaintyExtractor(scores, preds_type="logits")
>>> per_class = extractor(np.random.randn(8, 16).astype(np.float32))
>>> isinstance(per_class, dict)
True