dataeval.data.DetectionCrops
class dataeval.data.DetectionCrops(dataset, *, region='object', padding=0.0, min_size=1, square='expand', fill='mean')
Present an object-detection dataset’s ground-truth boxes as an image-classification dataset.
Each kept detection becomes one classification datum: a crop derived from the detection’s box, labeled (one-hot) with the detection’s class. The view satisfies the ImageClassificationDataset shape, so it drops into Embeddings, Coverage, ber_mst(), and Balance (every per-(image, label) tool) unchanged, with crops aligned 1:1 to labels by construction.
This makes object-detection feasibility (the bounding-box-classification reduction behind Upper-Bound Average Precision (UAP)) and embedding-space coverage available to object-detection datasets without computing detection-level embeddings by hand. Crops are produced lazily on access, so an extractor’s transforms still handle resize/normalize and Embeddings still batches and caches.
- Parameters:
- dataset : ObjectDetectionDataset
The source object-detection dataset. Each datum is a MAITE (image, ObjectDetectionTarget, metadata) 3-tuple; images are read in (C, H, W) layout and boxes in absolute-pixel [x0, y0, x1, y1] format.
- region : {"object", "context", "surround"}, default "object"
Which pixels each crop retains. "object" and "context" both return the box widened by padding ("context" is the conventional name when padding is large enough to bring in surroundings). "surround" returns the widened box with the original box masked to fill, leaving only the background ring, which probes whether the surroundings alone predict the class. "surround" requires padding > 0. Only "object" is exercised by the shipped tutorials.
- padding : float, default 0.0
Context margin added to each box, as a fraction of that box dimension: each side is extended by padding times the box’s width (left/right) or height (top/bottom). For example, 0.1 grows a 100x200 box to 120x240, centered. Must be >= 0.
- min_size : int, default 1
Drop detections whose box’s shorter side is below this many pixels (degenerate or zero-area boxes are always dropped). The number dropped is logged and exposed as n_dropped; because dropping shrinks the view, len(crops) may be less than the source’s total detection count.
- square : {"off", "expand", "pad"}, default "expand"
How a non-square crop is reconciled with a square model input.
"expand" squares the crop by extending the shorter side into real image pixels, shifting the window inward at image edges and falling back to fill only for unavoidable overflow. Interior boxes get no synthetic fill, but the crop brings in real background, which for extreme aspect ratios can dilute thin objects.
"pad" squares by padding the shorter side with synthetic fill, keeping the embedding object-focused (prefer this for strict feasibility / BER).
"off" leaves crops rectangular for the extractor’s resize to stretch (the prior default behavior).
- fill : {"mean", "zero"}, default "mean"
Value for invented pixels: used by square="pad", by edge overflow in square="expand", and to mask the object in region="surround". "mean" uses the per-crop, per-channel mean (normalization-agnostic, minimal contrast at the fill boundary); "zero" uses 0 (set this to your normalization mean if you need strict post-normalization neutrality).
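The box geometry described for padding, min_size, and square="expand" can be sketched in plain Python. This is an illustrative reading of the parameter descriptions above, not the library's implementation: the helper names `widen_box`, `keep_box`, and `square_window` are hypothetical, and the exact edge handling inside DetectionCrops may differ.

```python
def widen_box(box, padding):
    """Extend each side by padding * width (left/right) or padding * height (top/bottom)."""
    x0, y0, x1, y1 = box
    dx = padding * (x1 - x0)
    dy = padding * (y1 - y0)
    return [x0 - dx, y0 - dy, x1 + dx, y1 + dy]


def keep_box(box, min_size):
    """Keep a box only if it has positive area and its shorter side is >= min_size."""
    w = box[2] - box[0]
    h = box[3] - box[1]
    return w > 0 and h > 0 and min(w, h) >= min_size


def square_window(box, image_w, image_h):
    """Assumed reading of square="expand": grow the shorter side to match the
    longer one, then shift the window so it stays inside the image if it can."""
    x0, y0, x1, y1 = box
    side = max(x1 - x0, y1 - y0)
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    nx0, ny0 = cx - side / 2, cy - side / 2
    # Shift inward at image edges; if the window is larger than the image,
    # the part that still overflows would have to be covered by fill.
    nx0 = min(max(nx0, 0), max(image_w - side, 0))
    ny0 = min(max(ny0, 0), max(image_h - side, 0))
    return [nx0, ny0, nx0 + side, ny0 + side]


# The 100x200 example from the padding description, widened by 0.1 per side.
widened = widen_box([0, 0, 100, 200], 0.1)
widened_size = (widened[2] - widened[0], widened[3] - widened[1])

# A thin box near the left edge of a 100x100 image: the window shifts right
# instead of running off the image.
shifted = square_window([0, 10, 20, 90], 100, 100)
```

Under this reading, square="pad" would keep the original box window and add synthetic pixels on the shorter side instead of moving the window.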
- index2label
Mapping from class index to name, inherited from the source dataset.
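To make the one-hot labeling and the index2label mapping concrete, here is a minimal sketch in plain Python. The class names and the `one_hot` helper are hypothetical, chosen only for illustration; the real labels come from the source dataset.

```python
def one_hot(class_index, num_classes):
    """Return a list with 1.0 at class_index and 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[class_index] = 1.0
    return vec


# Hypothetical mapping; in practice this is inherited from the source dataset.
index2label = {0: "car", 1: "person", 2: "bicycle"}

# A detection of class 1 becomes a one-hot classification label.
label = one_hot(1, len(index2label))

# Recover the class index (argmax) and its human-readable name.
class_index = max(range(len(label)), key=label.__getitem__)
class_name = index2label[class_index]
```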
Notes
Each datum’s third element is its metadata, a plain dict at runtime that conforms to DatumMetadata, with the protocol-required id plus three keys added by this view that trace a crop back to its source detection:
- id (int): the crop’s own identifier, which is its position in this view, 0 to len(crops) - 1, aligned 1:1 with the labels and embeddings.
- source_id (int | str): the source datum’s own DatumMetadata id (not a positional index), so a crop flagged downstream (e.g. as low-dispersion or uncovered) still resolves to the correct image after the source has been filtered, sorted, or otherwise re-indexed by a view such as Select (which renumbers positions but passes each datum’s id through unchanged). Falls back to the positional index for source data that omits the protocol-required id.
- target (int): the detection’s index within its source image (its position in that image’s target arrays).
- box (list[float]): the detection’s absolute-pixel [x0, y0, x1, y1] in the source image.
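As a sketch of how these keys might be used, the snippet below groups flagged crops by their source image. The metadata values and the `group_by_source` helper are made up for illustration; only the key names (id, source_id, target, box) come from the description above.

```python
from collections import defaultdict

# Hypothetical per-crop metadata, in view order (so list position == "id").
crop_metadata = [
    {"id": 0, "source_id": "img_007", "target": 0, "box": [10.0, 20.0, 50.0, 80.0]},
    {"id": 1, "source_id": "img_007", "target": 2, "box": [60.0, 15.0, 90.0, 40.0]},
    {"id": 2, "source_id": "img_019", "target": 0, "box": [5.0, 5.0, 30.0, 45.0]},
]


def group_by_source(metadata, flagged_ids):
    """Map each source_id to the (target, box) pairs of its flagged crops."""
    by_source = defaultdict(list)
    for crop_id in flagged_ids:
        meta = metadata[crop_id]  # "id" is the crop's position in the view
        by_source[meta["source_id"]].append((meta["target"], meta["box"]))
    return dict(by_source)


# Crop ids flagged by some downstream tool (hypothetical result).
flagged = [1, 2]
to_review = group_by_source(crop_metadata, flagged)
```

Because source_id is the source datum's own id rather than a position, the grouping stays valid even if the source dataset was filtered or reordered before being wrapped.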
Examples
Wrap an object-detection dataset and run the classification-only tools on it:
>>> from dataeval.data import DetectionCrops
>>> crops = DetectionCrops(od_dataset)
>>> emb = Embeddings(crops, extractor=extractor, batch_size=64)
>>> Coverage().evaluate(crops, embeddings=emb)  # per-class dispersion over OD classes
- property metadata : dataeval.protocols.DatasetMetadata
MAITE dataset metadata for the crop view (id and inherited index2label).
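The lazy, index-aligned behavior described in the class summary can be illustrated with a toy stand-in. `ToyCropView` below is hypothetical and is not DetectionCrops: it uses nested lists in (C, H, W) layout, integer boxes, plain class indices instead of one-hot labels, and the positional index as source_id (the documented fallback when the source has no id).

```python
class ToyCropView:
    """Toy sketch: index detections up front, slice pixels only on access."""

    def __init__(self, images, boxes, labels, min_size=1):
        self.images, self.boxes, self.labels = images, boxes, labels
        self.index = []     # (image position, detection position) per kept crop
        self.n_dropped = 0  # detections skipped as too small or degenerate
        for i, image_boxes in enumerate(boxes):
            for j, (x0, y0, x1, y1) in enumerate(image_boxes):
                # Zero-area boxes are always dropped, whatever min_size is.
                if min(x1 - x0, y1 - y0) >= max(min_size, 1):
                    self.index.append((i, j))
                else:
                    self.n_dropped += 1

    def __len__(self):
        return len(self.index)

    def __getitem__(self, k):
        i, j = self.index[k]
        x0, y0, x1, y1 = self.boxes[i][j]
        # Slice every channel of the (C, H, W) nested list at request time.
        crop = [[row[x0:x1] for row in channel[y0:y1]] for channel in self.images[i]]
        meta = {"id": k, "source_id": i, "target": j,
                "box": [float(v) for v in (x0, y0, x1, y1)]}
        return crop, self.labels[i][j], meta


# One single-channel 4x4 image with two boxes; the second has zero width.
image = [[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]]
view = ToyCropView([image], [[(0, 0, 2, 2), (1, 1, 1, 3)]], [[3, 5]])
crop, label, meta = view[0]
```

Here len(view) is smaller than the source's detection count because the degenerate box is dropped and counted in n_dropped.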