dataeval.extractors.OnnxExtractor

class dataeval.extractors.OnnxExtractor(model, transforms=None, output_name=None, flatten=True)

Extracts embeddings via ONNX Runtime with lazy model loading.

Encapsulates ONNX-specific logic for feature extraction:

Model loading from ONNX files or in-memory bytes
Automatic GPU/CPU provider selection with fallback
Transform pipeline
Output layer selection for multi-output models

Implements the FeatureExtractor protocol.
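The provider fallback mentioned above can be pictured with a small sketch. This is not taken from the OnnxExtractor source: the helper name select_providers and the CUDA-then-CPU preference order are assumptions made for illustration. Only the provider name strings and the onnxruntime calls named in the comments are real ONNX Runtime identifiers.

```python
from typing import Sequence

# Assumed preference order for illustration: CUDA first, CPU as the fallback.
PREFERRED_PROVIDERS = ("CUDAExecutionProvider", "CPUExecutionProvider")


def select_providers(available: Sequence[str]) -> list[str]:
    """Return the preferred providers that are actually available, in order."""
    chosen = [name for name in PREFERRED_PROVIDERS if name in available]
    # Fall back to the CPU provider, which is normally always present.
    return chosen or ["CPUExecutionProvider"]


# With ONNX Runtime installed, `available` would come from
# onnxruntime.get_available_providers(), and the result would be passed as
# onnxruntime.InferenceSession(path, providers=...).
gpu_build = select_providers(
    ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"]
)
cpu_build = select_providers(["CPUExecutionProvider"])
print(gpu_build)  # ['CUDAExecutionProvider', 'CPUExecutionProvider']
print(cpu_build)  # ['CPUExecutionProvider']
```

Keeping the CPU provider at the end of the list means a session can still be created on machines where the GPU build is installed but no usable GPU is present.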

Parameters:

model : str, Path, or bytes: Path to the ONNX model file, or serialized model bytes from to_encoding_model().
transforms : Transform or Sequence[Transform] or None, default None: Preprocessing transforms to apply before encoding. When None, images are passed to the model without preprocessing.
output_name : str or None, default None: Name of the output to extract embeddings from. When None, uses the first output. Required for models with multiple outputs.
flatten : bool, default True: If True, flattens outputs with more than 2 dimensions to (N, D) shape. If False, preserves the original output shape.
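To make the flatten rule concrete, here is a minimal NumPy sketch of the shape behavior described for that parameter. The function flatten_output is a hypothetical stand-in, not the library's implementation, and the array shape is made up for the example.

```python
import numpy as np


def flatten_output(output: np.ndarray, flatten: bool = True) -> np.ndarray:
    """Collapse every axis after the batch axis when there are more than 2 dims."""
    if flatten and output.ndim > 2:
        return output.reshape(output.shape[0], -1)
    return output


# Hypothetical model output: 4 images, 8 feature maps of 2x2 each.
features = np.zeros((4, 8, 2, 2), dtype=np.float32)
print(flatten_output(features).shape)                 # (4, 32)
print(flatten_output(features, flatten=False).shape)  # (4, 8, 2, 2)
```

Outputs that are already 2-dimensional pass through unchanged in this sketch, which matches the "more than 2 dimensions" wording above.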

Example

Basic usage with a model file:

>>> from dataeval import Embeddings
>>> from dataeval.extractors import OnnxExtractor
>>>
>>> extractor = OnnxExtractor("model.onnx")
>>> embeddings = Embeddings(dataset, extractor=extractor, batch_size=32)

Notes

The extractor expects images in CHW format (channels, height, width).
For models with multiple outputs, use output_name to specify which output contains embeddings.
The model is loaded lazily on first use.
Requires onnxruntime or onnxruntime-gpu to be installed.
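Because the extractor expects CHW input, images loaded in channels-last (HWC) layout, as is typical when converting PIL or OpenCV images to arrays, need their axes reordered first. The following is a sketch of that conversion with NumPy; the helper name hwc_to_chw and the image size are assumptions for illustration, and whether this belongs in the dataset or in a transform depends on your pipeline.

```python
import numpy as np


def hwc_to_chw(image: np.ndarray) -> np.ndarray:
    """Move the channel axis of an (H, W, C) image to the front: (C, H, W)."""
    if image.ndim != 3:
        raise ValueError(f"expected a 3-D image, got shape {image.shape}")
    return np.transpose(image, (2, 0, 1))


# Hypothetical 32x48 RGB image in channels-last layout.
hwc = np.zeros((32, 48, 3), dtype=np.uint8)
hwc[1, 2, 0] = 7  # mark one pixel in the first channel
chw = hwc_to_chw(hwc)
print(chw.shape)  # (3, 32, 48)
```

The marked pixel at row 1, column 2 of channel 0 ends up at chw[0, 1, 2], which is a quick way to confirm that the axes were permuted rather than the data reshaped.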

property output_name : str | None

Return the output name for extraction, if set.