dataeval.extractors.OnnxExtractor
class dataeval.extractors.OnnxExtractor(model, transforms=None, output_name=None, flatten=True)
Extracts embeddings via ONNX Runtime with lazy model loading.
Encapsulates ONNX-specific logic for feature extraction:
- Model loading from ONNX files or in-memory bytes
- Automatic GPU/CPU provider selection with fallback
- Transform pipeline
- Output layer selection for multi-output models
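The provider selection with fallback mentioned in the list above can be pictured roughly as follows. This is a minimal sketch, not dataeval's implementation: the helper name `pick_providers` and the preference order are assumptions made for illustration. Only the provider names themselves are standard ONNX Runtime identifiers.

```python
from __future__ import annotations

from typing import Sequence

# Assumed preference order for illustration: try the GPU first, then the CPU.
PREFERRED = ("CUDAExecutionProvider", "CPUExecutionProvider")


def pick_providers(
    available: Sequence[str],
    preferred: Sequence[str] = PREFERRED,
) -> list[str]:
    """Return the preferred providers that are available, in preference order."""
    chosen = [name for name in preferred if name in available]
    # Always keep the CPU provider as a last resort so a session can be created.
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


print(pick_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
print(pick_providers(["CPUExecutionProvider"]))
```

With ONNX Runtime installed, the `available` list can come from `onnxruntime.get_available_providers()`, and the result can be passed as the `providers` argument of `onnxruntime.InferenceSession`.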
Implements the FeatureExtractor protocol.
Parameters:
- model : str, Path, or bytes
Path to the ONNX model file, or serialized model bytes from to_encoding_model().
- transforms : Transform or Sequence[Transform] or None, default None
Preprocessing transforms to apply before encoding. When None, uses raw images.
- output_name : str or None, default None
Name of the output to extract embeddings from. When None, uses the first output. Required for models with multiple outputs.
- flatten : bool, default True
If True, flattens outputs with more than 2 dimensions to (N, D) shape. If False, preserves the original output shape.
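As an illustration of what the flatten option describes, the sketch below reshapes a 4-D feature map to (N, D) with NumPy. The helper `flatten_to_2d` is hypothetical and only mirrors the documented behavior; it is not taken from the dataeval source.

```python
import numpy as np


def flatten_to_2d(output: np.ndarray) -> np.ndarray:
    """Collapse every axis after the batch axis when the array has more than 2 dims."""
    if output.ndim > 2:
        return output.reshape(output.shape[0], -1)
    # Arrays that are already (N, D) or lower-dimensional pass through unchanged.
    return output


# A batch of 8 feature maps with 512 channels and a 7x7 spatial grid.
feature_map = np.zeros((8, 512, 7, 7), dtype=np.float32)
print(flatten_to_2d(feature_map).shape)  # (8, 25088)
```

Here D is 512 * 7 * 7 = 25088. With flatten=False, the same output would keep its (8, 512, 7, 7) shape.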
Example
Basic usage with a model file:
>>> from dataeval import Embeddings
>>> from dataeval.extractors import OnnxExtractor
>>>
>>> extractor = OnnxExtractor("model.onnx")
>>> embeddings = Embeddings(dataset, extractor=extractor, batch_size=32)
Notes
The extractor expects images in CHW format (channels, height, width).
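Many image loaders return arrays in HWC order (height, width, channels), so a conversion may be needed before images reach the extractor. The snippet below is a small sketch using NumPy; the 224x224 RGB image is an arbitrary example, not a size required by dataeval.

```python
import numpy as np

# A dummy RGB image in HWC order, as many image loaders produce.
hwc = np.zeros((224, 224, 3), dtype=np.uint8)

# Move the channel axis to the front to get CHW order.
chw = np.transpose(hwc, (2, 0, 1))
print(chw.shape)  # (3, 224, 224)
```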
For models with multiple outputs, use output_name to specify which output contains embeddings.
The model is loaded lazily on first use.
Requires onnxruntime or onnxruntime-gpu to be installed.
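The lazy loading noted above follows a common pattern: construction only stores how to load the model, and the expensive load happens on first access. The class below is a generic sketch of that pattern. `LazyResource` and its members are hypothetical names and do not reflect how OnnxExtractor is implemented internally.

```python
from typing import Any, Callable


class LazyResource:
    """Defer an expensive load until the resource is first requested."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader    # stored, but not called yet
        self._resource = None    # filled in on first access
        self.load_count = 0      # tracked here only to make the behavior visible

    def get(self) -> Any:
        if self._resource is None:
            self._resource = self._loader()
            self.load_count += 1
        return self._resource


# The loader stands in for creating an inference session from a model file.
lazy = LazyResource(lambda: {"session": "loaded"})
print(lazy.load_count)  # 0
lazy.get()
lazy.get()
print(lazy.load_count)  # 1
```

A practical consequence of this pattern is that errors such as a missing model file or a missing runtime package may surface at first use and not at construction time.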
- property output_name : str | None
Return the output name for extraction, if set.