dataeval.protocols.FeatureExtractor
class dataeval.protocols.FeatureExtractor
Protocol defining a feature extraction function for drift detection.
Feature extractors transform arbitrary input data types into arrays suitable for drift detection. This enables drift detection on non-array inputs such as datasets, metadata, or raw model outputs.
Common use cases include:
- Extracting model prediction uncertainties from raw data
- Computing embeddings from a neural network layer
- Extracting statistical features from metadata
- Converting structured data to numeric representations
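To make the structural nature of the protocol concrete, here is a minimal sketch of how such a protocol could be declared. It assumes the protocol is a runtime-checkable `typing.Protocol` with a single positional-only `__call__`. `FeatureExtractorSketch`, `RowLengths`, and `NotAnExtractor` are hypothetical names used only for illustration; this is not DataEval's source.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class FeatureExtractorSketch(Protocol):
    """Hypothetical stand-in for the real protocol (an assumption, not DataEval code)."""

    def __call__(self, data: Any, /) -> Any: ...


class RowLengths:
    """Toy extractor: maps each input sequence to its length."""

    def __call__(self, data: Any, /) -> list[int]:
        return [len(row) for row in data]


class NotAnExtractor:
    """Defines no __call__, so it does not satisfy the protocol."""


extractor = RowLengths()
features = extractor(["ab", "cde", ""])

# No inheritance is needed: the check is structural.
is_extractor = isinstance(extractor, FeatureExtractorSketch)
is_not_extractor = isinstance(NotAnExtractor(), FeatureExtractorSketch)
```

With a runtime-checkable protocol, `isinstance` only verifies that a `__call__` attribute is present. It does not validate the argument types or the return type, so the shape and dtype of the output remain the implementer's responsibility.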
Example
Creating a feature extractor for model uncertainties:
>>> from typing import Any
>>>
>>> import torch
>>> import torch.nn as nn
>>> from dataeval.protocols import Array, FeatureExtractor  # Array is assumed to be exported here
>>>
>>> class UncertaintyExtractor:
...     def __init__(self, model: nn.Module) -> None:
...         self.model = model
...
...     def __call__(self, data: Any, /) -> Array:
...         # Run the model without tracking gradients
...         with torch.no_grad():
...             # Cast to float32 so the input matches the model's default weight dtype
...             preds = self.model(torch.as_tensor(data, dtype=torch.float32))
...             # Compute uncertainty as the entropy of the softmax probabilities
...             probs = torch.softmax(preds, dim=-1)
...             uncertainty = -(probs * torch.log(probs + 1e-10)).sum(dim=-1)
...         return uncertainty.numpy()
>>>
>>> model = nn.Linear(10, 3)
>>> extractor = UncertaintyExtractor(model)
>>> isinstance(extractor, FeatureExtractor)
True

Creating a feature extractor for metadata:
>>> import numpy as np
>>>
>>> class MetadataExtractor:
...     def __call__(self, metadata_list: list, /) -> Array:
...         # Build one [brightness, contrast] feature row per metadata entry
...         features = [[m.brightness, m.contrast] for m in metadata_list]
...         return np.array(features)
>>>
>>> extractor = MetadataExtractor()
>>> isinstance(extractor, FeatureExtractor)
True
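The use case of converting structured data to numeric representations is listed above without an example. The following is a hedged sketch of one way to do it, assuming records arrive as dictionaries and that a NumPy array is an acceptable return value. `RecordExtractor` and the field names `width`, `height`, and `sensor` are hypothetical and not part of DataEval.

```python
import numpy as np


class RecordExtractor:
    """Hypothetical extractor: turns dict records into a 2-D float array."""

    def __init__(self, numeric_keys: list[str], category_key: str, categories: list[str]) -> None:
        self.numeric_keys = numeric_keys
        self.category_key = category_key
        self.categories = categories

    def __call__(self, records: list[dict], /) -> np.ndarray:
        rows = []
        for record in records:
            # Copy the numeric fields through as floats
            numeric = [float(record[key]) for key in self.numeric_keys]
            # One-hot encode the categorical field against a fixed vocabulary
            one_hot = [1.0 if record[self.category_key] == c else 0.0 for c in self.categories]
            rows.append(numeric + one_hot)
        return np.array(rows, dtype=np.float64)


extractor = RecordExtractor(["width", "height"], "sensor", ["rgb", "ir"])
features = extractor(
    [
        {"width": 640, "height": 480, "sensor": "rgb"},
        {"width": 320, "height": 240, "sensor": "ir"},
    ]
)
```

Fixing the category vocabulary in the constructor keeps the column layout identical across calls, so the arrays produced from different batches of records stay directly comparable.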