dataeval.metrics.bias.completeness
- dataeval.metrics.bias.completeness(embeddings, quantiles)
Calculate the fraction of boxes in a grid defined by quantiles that contain at least one data point. Also returns the center coordinates of each empty box.
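The grid-occupancy idea described above can be illustrated with a minimal NumPy sketch. This is not the dataeval implementation. The helper name `grid_completeness_sketch` is hypothetical, and the sketch assumes that `quantiles` is the number of interior cut points per dimension (so each dimension is split into `quantiles + 1` intervals) and that a point lying exactly on a cut falls into the upper box.

```python
import itertools

import numpy as np


def grid_completeness_sketch(points, quantiles):
    """Hypothetical helper (not part of dataeval): grid occupancy of a point set."""
    points = np.asarray(points, dtype=float)
    n_dims = points.shape[1]

    # Assumption: `quantiles` interior cut points per dimension, so the edges
    # are the 0th quantile, the interior quantiles, and the 100th quantile.
    probs = np.linspace(0.0, 1.0, quantiles + 2)
    edges = [np.quantile(points[:, d], probs) for d in range(n_dims)]

    # Locate every coordinate among the interior edges to get a box index
    # per dimension; side="right" sends points on a cut to the upper box.
    idx = np.stack(
        [np.searchsorted(edges[d][1:-1], points[:, d], side="right") for d in range(n_dims)],
        axis=1,
    )
    filled = {tuple(int(i) for i in row) for row in idx}

    # Enumerate all boxes of the grid and keep the ones no point landed in.
    n_bins = quantiles + 1
    all_boxes = list(itertools.product(range(n_bins), repeat=n_dims))
    empty = [box for box in all_boxes if box not in filled]

    # Center of an empty box: midpoint of its two edges in every dimension.
    centers = [
        np.array([(edges[d][b] + edges[d][b + 1]) / 2 for d, b in enumerate(box)])
        for box in empty
    ]
    return len(filled) / len(all_boxes), centers


frac, centers = grid_completeness_sketch([[1, 0], [0, 1], [1, 1]], 1)
print(frac)     # 0.75
print(centers)  # one empty box, centered at (0.5, 0.5)
```

The number of boxes is `(quantiles + 1) ** n_dims`, so enumerating them becomes expensive quickly. That growth is one plausible reason for the limits on dimensionality and quantile count listed under Raises.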
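- Parameters:
embeddings - Embedded data points, one row per point (at most 10 dimensions)
quantiles - Number of quantiles used to define the grid along each dimension (at most 2)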
- Returns:
fraction_filled: float - Fraction of boxes that contain at least one data point
empty_box_centers: List[np.ndarray] - List of coordinates for centers of empty boxes
- Return type:
- Raises:
ValueError – If embeddings are too high-dimensional (>10)
ValueError – If there are too many quantiles (>2)
ValueError – If embeddings has an invalid shape
Example
>>> import numpy as np
>>> from dataeval.metrics.bias import completeness
>>> embs = np.array([[1, 0], [0, 1], [1, 1]])
>>> quantiles = 1
>>> result = completeness(embs, quantiles)
>>> result.fraction_filled
0.75
Reference
This implementation is based on https://arxiv.org/abs/2002.03147.
[1] Byun, Taejoon, and Sanjai Rayadurgam. “Manifold for Machine Learning Assurance.” Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering