dataeval.metrics.bias.coverage
==============================

.. py:function:: dataeval.metrics.bias.coverage(embeddings, radius_type = 'adaptive', k = 20, percent = 0.01)

   Evaluate :term:`coverage` and identify images/samples that lie in undercovered regions.

   :param embeddings: A dataset in an ArrayLike format. The function expects 2-dimensional data: N observations in a P-dimensional space.
   :type embeddings: ArrayLike, shape - (N, P)
   :param radius_type: The function used to determine the radius.
   :type radius_type: {"adaptive", "naive"}, default "adaptive"
   :param k: Number of observations required within the radius for a sample to be considered covered. [1] suggests that a minimum of 20-50 samples is necessary.
   :type k: int, default 20
   :param percent: Percent of observations to be considered uncovered. Only applies to the adaptive radius.
   :type percent: float, default 0.01

   :returns: Array of uncovered indices, critical value radii, and the radius for coverage
   :rtype: CoverageOutput

   :raises ValueError: If the length of :term:`embeddings` is less than or equal to k
   :raises ValueError: If radius_type is unknown

   .. note:: Embeddings should be on the unit interval [0, 1].

   .. rubric:: Example

   >>> results = coverage(embeddings)
   >>> results.indices
   array([447, 412, 8, 32, 63])
   >>> results.critical_value
   0.8459038956941765

Reference
---------

This implementation is based on https://dl.acm.org/doi/abs/10.1145/3448016.3457315.

[1] Seymour Sudman. 1976. Applied Sampling. Academic Press, New York.
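
To make the adaptive radius more concrete, the following is a minimal NumPy sketch of one way such a check could work. It is not the ``dataeval`` implementation. The helper name ``coverage_sketch``, the use of Euclidean distance, and the rule for selecting the flagged samples are all assumptions made for illustration. Each sample's radius is taken to be the distance to its k-th nearest neighbor, and the ``percent`` of samples with the largest radii are flagged as uncovered.

```python
import numpy as np


def coverage_sketch(embeddings, k=20, percent=0.01):
    """Hypothetical illustration only; not the dataeval implementation."""
    x = np.asarray(embeddings, dtype=float)
    if x.ndim != 2:
        raise ValueError("embeddings must have shape (N, P)")
    n = x.shape[0]
    if n <= k:
        raise ValueError("number of embeddings must be greater than k")

    # Pairwise Euclidean distances, shape (N, N). Assumes N is small enough
    # for the (N, N, P) intermediate array to fit in memory.
    diffs = x[:, None, :] - x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))

    # After sorting each row, column 0 is the zero self-distance, so
    # column k holds the distance to the k-th nearest neighbor.
    radii = np.sort(dists, axis=1)[:, k]

    # Assumed adaptive rule: flag the `percent` of samples with the
    # largest k-th-neighbor radii, largest first.
    cutoff = int(n * percent)
    indices = np.argsort(radii)[::-1][:cutoff]
    return indices, radii


# Usage: a tight cluster inside the unit square plus one distant sample.
rng = np.random.default_rng(0)
cluster = rng.uniform(0.4, 0.6, size=(199, 2))
data = np.vstack([cluster, [[1.0, 1.0]]])
uncovered, radii = coverage_sketch(data, k=20, percent=0.01)
```

In this toy data the sample far from the cluster has the largest radius, so it is the first index flagged. A sample in a dense region has a small radius because it reaches k neighbors quickly.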