dataeval.detectors.linters.Outliers
===================================

.. py:class:: dataeval.detectors.linters.Outliers(use_dimension = True, use_pixel = True, use_visual = True, outlier_method = 'modzscore', outlier_threshold = None)

   Calculates statistical outliers of a dataset using various statistical tests applied to each image.

   :param outlier_method: Statistical method used to identify outliers
   :type outlier_method: ["modzscore" | "zscore" | "iqr"], optional - default "modzscore"
   :param outlier_threshold: Threshold value for the given ``outlier_method``, above which data is considered an outlier.
                             Uses the method-specific default if ``None``.
   :type outlier_threshold: float, optional - default None

   .. attribute:: stats

      Stats output classes that hold the value of each metric for each image.

      :type: tuple[DimensionStatsOutput, PixelStatsOutput, VisualStatsOutput]

   .. seealso:: :term:`Duplicates`

   .. note::

      There are 3 different statistical methods:

      - zscore
      - modzscore
      - iqr

      | The z score method is based on the difference between the data point and the mean of the data.
          The default threshold value for ``zscore`` is 3.
      | Z score = :math:`|x_i - \mu| / \sigma`

      | The modified z score method is based on the difference between the data point and the median of the data.
          The default threshold value for ``modzscore`` is 3.5.
      | Modified z score = :math:`0.6745 \cdot |x_i - \tilde{x}| / \text{MAD}`, where :math:`\tilde{x}` is the median and :math:`\text{MAD}` is the median absolute deviation

      | The interquartile range method is based on how far the data point lies outside the interquartile range,
          the span between the 25th and 75th percentiles (:math:`Q_1` and :math:`Q_3`). The default threshold value for ``iqr`` is 1.5.
      | Scaled interquartile range = :math:`\text{threshold} \cdot (Q_3 - Q_1)`

   .. rubric:: Examples

   Initialize the Outliers class:

   >>> outliers = Outliers()

   Specifying an outlier method:

   >>> outliers = Outliers(outlier_method="iqr")

   Specifying an outlier method and threshold:

   >>> outliers = Outliers(outlier_method="zscore", outlier_threshold=3.5)


   .. py:method:: evaluate(data)

      Returns the indices of outliers along with the issues identified for each.

      :param data: A dataset of images in an ArrayLike format
      :type data: Iterable[ArrayLike], shape - (C, H, W)

      :returns: Output class containing the indices of outliers and a dictionary showing
                the issues and calculated values for the given index.
      :rtype: OutliersOutput

      .. rubric:: Example

      Evaluate the dataset:

      >>> outliers = Outliers(outlier_method="zscore", outlier_threshold=3.5)
      >>> results = outliers.evaluate(outlier_images)
      >>> list(results.issues)
      [10, 12]
      >>> results.issues[10]
      {'skew': -3.906, 'kurtosis': 13.266, 'entropy': 0.2128, 'contrast': 1.25, 'zeros': 0.05493}


   .. py:method:: from_stats(stats: OutlierStatsOutput | dataeval.metrics.stats.datasetstats.DatasetStatsOutput) -> OutliersOutput[IndexIssueMap]
                  from_stats(stats: Sequence[OutlierStatsOutput]) -> OutliersOutput[list[IndexIssueMap]]

      Returns the indices of outliers along with the issues identified for each, computed from existing stats outputs.

      :param stats: The output(s) from a dimensionstats, pixelstats, or visualstats metric
                    analysis or an aggregate DatasetStatsOutput
      :type stats: OutlierStatsOutput | DatasetStatsOutput | Sequence[OutlierStatsOutput]

      :returns: Output class containing the indices of outliers and a dictionary showing
                the issues and calculated values for the given index.
      :rtype: OutliersOutput

      .. seealso:: :obj:`dimensionstats`, :obj:`pixelstats`, :obj:`visualstats`

      .. rubric:: Example

      Evaluate outliers from previously computed stats:

      >>> outliers = Outliers(outlier_method="zscore", outlier_threshold=3.5)
      >>> results = outliers.from_stats([stats1, stats2])
      >>> len(results)
      2
      >>> results.issues[0]
      {10: {'skew': -3.906, 'kurtosis': 13.266, 'entropy': 0.2128}, 12: {'std': 0.00536, 'var': 2.87e-05, 'skew': -3.906, 'kurtosis': 13.266, 'entropy': 0.2128}}
      >>> results.issues[1]
      {}
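
The three statistical methods described in the note for ``Outliers`` can be illustrated with a minimal NumPy sketch. This is not the dataeval implementation: ``outlier_mask`` is a hypothetical helper, and the use of the population standard deviation, the handling of a zero spread, and the rule of flagging values that fall more than the scaled interquartile range outside the quartiles are all assumptions made for the example. Only the formulas and default thresholds come from the note.

```python
import numpy as np


def outlier_mask(values, method="modzscore", threshold=None):
    """Return a boolean mask that is True where ``values`` is an outlier.

    Illustrative helper only; it is not part of the dataeval API.
    """
    values = np.asarray(values, dtype=float)
    no_outliers = np.zeros(values.shape, dtype=bool)

    if method == "zscore":
        threshold = 3.0 if threshold is None else threshold
        sigma = values.std()  # assumption: population standard deviation
        if sigma == 0:
            return no_outliers  # all values identical, nothing to flag
        return np.abs(values - values.mean()) / sigma > threshold

    if method == "modzscore":
        threshold = 3.5 if threshold is None else threshold
        median = np.median(values)
        mad = np.median(np.abs(values - median))  # median absolute deviation
        if mad == 0:
            return no_outliers  # assumption: zero MAD means nothing is flagged
        return 0.6745 * np.abs(values - median) / mad > threshold

    if method == "iqr":
        threshold = 1.5 if threshold is None else threshold
        q1, q3 = np.percentile(values, [25, 75])
        margin = threshold * (q3 - q1)  # scaled interquartile range
        # assumption: flag values beyond either quartile by more than the margin
        return (values < q1 - margin) | (values > q3 + margin)

    raise ValueError(f"Unknown outlier method: {method!r}")


# One statistic per image, e.g. a hypothetical per-image brightness value
values = [10, 11, 9, 10, 12, 10, 11, 9, 10, 100]
print(np.flatnonzero(outlier_mask(values, "modzscore")))  # [9]
print(np.flatnonzero(outlier_mask(values, "iqr")))  # [9]
```

On this small sample the plain z score of the value 100 lands just under the default threshold of 3, because the extreme value itself inflates both the mean and the standard deviation. The median-based and quartile-based tests are less sensitive to that effect.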