dataeval.metrics.bias.diversity
===============================

.. py:function:: dataeval.metrics.bias.diversity(metadata, method = 'simpson')

   Compute :term:`diversity<Diversity>` and classwise diversity for discrete/categorical variables and,
   through standard histogram binning, for continuous variables.

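   By default (``method="simpson"``), diversity is defined as a normalized form of the inverse Simpson
   diversity index; ``method="shannon"`` selects the Shannon diversity index instead.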

   - ``diversity = 1`` implies that samples are evenly distributed across a particular factor.
   - ``diversity = 0`` implies that all samples belong to one category/bin.
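
   The two endpoints can be illustrated with a minimal sketch. This is not the library's implementation:
   the helper names ``simpson_diversity`` and ``shannon_diversity`` are hypothetical, and the normalizations
   (rescaling the inverse Simpson index by the number of bins, and dividing the Shannon entropy by the log
   of the number of bins) are assumptions chosen to be consistent with the endpoints described here.

   .. code-block:: python

      import numpy as np

      def simpson_diversity(counts):
          """Normalized inverse Simpson index for one factor (assumed normalization)."""
          counts = np.asarray(counts, dtype=float)
          k = counts.size  # number of categories/bins, including empty ones
          if k <= 1:
              return 0.0  # a single category has no diversity
          p = counts / counts.sum()  # relative frequencies
          inverse_simpson = 1.0 / np.sum(p**2)  # ranges from 1 to k
          return float((inverse_simpson - 1.0) / (k - 1))  # rescaled to [0, 1]

      def shannon_diversity(counts):
          """Shannon entropy divided by its maximum, log(k) (assumed normalization)."""
          counts = np.asarray(counts, dtype=float)
          k = counts.size
          if k <= 1:
              return 0.0
          p = counts / counts.sum()
          p = p[p > 0]  # empty bins contribute nothing to the entropy
          entropy = -np.sum(p * np.log(p))
          return float(entropy / np.log(k))

      print(simpson_diversity([5, 5, 5, 5]))   # evenly distributed: 1.0
      print(simpson_diversity([20, 0, 0, 0]))  # all samples in one bin: 0.0
      print(round(simpson_diversity([1, 3]), 3))  # 0.6
      print(round(shannon_diversity([1, 3]), 3))  # 0.811

   Both helpers take the histogram counts of a single factor, so a continuous variable would need to be
   binned first.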

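   :param metadata: Preprocessed metadata from :func:`dataeval.utils.metadata.preprocess`
   :type metadata: Metadata
   :param method: The diversity index to compute
   :type method: "simpson" or "shannon", default "simpson"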

   .. note::

      - Here q denotes the order of the generalized diversity index. The expression is undefined for q=1,
        but it approaches the Shannon entropy in the limit.
      - If there is only one category, the diversity index takes a value of 0.
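
   Assuming ``q`` in the note above is the order of the generalized diversity index (Hill number), a reading
   this page does not state explicitly, the relationship can be written as follows, where :math:`p_i` is the
   proportion of samples in category or bin :math:`i`:

   .. math::

      D_q = \left( \sum_i p_i^q \right)^{\frac{1}{1-q}}, \quad q \neq 1

      D_2 = \frac{1}{\sum_i p_i^2}

      \lim_{q \to 1} \ln D_q = -\sum_i p_i \ln p_i

   The second line is the inverse Simpson index, and the third shows that the logarithm of the index tends to
   the Shannon entropy as :math:`q` approaches 1.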

   :returns: Diversity index for each metadata factor and
             classwise diversity of shape [n_class x n_factor]
   :rtype: DiversityOutput

   .. rubric:: Example

   Compute the Simpson diversity index of the metadata and class labels:

   >>> div_simp = diversity(metadata, method="simpson")
   >>> div_simp.diversity_index
   array([0.6       , 0.80882353, 1.        , 0.8       ])

   >>> div_simp.classwise
   array([[0.5       , 0.8       , 0.8       ],
          [0.63043478, 0.97560976, 0.52830189]])

   Compute the Shannon diversity index of the metadata and class labels:

   >>> div_shan = diversity(metadata, method="shannon")
   >>> div_shan.diversity_index
   array([0.81127812, 0.9426312 , 1.        , 0.91829583])

   >>> div_shan.classwise
   array([[0.68260619, 0.91829583, 0.91829583],
          [0.81443569, 0.99107606, 0.76420451]])

   .. seealso:: :obj:`scipy.stats.entropy`