dataeval.metrics.estimators.divergence
======================================

.. py:function:: dataeval.metrics.estimators.divergence(data_a, data_b, method = 'FNN')

   Calculates the :term:`divergence<Divergence>` and any errors between the datasets.

   :param data_a: A dataset in an ArrayLike format to compare.
                  The function expects 2-dimensional data: N observations in a P-dimensional space.
   :type data_a: ArrayLike, shape - (N, P)
   :param data_b: A dataset in an ArrayLike format to compare.
                  The function expects 2-dimensional data: N observations in a P-dimensional space.
   :type data_b: ArrayLike, shape - (N, P)
   :param method: Method used to estimate dataset :term:`divergence<Divergence>`
   :type method: Literal["MST", "FNN"], default "FNN"

   :returns: The divergence value (0.0..1.0) and the number of differing edges between the datasets
   :rtype: DivergenceOutput

   .. note::

      The divergence value indicates how similar the two datasets are,
      with 0 indicating approximately identical data distributions.
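
   As a rough illustration of how a first-nearest-neighbor estimate of this
   kind can be computed, the sketch below pools both datasets, counts the
   points whose nearest neighbor comes from the other dataset, and scales that
   count into the 0.0..1.0 range. It is not the library's implementation: the
   helper name ``fnn_divergence_sketch`` is hypothetical, and the scaling
   formula is assumed to follow the estimator described in the referenced
   paper.

   .. code-block:: python

      import numpy as np


      def fnn_divergence_sketch(data_a, data_b):
          """Illustrative first-nearest-neighbor estimate (hypothetical helper)."""
          a = np.asarray(data_a, dtype=float)
          b = np.asarray(data_b, dtype=float)
          n, m = len(a), len(b)

          # Pool both datasets and remember which dataset each row came from.
          stacked = np.vstack((a, b))
          labels = np.concatenate((np.zeros(n, dtype=int), np.ones(m, dtype=int)))

          # Brute-force pairwise squared distances; quadratic memory, so this
          # is only suitable for small inputs.
          diff = stacked[:, None, :] - stacked[None, :, :]
          dist = np.einsum("ijk,ijk->ij", diff, diff)
          np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
          nearest = dist.argmin(axis=1)

          # An "error" is a point whose nearest neighbor is in the other dataset.
          errors = int(np.sum(labels[nearest] != labels))

          # Assumed scaling: 1 - errors * (N + M) / (2 * N * M), floored at 0.
          divergence = max(0.0, 1.0 - errors * (n + m) / (2.0 * n * m))
          return divergence, errors

   For two well-separated clusters this sketch finds no errors and returns a
   divergence of 1.0. For two samples drawn from the same distribution, about
   half of the nearest neighbors cross datasets and the value falls toward 0.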

   .. warning::

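      The MST method is very slow in this implementation, taking roughly
      25 times longer than FNN. This is unlike the MATLAB implementation,
      where the two methods have comparable speeds.

      Source of the slowdown: conversion to and from CSR format accounts for
      about 10% of the time difference. The remaining 90% comes from the
      difference between the first-nearest-neighbor search and the SciPy MST
      function.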
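
   To show where the sparse-matrix round trip enters an MST-based estimate,
   here is a minimal sketch built on SciPy's ``minimum_spanning_tree``. It is
   an assumption-laden illustration rather than the code used by this
   function: the helper name ``mst_divergence_sketch`` is hypothetical, the
   graph is built from a full dense distance matrix, and the scaling formula
   is assumed to follow the referenced paper.

   .. code-block:: python

      import numpy as np
      from scipy.sparse.csgraph import minimum_spanning_tree
      from scipy.spatial.distance import pdist, squareform


      def mst_divergence_sketch(data_a, data_b):
          """Illustrative MST-based estimate (hypothetical helper)."""
          a = np.asarray(data_a, dtype=float)
          b = np.asarray(data_b, dtype=float)
          n, m = len(a), len(b)

          # Pool both datasets and remember which dataset each row came from.
          stacked = np.vstack((a, b))
          labels = np.concatenate((np.zeros(n, dtype=int), np.ones(m, dtype=int)))

          # Dense pairwise Euclidean distances between all pooled points.
          # Zero entries are treated as missing edges, so duplicate points
          # would need special handling in real code.
          dense = squareform(pdist(stacked))

          # The spanning tree is returned as a sparse (CSR) matrix.
          tree = minimum_spanning_tree(dense)

          # Read the edge endpoints back out of the sparse result.
          rows, cols = tree.nonzero()

          # An "error" is a tree edge that joins points from different datasets.
          errors = int(np.sum(labels[rows] != labels[cols]))

          # Assumed scaling: 1 - errors * (N + M) / (2 * N * M), floored at 0.
          divergence = max(0.0, 1.0 - errors * (n + m) / (2.0 * n * m))
          return divergence, errors

   Because a spanning tree must connect the two datasets at least once, this
   sketch always counts at least one error, so its value stays slightly below
   1.0 even for well-separated data.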

   .. rubric:: References

   For more information about this divergence, its formal definition,
   and its associated estimators see https://arxiv.org/abs/1412.6534.

   .. rubric:: Examples

   Evaluate the datasets:

   >>> divergence(datasetA, datasetB)
   DivergenceOutput(divergence=0.28, errors=36)