dataeval.metadata.metadata_distance
- dataeval.metadata.metadata_distance(metadata1, metadata2)
Measures the feature-wise distance between two continuous metadata distributions and computes a p-value to evaluate its significance.
Uses the Earth Mover’s Distance and the Kolmogorov-Smirnov two-sample test, applied to each feature independently.
- Parameters:
- Returns:
A dictionary with keys corresponding to metadata feature names, and values that are KstestResult objects, as defined by scipy.stats.ks_2samp.
- Return type:
dict[str, KstestResult]
See also
Earth, Kolmogorov-Smirnov
Note
This function applies only to continuous data.
Examples
>>> output = metadata_distance(metadata1, metadata2)
>>> list(output)
['time', 'altitude']
>>> output["time"]
MetadataKSResult(statistic=1.0, location=0.44354838709677413, dist=2.7, pvalue=0.0)
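The feature-wise comparison described above can be approximated directly with SciPy. The following is a minimal sketch, not the dataeval implementation: the helper name `featurewise_distance`, the plain-dict input format, and the output keys are assumptions made for illustration. It pairs `scipy.stats.ks_2samp` (statistic and p-value) with `scipy.stats.wasserstein_distance` (Earth Mover's Distance) for each shared continuous feature.

```python
import numpy as np
from scipy.stats import ks_2samp, wasserstein_distance


def featurewise_distance(features1, features2):
    """Hypothetical helper: compare two dicts of 1-D continuous arrays per feature.

    This illustrates the technique only; it is not dataeval's code.
    """
    results = {}
    for name in features1:
        if name not in features2:
            continue  # only compare features present in both inputs
        a = np.asarray(features1[name], dtype=float)
        b = np.asarray(features2[name], dtype=float)
        ks = ks_2samp(a, b)  # two-sample Kolmogorov-Smirnov test
        results[name] = {
            "statistic": float(ks.statistic),
            "pvalue": float(ks.pvalue),
            "emd": float(wasserstein_distance(a, b)),  # Earth Mover's Distance
        }
    return results


# Synthetic data: "time" is shifted so the two samples do not overlap,
# while "altitude" is left identical in both inputs.
rng = np.random.default_rng(0)
reference = {
    "time": rng.uniform(0.0, 1.0, 200),
    "altitude": rng.normal(100.0, 5.0, 200),
}
shifted = {
    "time": reference["time"] + 2.0,
    "altitude": reference["altitude"].copy(),
}
report = featurewise_distance(reference, shifted)
```

In this synthetic setup the two "time" samples occupy disjoint ranges, so the KS statistic reaches its maximum of 1.0 and the p-value is very small, while the identical "altitude" samples yield a statistic of 0 and a distance of 0.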