dataeval.utils.thresholds.ModifiedZScoreThreshold

class dataeval.utils.thresholds.ModifiedZScoreThreshold(multiplier=3.5, *, lower_multiplier=_UNSET, upper_multiplier=_UNSET, lower_limit=None, upper_limit=None)

Threshold based on the modified z-score, which is computed from the median absolute deviation (MAD).

Uses the median and the MAD for robust outlier detection. The modified z-score is: 0.6745 * |x - median| / MAD

Falls back to mean(|x - median|) when MAD <= EPSILON.
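The formula and fallback above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name modified_z_scores and the EPSILON value are assumptions, and reusing the 0.6745 constant in the fallback branch is also an assumption, since the text above does not say how the fallback is scaled.

```python
from statistics import fmean, median

EPSILON = 1e-12  # assumed tolerance; the library's actual EPSILON is not stated here


def modified_z_scores(values):
    """Return 0.6745 * |x - median| / MAD for each value (hypothetical helper)."""
    med = median(values)
    abs_dev = [abs(x - med) for x in values]
    mad = median(abs_dev)
    # Fall back to the mean absolute deviation when MAD is effectively zero.
    scale = mad if mad > EPSILON else fmean(abs_dev)
    if scale <= EPSILON:
        # Every value equals the median, so nothing deviates from it.
        return [0.0 for _ in values]
    return [0.6745 * d / scale for d in abs_dev]


scores = modified_z_scores([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])
```

For this data the median is 3.5 and the MAD is 1.5, so only the value 100.0 scores above the default multiplier of 3.5 (roughly 43); the other five values score below 1.2.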

Parameters:

multiplier : float or None, default 3.5
    Symmetric multiplier applied to both bounds. Overridden per-side by lower_multiplier / upper_multiplier when provided.
lower_multiplier : float or None
    Override for the lower bound. None means no lower bound.
upper_multiplier : float or None
    Override for the upper bound. None means no upper bound.
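One way to read the per-side overrides is sketched below, assuming a point is in range when its modified z-score does not exceed the multiplier, so each bound is median ± m * MAD / 0.6745. The helper sketch_bounds and its _UNSET stand-in are hypothetical. Returning None for a disabled side is an assumption, and the MAD fallback and the lower_limit / upper_limit arguments are not modeled.

```python
from statistics import median

_UNSET = object()  # stand-in sentinel; mirrors the _UNSET default in the signature


def sketch_bounds(values, multiplier=3.5, *, lower_multiplier=_UNSET, upper_multiplier=_UNSET):
    """Hypothetical helper: invert the modified z-score to get (lower, upper)."""
    # A per-side multiplier, when given, takes precedence over the symmetric one.
    lo_m = multiplier if lower_multiplier is _UNSET else lower_multiplier
    hi_m = multiplier if upper_multiplier is _UNSET else upper_multiplier
    med = median(values)
    mad = median(abs(x - med) for x in values)
    # |x - median| <= m * MAD / 0.6745 is equivalent to a modified z-score <= m.
    lower = None if lo_m is None else med - lo_m * mad / 0.6745
    upper = None if hi_m is None else med + hi_m * mad / 0.6745
    return lower, upper


data = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]
lower, upper = sketch_bounds(data)
upper_only = sketch_bounds(data, lower_multiplier=None, upper_multiplier=2.0)
```

With the default multiplier this gives bounds of roughly -4.28 and 11.28, which leaves 100.0 outside the upper bound. In the second call the lower side is disabled and the upper bound tightens to roughly 7.95.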

Examples

>>> import numpy as np
>>> from dataeval.utils.thresholds import ModifiedZScoreThreshold
>>> data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])
>>> t = ModifiedZScoreThreshold(3.5)
>>> lower, upper = t(data)

classmethod parse_object(obj)

Instantiate a Threshold subclass from a dictionary.

The dictionary must contain a "type" key whose value matches a registered threshold_type string (e.g. "constant", "standard_deviation", "zscore"). The remaining key/value pairs are forwarded as keyword arguments to the matching subclass constructor.

Parameters:

obj : dict[str, Any]
    Dictionary representation of a threshold. The "type" key is popped from the dict during parsing.

Returns:

An instance of the matching Threshold subclass.

Return type:

Threshold

Raises:

ValueError – If "type" is missing or does not match any registered threshold subclass.
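The dispatch described for parse_object can be illustrated with a self-contained toy registry. This is only a sketch of the pattern, not the library's code: the Threshold base class below, its _registry attribute, and ToyZScoreThreshold with its default multiplier are all hypothetical stand-ins. Only the "type" key, the popping behavior, the keyword forwarding, and the ValueError cases come from the description above.

```python
from typing import Any


class Threshold:
    """Toy base class with a registry keyed by threshold_type (hypothetical)."""

    _registry: dict = {}
    threshold_type = ""

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Register every subclass that declares a non-empty threshold_type.
        if cls.threshold_type:
            Threshold._registry[cls.threshold_type] = cls

    @classmethod
    def parse_object(cls, obj: dict) -> "Threshold":
        if "type" not in obj:
            raise ValueError('threshold dict is missing the "type" key')
        type_name = obj.pop("type")  # mutates the caller's dict, as documented
        subclass = Threshold._registry.get(type_name)
        if subclass is None:
            raise ValueError(f"unknown threshold type: {type_name!r}")
        # Remaining key/value pairs become constructor keyword arguments.
        return subclass(**obj)


class ToyZScoreThreshold(Threshold):
    threshold_type = "zscore"

    def __init__(self, multiplier: float = 3.0) -> None:
        self.multiplier = multiplier


spec = {"type": "zscore", "multiplier": 2.5}
threshold = Threshold.parse_object(spec)
```

Because the "type" key is popped during parsing, spec no longer contains it after the call. Pass a copy, such as dict(spec), if the same dictionary has to be parsed again.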