dataeval.utils.thresholds.AdaptiveThreshold

class dataeval.utils.thresholds.AdaptiveThreshold(multiplier=3.5, *, lower_multiplier=_UNSET, upper_multiplier=_UNSET, lower_limit=None, upper_limit=None)

Threshold using tail-weighted Double-MAD for robust asymmetric outlier detection.

Computes separate dispersion metrics for data below and above the median (Double-MAD), producing naturally asymmetric bounds. On each side the multiplier is automatically scaled up when the tail is heavier than a normal distribution, preventing over-flagging on skewed or heavy-tailed metrics while keeping tight bounds on well-behaved data.

Tail-weight adjustment: for each half, the ratio of the 90th-percentile deviation to the MAD is compared against the expected ratio for normal data (~1.91). When the observed ratio exceeds this, the effective multiplier is increased by log1p(excess), widening the bound on that side only.
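The tail-weight adjustment described above can be sketched in plain Python. This is an illustrative sketch, not dataeval's implementation. The helper names (percentile, half_scale_and_factor, adaptive_bounds) are hypothetical. The sketch assumes that the excess is the observed ratio minus 1.91, that tail_factor equals 1 + log1p(excess), that the raw (unscaled) half-MAD is used as the scale, and that values equal to the median count in both halves.

```python
import math
import statistics

NORMAL_TAIL_RATIO = 1.91  # expected p90-deviation / MAD ratio quoted in the text


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list (0 <= q <= 100)."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def half_scale_and_factor(deviations):
    """Return (half-MAD, tail factor) for one side of the median."""
    devs = sorted(deviations)
    mad = statistics.median(devs)
    if mad == 0:
        # Point-mass case; the library handles it separately, not sketched here.
        return 0.0, 1.0
    ratio = percentile(devs, 90) / mad
    # Assumption: excess is the amount by which the ratio exceeds the normal value.
    excess = max(0.0, ratio - NORMAL_TAIL_RATIO)
    # Assumption: the multiplier is scaled by 1 + log1p(excess).
    return mad, 1.0 + math.log1p(excess)


def adaptive_bounds(values, multiplier=3.5):
    """Asymmetric (lower, upper) bounds from a tail-weighted Double-MAD."""
    med = statistics.median(values)
    left = [med - v for v in values if v <= med]
    right = [v - med for v in values if v >= med]
    scale_l, factor_l = half_scale_and_factor(left)
    scale_r, factor_r = half_scale_and_factor(right)
    lower = med - multiplier * factor_l * scale_l
    upper = med + multiplier * factor_r * scale_r
    return lower, upper


# Symmetric data keeps symmetric bounds; the skewed sample widens only the upper side.
print(adaptive_bounds([1.0, 2.0, 3.0, 4.0, 5.0], multiplier=2.0))
print(adaptive_bounds([1.0, 1.0, 1.0, 2.0, 10.0, 50.0], multiplier=2.0))
```

In the skewed sample the right half has a 90th-percentile deviation far above 1.91 times its MAD, so only the upper bound is pushed outward while the lower bound stays tight.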

Point-mass handling: when both half-MADs are zero (>50% of data at one value), a gap-ratio test determines whether non-mode values form a smooth continuous tail (wider bounds via non-mode MAD) or discrete categorical jumps (tight bounds via global mean absolute deviation).
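The trigger condition for point-mass handling can be checked with a few lines of plain Python. This is a minimal sketch with hypothetical helper names (half_mads, is_point_mass), and it assumes values equal to the median are counted in both halves. The gap-ratio test itself is not specified here, so the sketch does not attempt to reproduce it.

```python
import statistics


def half_mads(values):
    """Return (left MAD, right MAD) of deviations around the median."""
    med = statistics.median(values)
    left = [med - v for v in values if v <= med]
    right = [v - med for v in values if v >= med]
    return statistics.median(left), statistics.median(right)


def is_point_mass(values):
    """True when both half-MADs are zero, the condition that triggers special handling."""
    left_mad, right_mad = half_mads(values)
    return left_mad == 0 and right_mad == 0


# Six of eight values sit at 0.0, so both half-MADs collapse to zero.
print(is_point_mass([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 5.0]))  # True
print(is_point_mass([1.0, 2.0, 3.0, 4.0, 5.0]))  # False
```

When more than half of the data shares one value, that value is the median and makes up the majority of each half, so the median deviation on both sides is zero and an ordinary MAD-based bound would have zero width.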

Parameters:
multiplier : float or None, default 3.5

Symmetric multiplier applied to both bounds. Overridden per-side by lower_multiplier / upper_multiplier when provided.

lower_multiplier : float or None

Override for the lower bound: median - lower_multiplier * tail_factor * scale_left. None means no lower bound.

upper_multiplier : float or None

Override for the upper bound: median + upper_multiplier * tail_factor * scale_right. None means no upper bound.

Examples

>>> import numpy as np
>>> from dataeval.utils.thresholds import AdaptiveThreshold
>>> symmetric = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
>>> t = AdaptiveThreshold(2.0)
>>> lower, upper = t(symmetric)
>>> skewed = np.array([1.0, 1.0, 1.0, 2.0, 10.0, 50.0])
>>> lower, upper = t(skewed)

classmethod parse_object(obj)

Instantiate a Threshold subclass from a dictionary.

The dictionary must contain a "type" key whose value matches a registered threshold_type string (e.g. "constant", "standard_deviation", "zscore"). The remaining key/value pairs are forwarded as keyword arguments to the matching subclass constructor.

Parameters:
obj : dict[str, Any]

Dictionary representation of a threshold. The "type" key is popped from the dict during parsing.

Returns:

An instance of the matching Threshold subclass.

Return type:

Threshold

Raises:

ValueError – If "type" is missing or does not match any registered threshold subclass.
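The dispatch behavior described for parse_object follows a common registry pattern, sketched below. This is a self-contained illustration, not dataeval's code. The Threshold and ConstantThreshold classes defined here are stand-ins, and the lower and upper constructor arguments are hypothetical.

```python
from __future__ import annotations

from typing import Any


class Threshold:
    """Stand-in base class; subclasses register themselves by threshold_type."""

    threshold_type: str = ""
    _registry: dict[str, type[Threshold]] = {}

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        if cls.threshold_type:
            Threshold._registry[cls.threshold_type] = cls

    @classmethod
    def parse_object(cls, obj: dict[str, Any]) -> Threshold:
        if "type" not in obj:
            raise ValueError("missing 'type' key")
        kind = obj.pop("type")  # mutates the caller's dict, as documented
        try:
            subclass = Threshold._registry[kind]
        except KeyError:
            raise ValueError(f"unknown threshold type: {kind!r}") from None
        # Remaining key/value pairs become constructor keyword arguments.
        return subclass(**obj)


class ConstantThreshold(Threshold):
    threshold_type = "constant"

    def __init__(self, lower: float | None = None, upper: float | None = None) -> None:
        # Hypothetical arguments chosen for the illustration only.
        self.lower = lower
        self.upper = upper


spec = {"type": "constant", "lower": 0.0, "upper": 1.0}
t = Threshold.parse_object(spec)
print(type(t).__name__, t.lower, t.upper)  # ConstantThreshold 0.0 1.0
print("type" in spec)  # False
```

Because the "type" key is popped during parsing, pass a copy (for example dict(spec)) if the same dictionary needs to be parsed again later.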