dataeval.utils.merge

dataeval.utils.merge(metadata: collections.abc.Iterable[collections.abc.Mapping[str, Any]], *, return_dropped: True, return_numpy: False = False, ignore_lists: bool = False, fully_qualified: bool = False, targets_per_image: collections.abc.Sequence[int] | None = None, image_index_key: str = '_image_index') → tuple[dict[str, list[Any]], dict[str, list[str]]]
dataeval.utils.merge(metadata: collections.abc.Iterable[collections.abc.Mapping[str, Any]], *, return_dropped: False = False, return_numpy: False = False, ignore_lists: bool = False, fully_qualified: bool = False, targets_per_image: collections.abc.Sequence[int] | None = None, image_index_key: str = '_image_index') → dict[str, list[Any]]
dataeval.utils.merge(metadata: collections.abc.Iterable[collections.abc.Mapping[str, Any]], *, return_dropped: True, return_numpy: True, ignore_lists: bool = False, fully_qualified: bool = False, targets_per_image: collections.abc.Sequence[int] | None = None, image_index_key: str = '_image_index') → tuple[dict[str, numpy.typing.NDArray[Any]], dict[str, list[str]]]
dataeval.utils.merge(metadata: collections.abc.Iterable[collections.abc.Mapping[str, Any]], *, return_dropped: False = False, return_numpy: True, ignore_lists: bool = False, fully_qualified: bool = False, targets_per_image: collections.abc.Sequence[int] | None = None, image_index_key: str = '_image_index') → dict[str, numpy.typing.NDArray[Any]]

Merges a collection of metadata dictionaries into a single flattened dictionary of keys and values.

Nested dictionaries are flattened, and lists are expanded. Nested lists are dropped because expanding them into multiple hierarchical trees is not supported. The function also adds an internal “_image_index” key to the merged dictionary, which the Metadata class uses.
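To make the flattening behavior concrete, here is a minimal pure-Python sketch of the idea. It is not DataEval's implementation: the names `flatten_sketch` and `merge_sketch` are hypothetical, the sketch only expands lists of dictionaries (other lists are kept as plain values), and the dropped-key naming and reason string simply mirror the output shown in the Example on this page.

```python
from __future__ import annotations

from collections.abc import Iterable, Mapping
from typing import Any


def flatten_sketch(record: Mapping[str, Any], prefix: tuple[str, ...] = ()) -> list[dict[tuple[str, ...], Any]]:
    """Flatten one record into one or more rows keyed by the full key path."""
    rows: list[dict[tuple[str, ...], Any]] = [{}]
    for key, value in record.items():
        path = prefix + (key,)
        if isinstance(value, Mapping):
            # Nested dictionary: flatten it under the current path.
            sub_rows = flatten_sketch(value, path)
        elif isinstance(value, list) and value and all(isinstance(v, Mapping) for v in value):
            # List of dictionaries: each element becomes its own row.
            sub_rows = [row for item in value for row in flatten_sketch(item, path)]
        else:
            # Anything else is kept as a single value (assumption of this sketch).
            sub_rows = [{path: value}]
        # Combine what we have so far with every expanded row.
        rows = [{**row, **sub} for row in rows for sub in sub_rows]
    return rows


def merge_sketch(metadata: Iterable[Mapping[str, Any]]) -> tuple[dict[str, list[Any]], dict[str, list[str]]]:
    """Merge records into one dict of lists and report keys missing from some rows."""
    rows = []
    for image_index, record in enumerate(metadata):
        for row in flatten_sketch(record):
            row[("_image_index",)] = image_index  # remember the source record
            rows.append(row)

    # Collect key paths in first-seen order.
    all_paths = list(dict.fromkeys(path for row in rows for path in row))
    kept = [path for path in all_paths if all(path in row for row in rows)]

    # Kept keys use the last path component (assumes these short names are unique).
    merged = {path[-1]: [row[path] for row in rows] for path in kept}
    dropped = {"_".join(path): ["inconsistent_key"] for path in all_paths if path not in kept}
    return merged, dropped
```

Using the last path component as the output key corresponds loosely to the "minimized" keys described for `fully_qualified`; a real implementation would also need to handle name collisions, which this sketch ignores.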

Parameters:
metadata : Iterable[Mapping[str, Any]]

Iterable collection of metadata dictionaries to flatten and merge

return_dropped : bool, default False

Option to return a dictionary of dropped keys and the reason(s) for dropping

return_numpy : bool, default False

If True, return results as NumPy arrays instead of lists

ignore_lists : bool, default False

Option to skip expanding lists within metadata

fully_qualified : bool, default False

Option to return fully qualified dictionary keys instead of minimized keys

targets_per_image : Sequence[int] or None, default None

Number of targets for each image metadata entry

image_index_key : str, default "_image_index"

User-provided metadata key that maps each metadata entry to its source image

Returns:

  • dict[str, list[Any]] | dict[str, NDArray[Any]] – A single dictionary containing the flattened data as lists or NumPy arrays

  • dict[str, list[str]], Optional – Dictionary containing dropped keys and reason(s) for dropping; returned only when return_dropped is True

Note

Nested lists of values and inconsistent keys are dropped from the merged metadata dictionary.

Example

>>> list_metadata = [{"common": 1, "target": [{"a": 1, "b": 3, "c": 5}, {"a": 2, "b": 4}], "source": "example"}]
>>> reorganized_metadata, dropped_keys = merge(list_metadata, return_dropped=True)
>>> reorganized_metadata
{'common': [1, 1], 'a': [1, 2], 'b': [3, 4], 'source': ['example', 'example'], '_image_index': [0, 0]}
>>> dropped_keys
{'target_c': ['inconsistent_key']}
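Because every row of the merged output carries an “_image_index” value, the flattened rows can be regrouped by source image. The following is a rough sketch of that regrouping in plain Python; `rows_by_image` is a hypothetical helper and not part of DataEval, and the input is copied from the example output above.

```python
from collections import defaultdict
from typing import Any

# Merged output copied from the example above.
merged = {
    "common": [1, 1],
    "a": [1, 2],
    "b": [3, 4],
    "source": ["example", "example"],
    "_image_index": [0, 0],
}


def rows_by_image(data: dict[str, list[Any]], image_index_key: str = "_image_index") -> dict[int, list[dict[str, Any]]]:
    """Group flattened rows back under the image index they came from."""
    grouped: dict[int, list[dict[str, Any]]] = defaultdict(list)
    for row_number, image_index in enumerate(data[image_index_key]):
        # Rebuild one row from the column lists, leaving out the index column.
        row = {key: values[row_number] for key, values in data.items() if key != image_index_key}
        grouped[image_index].append(row)
    return dict(grouped)


per_image = rows_by_image(merged)
```

Here both rows map back to image 0, one per entry in the original "target" list.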