Better "merge" defaults for Dataset.__setitem__ and Dataset.update #10585

@dcherian

What is your issue?

In dataset_update_method we call merge_core without specific values for join and compat.

return merge_core(
    [dataset, other],
    priority_arg=1,
    indexes=dataset.xindexes,
    combine_attrs="override",
)

These values default to join="outer" and compat="broadcast_equals":

def merge_core(
    objects: Iterable[CoercibleMapping],
    compat: CompatOptions = "broadcast_equals",
    join: JoinOptions = "outer",
    combine_attrs: CombineAttrsOptions = "override",
    priority_arg: int | None = None,
    explicit_coords: Iterable[Hashable] | None = None,
    indexes: Mapping[Any, Any] | None = None,
    fill_value: object = dtypes.NA,
    skip_align_args: list[int] | None = None,
) -> _MergeResult:
    ...

We should probably migrate to join="left" and compat="override". Using compat="override" would avoid any potentially expensive equality comparisons for non-indexed coordinate variables. I am not sure how compat="override" interacts with priority_arg.
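
For illustration, here is a minimal sketch of what these option values do, using the public xr.merge function rather than the internal merge_core call. The datasets and the names a, b, c, d, v, w and lab are made up for the example, and it does not show the behavior of Dataset.update or Dataset.__setitem__, which also pass indexes and priority_arg.

```python
import xarray as xr

# Two toy datasets with partially overlapping "x" labels (made-up data).
a = xr.Dataset({"v": ("x", [1, 2, 3])}, coords={"x": [0, 1, 2]})
b = xr.Dataset({"w": ("x", [10, 20])}, coords={"x": [2, 3]})

# Values named as the current defaults: the result spans the union of labels.
outer = xr.merge([a, b], join="outer", compat="broadcast_equals")

# Proposed values: keep only the labels of the first object.
left = xr.merge([a, b], join="left", compat="override")

print(outer["x"].values.tolist())  # [0, 1, 2, 3]
print(left["x"].values.tolist())   # [0, 1, 2]

# Two datasets whose non-indexed coordinate "lab" disagrees on the same "x".
c = xr.Dataset(coords={"x": [0, 1], "lab": ("x", ["p", "q"])})
d = xr.Dataset(coords={"x": [0, 1], "lab": ("x", ["r", "s"])})

# compat="override" skips the comparison and takes "lab" from the first object.
picked = xr.merge([c, d], join="left", compat="override")
print(picked["lab"].values.tolist())  # ['p', 'q']
```

Note that with compat="override" the conflicting values in d are dropped without any check, which is the trade-off for skipping the comparison.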

cc @shoyer
