Fast path for Dictionary -> View cast for large types & cross cast #8985

@Jefffrey

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

See original issue comment: https://github.com/apache/arrow-rs/pull/8912/changes#r2553987033

Describe the solution you'd like

Implement fast paths for casting from LargeUtf8/LargeBinary to Utf8View/BinaryView. This needs to consider whether the offsets fit in a view.
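The "offsets fit" check could look roughly like the sketch below. It is std-only and does not use any arrow-rs API. The function name `offsets_fit_in_view` is hypothetical, and the sketch assumes that a view stores its buffer offset in a 32-bit unsigned field and that the offsets are monotonically non-decreasing, so checking the last one is enough.

```rust
/// Hypothetical helper (not part of arrow-rs): returns true when the 64-bit
/// offsets of a LargeUtf8/LargeBinary array can all be stored in a view.
///
/// Assumption: the view's `offset` field is a `u32`.
/// Assumption: `offsets` is monotonically non-decreasing, so the last entry
/// is the largest and is the only one that needs checking.
fn offsets_fit_in_view(offsets: &[i64]) -> bool {
    offsets
        .last()
        // `try_from` rejects both negative values and values above u32::MAX.
        .map_or(true, |&last| u32::try_from(last).is_ok())
}
```

When the check fails, the cast would presumably fall back to the existing copying path rather than return an error.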

Implement fast paths for the cross casts Utf8 -> BinaryView and Binary -> Utf8View. The Binary -> Utf8View case needs to consider whether the binary data must be validated as UTF-8.
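The two cross casts are not symmetric: valid UTF-8 is always valid binary, so Utf8 -> BinaryView needs no check, while Binary -> Utf8View has to confirm that every value is valid UTF-8. A minimal std-only sketch of that per-value validation follows. The name `values_are_utf8` is hypothetical, and the flat `values` buffer plus `i32` offsets are a simplified stand-in for a Binary array, not the arrow-rs types.

```rust
/// Hypothetical helper (not part of arrow-rs): checks that each value
/// delimited by consecutive offsets is valid UTF-8 on its own.
///
/// Validating value by value also catches an offset that splits a
/// multi-byte character, which validating the whole buffer would miss.
fn values_are_utf8(values: &[u8], offsets: &[i32]) -> bool {
    offsets.windows(2).all(|w| {
        let (start, end) = (w[0] as usize, w[1] as usize);
        // `get` returns None for out-of-bounds or inverted ranges,
        // so malformed offsets are reported as invalid instead of panicking.
        values
            .get(start..end)
            .map_or(false, |v| std::str::from_utf8(v).is_ok())
    })
}
```

For a dictionary array this would run over the dictionary values once, not once per key, which is stricter than necessary if some values are never referenced but keeps the check simple.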

See:

    // `unpack_dictionary` can handle Utf8View/BinaryView types, but incurs unnecessary data
    // copy of the value buffer. Fast path which avoids copying underlying values buffer.
    // TODO: handle LargeUtf8/LargeBinary -> View (need to check offsets can fit)
    // TODO: handle cross types (String -> BinaryView, Binary -> StringView)
    //       (need to validate utf8?)
    (Utf8, Utf8View) => view_from_dict_values::<K, Utf8Type, StringViewType>(
        array.keys(),
        array.values().as_string::<i32>(),
    ),
    (Binary, BinaryView) => view_from_dict_values::<K, BinaryType, BinaryViewType>(
        array.keys(),
        array.values().as_binary::<i32>(),
    ),

Describe alternatives you've considered

Additional context

Labels

enhancement: Any new improvement worthy of an entry in the changelog