Open
Labels
enhancement: Any new improvement worthy of an entry in the changelog
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
See original issue comment: https://github.com/apache/arrow-rs/pull/8912/changes#r2553987033
Describe the solution you'd like
Implement fast paths for casting from LargeUtf8/LargeBinary to Utf8View/BinaryView. These need to check whether the 64-bit offsets fit.
Implement fast paths for the cross-type casts Utf8 -> BinaryView and Binary -> Utf8View. The Binary -> Utf8View case may need to validate that the binary data is valid UTF-8.
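To illustrate why the Binary -> Utf8View direction may need validation while Utf8 -> BinaryView does not, here is a minimal std-only sketch. The helper name `validate_utf8_values` is hypothetical and not part of arrow-rs; it assumes values are laid out as one contiguous byte buffer delimited by `i32` offsets, and it checks each value on its own, since a buffer that is valid UTF-8 as a whole can still be split in the middle of a multi-byte character.

```rust
/// Hypothetical helper (not an arrow-rs API): checks that every value
/// delimited by `offsets` inside `values` is valid UTF-8 on its own.
fn validate_utf8_values(values: &[u8], offsets: &[i32]) -> Result<(), String> {
    for (i, w) in offsets.windows(2).enumerate() {
        // Negative offsets wrap to huge values here and are rejected by `get`.
        let (start, end) = (w[0] as usize, w[1] as usize);
        let slice = values
            .get(start..end)
            .ok_or_else(|| format!("offsets out of bounds at value {i}"))?;
        // Validate per value: a value boundary may fall inside a
        // multi-byte character even when the whole buffer is valid.
        std::str::from_utf8(slice)
            .map_err(|e| format!("value {i} is not valid UTF-8: {e}"))?;
    }
    Ok(())
}
```

For example, the two bytes `[0xC3, 0xA9]` encode "é" and are valid UTF-8 together, but with offsets `[0, 1, 2]` each single-byte value is invalid, so a cast that reuses the buffer without checking would produce invalid strings. Going from Utf8 to BinaryView needs no such check because any UTF-8 string is already a valid byte sequence.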
See:
arrow-rs/arrow-cast/src/cast/dictionary.rs
Lines 37 to 49 in 08dcc0b
// `unpack_dictionary` can handle Utf8View/BinaryView types, but incurs unnecessary data
// copy of the value buffer. Fast path which avoids copying underlying values buffer.
// TODO: handle LargeUtf8/LargeBinary -> View (need to check offsets can fit)
// TODO: handle cross types (String -> BinaryView, Binary -> StringView)
//       (need to validate utf8?)
(Utf8, Utf8View) => view_from_dict_values::<K, Utf8Type, StringViewType>(
    array.keys(),
    array.values().as_string::<i32>(),
),
(Binary, BinaryView) => view_from_dict_values::<K, BinaryType, BinaryViewType>(
    array.keys(),
    array.values().as_binary::<i32>(),
),
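The "offsets can fit" check from the first TODO could be sketched roughly as follows. This is a std-only illustration, not the arrow-rs implementation: the function name `view_ranges` is hypothetical, and it assumes a view addresses its data with a 32-bit buffer offset and a 32-bit length while LargeUtf8/LargeBinary offsets are `i64`. It ignores inlining of short values and null handling.

```rust
/// Hypothetical helper (not an arrow-rs API): converts `i64` offsets into
/// (start, length) pairs that fit 32-bit view fields.
/// Returns `None` if any start or length does not fit, in which case a
/// caller would fall back to the slower copying path.
fn view_ranges(offsets: &[i64]) -> Option<Vec<(u32, u32)>> {
    offsets
        .windows(2)
        .map(|w| {
            // Start of the value inside the shared values buffer.
            let start = u32::try_from(w[0]).ok()?;
            // Length of the value; negative (non-monotonic) lengths are rejected.
            let len = u32::try_from(w[1].checked_sub(w[0])?).ok()?;
            Some((start, len))
        })
        .collect()
}
```

Checking each start and length separately, rather than only the last offset, is a design choice in this sketch: it also rejects malformed, non-monotonic offsets instead of silently truncating them.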
Describe alternatives you've considered
Additional context